This analysis provides a sharp technical breakdown of how Instagram’s multi-stage funnel replaces social connections with interest-based embeddings. It is a concise masterclass in the system design that powers modern algorithmic virality.
Deep Dive
0 to 10 Million Followers in 4 Days: Inside Instagram's Recommendation Engine
This week, an Instagram account went from zero to over 10 million followers.
Not in a year. In 4 days, with just about 50 posts.
It crossed the follower count of one of the biggest political parties in India, an account that has been posting for years. But that is not what we are looking at today. We are looking at the system underneath, because a 4-day-old account started with zero followers.
So, how does it reach 10 million feeds?
Instagram, over the last few years, slowly changed from a social network into a recommendation system.
And today, we are going to break down that system step by step.
Let's go back to 2010 when Instagram first launched.
The feed was incredibly simple.
You follow some people, you open the app, and you see their posts.
Newest post at the top. That's it. Under the hood, this was basically just a follow graph. Think of every user as a node in a graph.
And when you follow someone, that creates an edge between you and them.
So, when Instagram builds your feed, it simply looks at all the accounts connected to you through those edges.
It grabs their latest posts, sorts them by time, and shows them to you. Now, how do you make this fast?
Here's the trick Instagram used.
When I post a photo, the system does not wait for my followers to open the app.
It immediately copies that post into a ready-made list for each of my followers.
So, when you open Instagram, your feed is already prepared. You only read it.
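The write-time copying just described can be sketched roughly like this. It is a minimal in-memory illustration, not Instagram's implementation: the `followers` and `inboxes` dictionaries, the function names, and the 500-item cap are all assumptions made for the example.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins for the follow graph and the per-user lists.
followers = defaultdict(set)                       # author -> follower ids
inboxes = defaultdict(lambda: deque(maxlen=500))   # user -> post ids, newest first


def follow(follower: str, author: str) -> None:
    """Add an edge from follower to author in the follow graph."""
    followers[author].add(follower)


def publish(author: str, post_id: str) -> int:
    """Write path: copy the post into every follower's list. Returns copies made."""
    for user in followers[author]:
        inboxes[user].appendleft(post_id)
    return len(followers[author])


def read_feed(user: str, limit: int = 10) -> list[str]:
    """Read path: no graph traversal, just return the prepared list."""
    return list(inboxes[user])[:limit]


follow("asha", "ravi")
follow("asha", "meera")
follow("kabir", "ravi")

publish("ravi", "ravi:1")
publish("meera", "meera:1")
writes = publish("ravi", "ravi:2")

print(read_feed("asha"))   # ['ravi:2', 'meera:1', 'ravi:1']
print(writes)              # 2, one copy per follower of "ravi"
```

Notice that `publish` costs one write per follower, while `read_feed` is a single lookup. That trade is the whole point of the design.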
And this is called fan-out on write. You do the work at write time, when the post is created, so that reading the feed later stays cheap. A feed is read far more often than it is written. You post maybe once a day. You open the app many times a day. So, it makes sense to do the heavy work once when the post is created and keep every feed open cheap. But this design has a well-known weak point.
Think about an account with 50 million followers.
One post from that account now means 50 million copies. 50 million writes for one post.
And this is called the celebrity problem. And it is one of the most common questions asked in system design interviews. Large platforms usually solve it with a mix.
Normal accounts use fan-out on write.
Very large accounts are handled the other way. Their posts are fetched only when a follower opens the app. All right. So, now the app keeps growing.
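Going back to that mixed approach for a moment, here is one way it might look in code. This is a sketch under stated assumptions: the `CELEBRITY_THRESHOLD` value is made up and tiny so the example stays readable, and the `inboxes` and `outboxes` structures are hypothetical names, not anything Instagram has published.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3   # hypothetical cutoff; a real one would be far larger

followers = defaultdict(set)   # author -> users who follow them
following = defaultdict(set)   # user -> authors they follow
inboxes = defaultdict(list)    # user -> [(timestamp, post_id)] pushed at write time
outboxes = defaultdict(list)   # author -> [(timestamp, post_id)] stored once


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, post_id, ts):
    """Store the post once; fan out to followers only for normal accounts."""
    outboxes[author].append((ts, post_id))
    if is_celebrity(author):
        return 0                                   # no per-follower writes
    for user in followers[author]:
        inboxes[user].append((ts, post_id))
    return len(followers[author])


def read_feed(user, limit=10):
    """Merge the pushed inbox with posts pulled from followed celebrities."""
    pulled = [p for a in following[user] if is_celebrity(a) for p in outboxes[a]]
    merged = sorted(set(inboxes[user]) | set(pulled), reverse=True)  # newest first
    return [post_id for _, post_id in merged[:limit]]


for fan in ["u1", "u2", "u3"]:
    follow(fan, "star")
follow("u1", "friend")

publish("friend", "friend:1", ts=1)
star_writes = publish("star", "star:1", ts=2)
publish("friend", "friend:2", ts=3)

print(star_writes)        # 0, the celebrity post is not copied anywhere
print(read_feed("u1"))    # ['friend:2', 'star:1', 'friend:1']
```

The `set` union in `read_feed` is there so that an account which crosses the threshold after some of its posts were already pushed does not show up twice.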
People start following hundreds of accounts, and Instagram notices a problem. By Instagram's own numbers, people were missing about 70% of the posts in their feed.
The posts you would actually care about were getting buried, only because they were posted at the wrong time. So, in 2016, Instagram made its first big change.
It stopped sorting the feed by time. It started sorting by predicted interest.
The post you are most likely to engage with goes on top. It changed the order of the feed. It did not change where the posts came from.
You were still only seeing accounts you follow, and that is the real ceiling of this design. If there is no edge between you and an account, that account cannot reach you at all.
It does not matter how good the post is.
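To make that ceiling concrete, here is a small sketch of the 2016-style change. The `predicted_interest` field stands in for a model's output, and every name and number here is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author: str
    timestamp: int
    predicted_interest: float   # hypothetical model output between 0 and 1


def chronological_feed(posts, following):
    candidates = [p for p in posts if p.author in following]
    return sorted(candidates, key=lambda p: p.timestamp, reverse=True)


def ranked_feed(posts, following):
    # Same candidate source as before: only accounts the user follows.
    candidates = [p for p in posts if p.author in following]
    return sorted(candidates, key=lambda p: p.predicted_interest, reverse=True)


posts = [
    Post("a1", "close_friend", timestamp=100, predicted_interest=0.9),
    Post("b1", "news_page", timestamp=300, predicted_interest=0.2),
    Post("c1", "stranger", timestamp=400, predicted_interest=0.99),
]
following = {"close_friend", "news_page"}

print([p.post_id for p in chronological_feed(posts, following)])   # ['b1', 'a1']
print([p.post_id for p in ranked_feed(posts, following)])          # ['a1', 'b1']
```

Post `c1` has the highest predicted interest of all, and it still never appears, because the filter on `following` runs before any ranking does.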
For a company that wants you on the app for a long time, and is now competing directly with TikTok, that is a serious limit. So, Instagram did a much deeper rebuild.
Not the ordering this time, the sourcing.
Earlier, the question was, what did the people I follow post?
The new question is, out of all the content on the platform, what should this person see right now?
And this is a different graph.
We call it an interest graph.
The connection here is not that you follow someone. The connection is that your interests and a piece of content match. Now, think about the scale of that question.
There are billions of posts. You cannot run a heavy calculation on billions of posts every time someone opens the app.
It would be too slow and too expensive.
So, Instagram uses a funnel with four stages.
Retrieval, first stage ranking, second stage ranking, and final re-ranking.
Each stage takes a large set of candidates and passes a smaller, better set to the next stage. This funnel is the first system design idea to take away.
When you cannot afford to run your best and heaviest check on everything, you run a cheap and rough filter first to cut the numbers down.
Then a slightly better filter.
You save your most expensive and most accurate model for the small set that survives.
The same pattern shows up in search engines, in ad systems, in fraud detection, and it is everywhere.
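The funnel idea can be sketched in a few lines. This is a toy under loud assumptions: posts are just integers, `TARGET` is a made-up "most relevant" post, and the three scoring functions are stand-ins for cheap, medium, and heavy models. The final re-ranking stage is left out to keep it short.

```python
from typing import Callable, Sequence

# (stage name, scoring function, how many candidates to keep)
Stage = tuple[str, Callable[[int], float], int]

TARGET = 4242   # hypothetical most relevant post id for this toy example


def cheap_score(post_id: int) -> float:
    """Rough filter: only looks at which block of 100 the post is in."""
    return -abs(post_id // 100 - TARGET // 100)


def medium_score(post_id: int) -> float:
    """Slightly better filter: looks at blocks of 10."""
    return -abs(post_id // 10 - TARGET // 10)


def heavy_score(post_id: int) -> float:
    """Most accurate check, imagined to be the most expensive one."""
    return -abs(post_id - TARGET)


def run_funnel(candidates: Sequence[int], stages: Sequence[Stage]):
    """Apply each stage in order, passing only its top candidates onward."""
    scored = {}
    survivors = list(candidates)
    for name, score, keep in stages:
        scored[name] = len(survivors)   # how many items this stage had to score
        survivors = sorted(survivors, key=score, reverse=True)[:keep]
    return survivors, scored


stages = [
    ("retrieval", cheap_score, 1000),
    ("first_stage", medium_score, 100),
    ("second_stage", heavy_score, 10),
]
top, scored = run_funnel(range(10_000), stages)

print(top[0])    # 4242
print(scored)    # {'retrieval': 10000, 'first_stage': 1000, 'second_stage': 100}
```

The heavy scorer runs on 100 items instead of 10,000. The price is that the cheap stages can drop something the heavy model would have liked, which is the trade every funnel makes.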
Now, let us look at stage one, retrieval.
The job is to go from billions of posts down to a few thousand that are roughly relevant. The main tool here is a model called a two-tower network. The first tower looks only at the user, your interests, your recent activity.
It turns you into a list of numbers, a vector.
We call this a user embedding.
The second tower looks only at a piece of content and turns it into a vector in the same way. The model is trained so that when a user is likely to engage with a post, their two vectors come out close to each other.
The content tower looks only at the content, never at the user. So, Instagram can run it ahead of time.
It can calculate the vector for every post in advance and store it. Because none of that depends on who is watching.
Then, when you open the app, the system has to do only one fresh calculation, your user vector.
After that, finding posts is just looking up stored vectors that are close to yours.
So, a design choice that looks like a limitation, keeping the two towers apart, is actually what makes the whole thing fast enough to work.
If you let the model mix user and content together in one network, every score would have to be calculated live, for every post, every time.
At this scale, that is simply not possible.
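Here is a rough sketch of why keeping the towers apart helps. The two "towers" below are hand-written functions, not trained networks, and the three-topic feature layout, post names, and numbers are all assumptions for the example. What matters is the shape: content vectors are computed once and stored, and only the user vector is computed at request time.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical feature order shared by both towers: [cricket, cooking, politics]
def content_tower(post: dict) -> list[float]:
    """Sees only the post. Stand-in for a trained network."""
    return normalize(post["topic_weights"])


def user_tower(user: dict) -> list[float]:
    """Sees only the user. Averages the topics of recently engaged posts."""
    recent = user["recent_topic_weights"]
    average = [sum(column) / len(recent) for column in zip(*recent)]
    return normalize(average)


# Offline: embed every post once and store the vector.
posts = {
    "match_highlights": {"topic_weights": [0.9, 0.0, 0.1]},
    "biryani_recipe": {"topic_weights": [0.0, 1.0, 0.0]},
    "election_satire": {"topic_weights": [0.1, 0.0, 0.9]},
}
post_index = {post_id: content_tower(p) for post_id, p in posts.items()}


def retrieve(user: dict, k: int = 2) -> list[str]:
    """Online: one fresh user vector, then compare against stored post vectors."""
    u = user_tower(user)
    ranked = sorted(post_index, key=lambda pid: dot(u, post_index[pid]), reverse=True)
    return ranked[:k]


viewer = {"recent_topic_weights": [[0.2, 0.0, 0.8], [0.0, 0.1, 0.9]]}
print(retrieve(viewer))   # ['election_satire', 'match_highlights']
```

`content_tower` never receives the viewer, so `post_index` can be built ahead of time. If the score needed both inputs inside one function, nothing could be cached.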
Now, one more small point.
Finding the closest vectors out of billions still sounds slow. It is handled by something called approximate nearest neighbor search, or ANN.
Notice the word approximate. The system does not promise the perfect closest matches. It promises very close matches very fast. And this is the same trick I mentioned in my YouTube recommendation system design video.
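To get a feel for what "approximate" means, here is a small sketch using random hyperplane hashing, one of the simpler ANN techniques. This is a swapped-in teaching example, not the index Instagram uses, and production systems generally rely on more sophisticated structures. The vectors are random stand-ins for stored post embeddings, and the sizes are arbitrary.

```python
import random
from collections import defaultdict

rng = random.Random(0)
DIM, N_POSTS, N_PLANES = 16, 2000, 6


def random_unit_vector():
    v = [rng.gauss(0, 1) for _ in range(DIM)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


vectors = [random_unit_vector() for _ in range(N_POSTS)]   # stand-in embeddings
planes = [random_unit_vector() for _ in range(N_PLANES)]   # random hyperplanes


def bucket_key(v):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(dot(v, p) >= 0 for p in planes)


# Index build (done ahead of time): group post ids by bucket.
buckets = defaultdict(list)
for post_id, v in enumerate(vectors):
    buckets[bucket_key(v)].append(post_id)


def exact_search(query, k):
    """Brute force: score every stored vector."""
    return sorted(range(N_POSTS), key=lambda i: dot(query, vectors[i]), reverse=True)[:k]


def approx_search(query, k):
    """Score only the posts that landed in the same bucket as the query."""
    candidates = buckets[bucket_key(query)]
    top = sorted(candidates, key=lambda i: dot(query, vectors[i]), reverse=True)[:k]
    return top, len(candidates)


query = vectors[123]
exact = exact_search(query, 5)
approx, scanned = approx_search(query, 5)
overlap = len(set(exact) & set(approx))
print(f"scanned {scanned} of {N_POSTS} posts, found {overlap} of the 5 exact neighbours")
```

The approximate search looks at only one bucket, so it can miss a true neighbour that happened to fall just across a hyperplane. That is the speed for accuracy trade the word "approximate" is pointing at.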
All right. So, retrieval already narrowed billions of posts down to a few thousand candidates.
Now, ranking decides which 10 or 15 posts actually make it onto your screen.
And ranking also happens in stages.
First, a lightweight model quickly trims thousands of candidates down to a few hundred. This lightweight model is trained to imitate the heavy model that comes next, and that technique is called knowledge distillation.
Then comes the heavy ranking model.
This one looks at much richer signals.
Not just whether you'll like a post, but whether you'll save it, share it, or even tap show fewer posts like this.
All those predictions get combined into a single score by what is called the value model.
A like adds value, a save adds more, a share adds even more. Negative feedback reduces the score. Finally, re-ranking cleans everything up. It removes unsafe content and adds diversity so your feed doesn't feel repetitive.
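Here is a minimal sketch of those last two steps, a weighted score followed by a cleanup pass. The weights are invented, since the only thing stated is the ordering (like, then save, then share, with negative feedback subtracting). The one-post-per-author rule is just one simple way to add diversity, and the `unsafe` flag is assumed to come from some earlier check.

```python
# Hypothetical weights; only their ordering comes from the description above.
WEIGHTS = {"like": 1.0, "save": 2.0, "share": 3.0, "show_fewer": -5.0}


def value_score(predictions: dict) -> float:
    """Weighted sum of predicted action probabilities."""
    return sum(WEIGHTS[action] * p for action, p in predictions.items())


def rerank(candidates: list, max_per_author: int = 1) -> list:
    """Drop unsafe posts, then cap posts per author so the feed stays varied."""
    shown_per_author = {}
    feed = []
    for post in sorted(candidates, key=lambda c: c["score"], reverse=True):
        if post["unsafe"]:
            continue
        if shown_per_author.get(post["author"], 0) >= max_per_author:
            continue
        shown_per_author[post["author"]] = shown_per_author.get(post["author"], 0) + 1
        feed.append(post["id"])
    return feed


candidates = [
    {"id": "p1", "author": "a", "unsafe": False,
     "predictions": {"like": 0.5, "save": 0.1, "share": 0.05, "show_fewer": 0.01}},
    {"id": "p2", "author": "a", "unsafe": False,
     "predictions": {"like": 0.4, "save": 0.05, "share": 0.02, "show_fewer": 0.0}},
    {"id": "p3", "author": "b", "unsafe": False,
     "predictions": {"like": 0.2, "save": 0.2, "share": 0.15, "show_fewer": 0.02}},
    {"id": "p4", "author": "c", "unsafe": True,
     "predictions": {"like": 0.9, "save": 0.5, "share": 0.5, "show_fewer": 0.0}},
    {"id": "p5", "author": "d", "unsafe": False,
     "predictions": {"like": 0.3, "save": 0.0, "share": 0.0, "show_fewer": 0.1}},
]
for c in candidates:
    c["score"] = value_score(c["predictions"])

feed = rerank(candidates)
print(feed)   # ['p3', 'p1', 'p5']
```

Post `p3` has fewer predicted likes than `p1` but wins on saves and shares. Post `p4` has the highest score and is dropped for safety, and `p2` loses out only because its author already has a slot.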
Remember the fan-out on write from the old design?
You post and the system pushes your post into a ready-made list for each follower.
That is a push model.
But you cannot push a feed of strangers.
The system has no way to know in advance which stranger's post is right for you.
It only knows that when you actually open the app. So, the new feed is a pull model. The moment you open Instagram, the whole funnel runs and your feed is built right then for that moment. This is the same write time versus read time choice from earlier, just flipped. The old feed did the work at write time. The new feed does the work at read time.
Push gives you cheap reads, but it cannot handle strangers. Pull can handle the entire platform, but now every single feed open triggers real work.
Instagram even has names for those two kinds of content. Posts from people you follow are connected reach.
Posts from people you do not follow are unconnected reach.
For most users today, the unconnected part is a larger part of the feed.
So, what happens with a post that is 1 hour old, from an account nobody has heard of?
It has no history at all.
And this problem has a name. It's called the cold start problem, and every recommendation system has to deal with it.
Instagram handles cold start from two sides. First, the content can be understood on its own. Even with zero engagement.
The system looks at the reel and builds an embedding directly from what is inside it.
The audio, the images, the text on screen, the topic.
So, even a brand new post can be placed into the interest graph and matched to people without waiting for likes.
Second, Instagram runs a test. A new post, especially a reel, is shown to a small group of non-followers first. And the system watches one number carefully.
Not total likes, the rate.
Likes and sends per view.
If a small test audience is sending a new post to their friends at an unusually high rate, that's a strong signal.
So, the system shows the post to a larger audience.
If the rate stays high, it shows the post to an even larger audience.
Each round feeds the next round.
A good rate earns a bigger audience. A bigger audience produces more data. If the rate holds, the audience grows again.
This is why a post can move from a few hundred people to a few million in a short time.
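That loop is easy to simulate. Everything numeric below is an assumption made up for the sketch: the starting audience of 200, the 10x growth per round, the 5% threshold, and the idea that the send rate stays constant as the audience grows. Real thresholds and growth schedules are not public.

```python
def staged_rollout(true_rate, start_audience=200, growth=10, threshold=0.05,
                   max_audience=10_000_000):
    """Grow the audience while sends per view stays at or above the threshold."""
    audience = start_audience
    history = []
    while True:
        sends = round(audience * true_rate)   # toy: deterministic, not sampled
        rate = sends / audience
        history.append((audience, rate))
        if rate < threshold or audience >= max_audience:
            break
        audience = min(audience * growth, max_audience)
    return history


strong = staged_rollout(true_rate=0.08)
weak = staged_rollout(true_rate=0.01)

print([audience for audience, _ in strong])
# [200, 2000, 20000, 200000, 2000000, 10000000]
print(weak)
# [(200, 0.01)]
```

With these made-up settings, the strong post goes from 200 people to 10 million in six rounds, and the weak post never leaves its first test group. Total likes never enter the decision, only the rate.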
So, now we can directly answer the question we started with. A 4-day-old account with no followers reached 10 million people because of three things working together.
Number one, the interest graph rebuild.
An account no longer needs followers to be eligible to reach you. The follow edge is not required anymore.
Number two, content understanding and the test on non-followers. A brand new account with no history could still be classified and given a first small audience.
Number three, the rate-based feedback loop.
Content that people were sending to each other quickly kept earning larger and larger audiences.
So, the post getting attention is just one part of the story, but the reason that attention could reach 10 million strangers in four days is the system.
And the same system is deciding what is in your feed right now.