The video masterfully exposes the invisible machinery that transforms a simple click into a massive engineering feat of attention management. It serves as a sobering reminder that a seamless user experience is often just a highly optimized trap for human psychology.
Deep Dive
The Ridiculous Engineering Of YouTube Videos
This is the most watched user interface in the world. Humanity spends more time watching YouTube than Netflix, Disney Plus, TikTok, Instagram, even TV. And it's absolutely not by chance. Take a look at this. If you hover on the video on desktop, or you wait a few seconds on mobile, you'll see this: a preview of the video. You have the title, the thumbnail, a few seconds of video. Great.
Well, what I just showed you isn't the interface you have today, because since 2021 it looks like this. What they realized is that choosing a video to watch is actually one of the highest friction points in the entire experience. The preview player casually has the exact controls you need to watch the entire video without even committing to it. You have a seek bar to move around. It starts with subtitles, but you can turn them off and turn on the audio itself right from the thumbnail. I found myself watching minutes, even entire videos, in this little hover preview. And I'm sure I'm not alone. And this is by design, so much so that, just like a regular video, after a few seconds, even if you watch in this little preview window, it's counted as a view and saved in your watch history. And this is just the start. Every button, every interaction, every seemingly inconsequential decision is engineered to make you stay up at 3:00 a.m., deep in the rabbit hole, watching a random documentary about shipping containers. I'm Mo, and in this series we go behind the scenes of the insane design, engineering, and psychology of the tech products you use every day.
Because there are three layers that make YouTube videos work, and only the first one is what you can actually see. Now take a look at this. If I want to close this video, I literally can't. I can press here and it will go into picture-in-picture mode while I look for something else, with my video still playing. And only by pressing a second time, on the X, can I make it go away. You've probably done this thousands of times, but 99% of people never notice this extremely important detail. Mobile UI navigation is gesture-based rather than tab-based.
And the two main gestures you can do here are either swipe down to keep playing picture-in-picture or swipe up to go full screen, and both actions do not close the video. There's no button to close it other than, well, closing the app. This is YouTube's latest redesign on TV. And one of the biggest changes is that now, when you press any key, it brings up the controls. And look what's peeking underneath here. Recommendations.
But even while you're scrolling through recommendations and they're taking over most of your screen, your main video is still playing. This is friction engineering. See, what's going on here is the same principle that you've seen with the playable thumbnails, which is minimizing non-playing time. You might think YouTube is optimizing for increasing the amount of time you spend on the app, but that's not actually the case. You know, on Netflix, when you spend 20 minutes browsing for something to watch, yes, you're still technically spending time on Netflix, but that's time that's actually hurting your perception of the product. It gives you the idea that, hey, there's really nothing to watch here. And this is the same for YouTube. They're obsessing and optimizing not over time on the app but over playing time on the app. And they can achieve it in two ways. By growing this, but also by minimizing this. This is why the thumbnails play the video now. It's why, even while you're browsing recommendations, they want the video to be visible. See, on one side we have the OG YouTube. Just a grid of videos and a player. Very simple. It's what made YouTube.
When it was born, they were definitely not thinking about optimizing the thumbnail with a playable video. Their main problem was that they had literally 40 videos on the platform. But on the completely other side, you have actual people, and people's behavior changed. Because you know who doesn't have this problem of non-playing time? TikTok, Instagram, and basically all of the short-form platforms, including YouTube Shorts.
Users have gotten used to this idea of autoplaying, just getting the videos fed to them, not making a choice. So the challenge they have in designing YouTube is to maintain what makes the platform the most loved and the one with the best content, but at the same time to design for today's user behavior. Take a look at this. This is a clip captured by a user on X in 2024, when they got a test version of YouTube where you would scroll up and down. It doesn't close the full screen, but actually doomscrolls long-form YouTube videos. Of course, that never happened. That never shipped. But at any given time, they're probably running dozens of these A/B tests at scale to see how people react.
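To make the A/B testing idea concrete, here is a minimal sketch of one common way to split users between a control and a test interface: hashing the user ID so that each user always lands in the same bucket. The transcript does not say how YouTube actually assigns users to experiments, so the function name `assign_variant`, the experiment name, and the variant labels are all hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically map a user to one variant of an experiment.

    Hash-based bucketing is a common technique; it is an assumption here,
    not a description of YouTube's real system.
    """
    # Mixing in the experiment name means the same user can fall into
    # different buckets in different experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# Hypothetical experiment: the vertical-scroll long-form feed from the clip.
variants = ["control", "vertical_scroll_feed"]
counts = {v: 0 for v in variants}
for i in range(10_000):
    counts[assign_variant(f"user-{i}", "long_form_scroll_test", variants)] += 1

print(counts)  # roughly an even split between the two variants
```

Because the assignment depends only on the user ID and the experiment name, a user sees the same version on every visit, which is what lets behavior be compared between the two groups over time.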
PewDiePie recently made a video where, with a bunch of extensions, he got rid of Shorts, of the algorithmic homepage, got rid of all the fancy features, and got back to OG YouTube. And you might think, yes, that's what I want. And listen, there's a part of me that really wants this. I mean, look at my profile creation. But at the same time, I build tech products for a living. And the golden rule of product management is to never listen to your users. Instead, you should maniacally observe what they do. Data speaks for itself. All these things that we've seen so far, they're here because they work. In 2014, they introduced autoplay for the next video, and the countdown was 10 seconds, and now it's down to five. And while people out there are writing custom scripts to change that, the reality is that these things work. The thing that triggers me most, for example, is that they changed the quality picker from resolution to some generic "higher quality, lower quality". And this, to me, is infuriating.
But I understand why they did it. See, all of this design, engineering, and constant experimentation is to keep a balance between getting people to watch and not turning YouTube into a doomscrolling slopfest. Now, I could go on and on about the little design details in the UI of this unassuming player, but the real magic here is not really in this, but rather in this. Because, see, the fact that this is now the most watched media in the world means that not only the interface and the platform have been hyperengineered, but also the content itself. Let's rewind this for a second.
Everything I just showed you, this hook, the reveal, the order I put them in, the fact that I made you look at the picture-in-picture button before telling you what was weird about it, every single decision, I engineered all of it.
It was planned. Many of your favorite creators' videos look organic and random.
But I can assure you that behind that there's hours of work, possibly an entire team of people dedicated to this.
So YouTube's superpower is that it's the only place with such high-quality, in-depth content where you choose what to watch. And your ability to choose is what makes this place the GOAT. It's why you can find in-depth tutorials for anything. A 12-hour video essay about Star Wars and this. But this choice that you have, this choice of what to watch, also creates something else. This is a chart of one of my own videos. For 55 days, it got 3,776 views. It was basically dead. And then this happened. It ended up having 3.5 million views. It's the same video. The content didn't change, except here. Here I changed the thumbnail. Actually, for this one video, I tried 20 different thumbnails. For this one, all of these.
For this other video, all of these. Let me show you just how deep the "convince you to click and make a choice" rabbit hole goes. Since most people use dark mode, lighter backgrounds are better because they stand out more. Red attracts attention if you contrast it with other colors. As for the reason why these worked: it actually doesn't matter if things are not realistic. The goal is not to be realistic, but to be recognizable in 0.1 seconds. This is why the UI in this video's thumbnail was clearly exaggerated. What's more, these drawings don't really mean anything.
Their goal is to convey to you the idea that there's something hidden in this video to discover about YouTube. MrBeast ran an entire full-blown test on all his videos and found out that having his mouth closed performed slightly better than having his mouth open. So now he doesn't do that anymore. Creators go out of their way to create 10, 20 variations, changing shirt colors, positions in the frame, anything to make the thumbnail more clickable. Now I know what you're thinking. Let's go back to old YouTube without clickbait, where everything is organic and real. I'm the first one that would love that. These videos would be way easier to make. But let me ask you this. If this video was really organic and descriptive and OG, it would look like this. And this is its actual thumbnail and title. Now, be honest. Which one would you actually have clicked, and which one would you have skipped? The fact that this platform has become the biggest in the world and is optimizing to get people's attention has meant, as a consequence, that the videos themselves have become hyperoptimized as well. Probably even more engineered than the platform and YouTube itself. And this is layer 2: content engineering. This is the retention curve for one of my videos. I can literally see, for every single phrase I say, whether people drop or remain, and what you guys actually like or dislike. From the YouTube Studio app, I can see all sorts of metrics compared with any other metric. I have extensions installed that let me see even more metrics. This is more powerful than many other data analytics tools out there. And if you ask, wait, isn't this overwhelming for creators? Well, yes.
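As a rough illustration of what a retention curve measures, the sketch below computes, for each timestamp, the fraction of viewers who are still watching. It is a simplified model that assumes every viewer starts at the beginning and leaves exactly once; it says nothing about how YouTube Studio actually computes its chart, and the function name `retention_curve` and the sample data are made up for the example.

```python
def retention_curve(
    watch_seconds: list[float], video_length: int, step: int = 10
) -> list[tuple[int, float]]:
    """Return (timestamp, fraction of viewers still watching) pairs.

    Simplifying assumption: each viewer watches from 0 up to their own
    drop-off point, with no seeking or rewatching.
    """
    total = len(watch_seconds)
    if total == 0:
        return []
    curve = []
    for t in range(0, video_length + 1, step):
        still_watching = sum(1 for w in watch_seconds if w >= t)
        curve.append((t, still_watching / total))
    return curve


# Hypothetical drop-off points (in seconds) for eight viewers of a 60 s video.
sessions = [5, 12, 30, 30, 45, 60, 60, 60]
curve = retention_curve(sessions, video_length=60, step=15)
for timestamp, share in curve:
    print(f"{timestamp:>3}s  {share:.1%}")
```

A sharp drop between two adjacent timestamps points to the moment, and so the phrase, where viewers left.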
And YouTube not only knows it, but it's trying to manage that. Take a look at this. This is YouTube Studio's homepage.
And here you basically have a feed of your latest comments across all your videos. A really useful feature to see at a glance what's going on. But in a past video, I ran the numbers, and on my channel 21.4% of comments are positive,
17.1% are neutral, but 61.5% are negative ones. And this is normal. It's called negativity bias. It's why you're more likely to leave a one-star review if you had a bad experience at a restaurant than to leave a five-star review if you had a good one. But a few months ago, I noticed something weird. I would look here and only see positive comments.
Then when I expand and look at all of them, I see the negative ones as well.
But YouTube is actually filtering the comments you see here, effectively shielding creators from negativity. Now, I cannot officially confirm this because YouTube hasn't mentioned anything about this change. And if you're a fellow creator, let me know in the comments.
But the thing is, YouTube knows this.
They know that if creators stop creating, the platform is going to die.
And if new creators don't have a chance to start on YouTube and have success, the platform is also going to die. Have you seen this new hyping feature, where you can give props to a video you really like? Well, it's been built exactly for this. It boosts your video in the algorithm, and the boost is inversely proportional to the number of subscribers you have. The fewer subscribers, the bigger the boost. But there's one last layer behind YouTube.
And you're actually looking at it right now. Because without it, what you would see is, well, this. See, behind the scenes, a YouTube video is actually not a video. But before we get to that, let me show you what the sponsor of this video and friend of the channel, Gamma, just launched. So when talking to the people at Gamma, they wanted some metrics about my channel. But not everyone understands these random screenshots from YouTube Studio. So in the old world, I would have chosen one of my design tools or a slide tool, chosen a template, imported the data, created some charts, typed my notes, and eventually I'd have a presentation to send them. But instead, with Gamma, I can just take the raw screenshots, paste them in, give it some basic information about what I want to convey, hit generate, and look at that. Gamma suggests ideas, generates multiple styles, and helps you explore creative direction. Let's pick this one. I'll keep it brief for the number of slides.
And there we go. Full presentation in seconds. You know Gamma for presentations already, but with Gamma, imagine you can do infographics, posters, social media content without any design skills required. It's basically an always on design partner for anything you need to create. You can join the 70 million users that already use Gamma with the link in the description. But now let's get back to the video because layer 3 is where things get messy. Let me ask you what happened when you clicked on this video.
Well, of course the page opened and the video loaded. What do you mean, what happened? Well, when you clicked, your browser sent a GET request to YouTube's servers with your user ID, info about your browser, your cookies, and the ID of this very video. Now, YouTube did not send you back the video, because a YouTube video isn't actually a video.
It's actually hundreds of videos. See, when I upload a video to YouTube, say a 20 GB file at 4K, it's basically impossible to stream as it is. So YouTube creates multiple versions, one each in 4K, 1080p, 720p, 480p, and 360p, and even the version in 4K that you're seeing here is of a lower quality than my original. And then it chops every single one of these versions into tiny 2 to 10 second segments. So one upload actually becomes hundreds of files. What YouTube sends back for your request is called a DASH manifest. It's essentially a restaurant menu: all the links to all the versions of your video in all the resolutions. Then your browser looks at your screen resolution, how fast your internet is at the moment, and what device you're on, and sends a second request. This time for the first chunk of, say, the 1080p version, just for the first two seconds of the video. The chunk arrives, it plays, and while you're watching the video, your player is already fetching the second chunk, chunk two, while checking your connection, and this keeps happening over and over and over. The little gray line that you see next to your playhead is actually the chunks that have already been downloaded and are ready for you to play. And you know when the quality drops while you're downloading something heavy, for example? Well, in that case, what happens is that the browser notices that and just downloads the next chunk at a lower quality. And that's not all, because YouTube also discriminates. When you upload a video with zero views, it starts with H.264, the cheapest and lowest quality encoding. But if your video passes around 3,000 views, it re-encodes it in the background to VP9, which is mid tier.
Then if your video passes 1 million views, it gets re-encoded again to AV1, the highest quality tier. So more popular videos scientifically look better. So if you want to see this video in its highest quality encoding, leave a hype and we'll get to enjoy that when it reaches a million views. Now, all of this that I just showed you is the simple version. We skipped over CDNs, codecs, and a bunch of other stuff. But the point is that getting a video to you in a way that looks easy and seamless and happens in 0.1 seconds is exactly the opposite of easy and seamless. And this is not happening for just one video, but for 106,000 years' worth of video watched every single day. And this is exactly the reason why YouTube almost didn't happen. Because these are the founders of YouTube worrying that they only have 40 videos on their site.
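The manifest-then-chunks flow described above can be sketched in a few lines. This is a toy model, not YouTube's player: the `MANIFEST` entries and their bitrates are invented for illustration, the real DASH manifest is an XML document rather than a Python list, and the selection rule (pick the highest bitrate that fits the screen and a safety margin of the measured throughput) is just one simple heuristic among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    label: str
    height: int
    bitrate_kbps: int  # assumed average bitrate, illustrative only


# Hypothetical "menu" of versions, standing in for the manifest.
MANIFEST = [
    Rendition("2160p", 2160, 20000),
    Rendition("1080p", 1080, 5000),
    Rendition("720p", 720, 2500),
    Rendition("480p", 480, 1000),
    Rendition("360p", 360, 600),
]


def pick_rendition(manifest, screen_height, throughput_kbps, safety=0.8):
    """Pick the best version that fits the screen and the measured bandwidth."""
    budget = throughput_kbps * safety  # leave headroom so playback doesn't stall
    candidates = [
        r for r in manifest
        if r.height <= screen_height and r.bitrate_kbps <= budget
    ]
    if not candidates:
        # Nothing fits: fall back to the lightest version rather than stopping.
        return min(manifest, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


def simulate_playback(manifest, screen_height, throughput_per_chunk):
    """Choose a version for each chunk from the throughput measured before it."""
    return [
        pick_rendition(manifest, screen_height, t).label
        for t in throughput_per_chunk
    ]


# A 1080p screen whose connection dips while something heavy is downloading.
choices = simulate_playback(
    MANIFEST, screen_height=1080,
    throughput_per_chunk=[8000, 8000, 1500, 700, 9000],
)
print(choices)  # ['1080p', '1080p', '480p', '360p', '1080p']
```

The decision is made again for every chunk, which is why the quality can drop for a few seconds and then recover without the video ever stopping.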
But a year later, they had the opposite problem. They were making $30 million a year in revenue. Amazing. But they were spending $1 billion a year in infrastructure costs to deliver all that video. They ran the numbers, and every single view at the time cost them more or less a penny to deliver to the user.
In 2007, YouTube alone was consuming as much bandwidth as the entire internet in the year 2000. And by 2014, it was responsible for 20% of all the bits that got transported across the entirety of the internet. So when YouTube got acquired by Google in 2006 for $1.65 billion, it wasn't just a business deal. In 2006, there was no cloud. If you wanted to host a video platform, you had better start buying some data centers, lay your own cables, and buy your own servers. And basically, the only one that had the technical infrastructure in place to do this at scale was Google. To this day, managing the scale of YouTube and its huge library of videos is one of the hardest problems in tech. They even built a custom chip that's purpose-built to do only one thing, which is video encoding: compressing this YouTube video that you're watching into hundreds of little segments so that you can watch it while you're on a bus on a remote mountain in Peru without the little spinning wheel. But YouTube is not the only thing that hides an unreasonable amount of design, engineering, and psychology. Because there's something else that you look at every single day that hides much more than you think.
Loading screens. And you can learn all about it in this video right here. I'm Mo and I'll see you in the next