This project brilliantly weaponizes video compression artifacts to turn technical constraints into a medium for hidden storytelling. It is a sophisticated reminder that our digital reality is entirely shaped by the underlying algorithms processing it.
Changing the Quality Changes this Video
This is the quality at which you are currently watching this video or at least an approximation. If you're seeing the text 1080p, then you might be watching at either 720p or indeed 1080p depending on whether the text is striped. But if you instead see the word lower, then you're likely watching this at one of the lower resolutions. And if you check the timeline near the current part of the video, you should see an entirely different bit of text. Now, depending on the device you're using and the properties of your display, some of these effects might not be as clear as the rest. And if you're watching this well into the future, some of the tricks used here might be broken for one reason or another. So, let's not waste time with crude demonstrations and instead take a look at how all of this works in the first place.
We'll start by taking a look at YouTube's list of quality settings. Each entry is named after its resolution, but for 720 and 1080p, there's an additional number showing the frame rate. For all the other entries, their frame rate is half of the maximum, which in this case is 30. In case you don't know what these terms mean, here's a quick refresher.
The resolution represents the number of pixels on the screen, while the frame rate represents the number of times that those pixels are updated every second.
Of course, this isn't entirely accurate due to factors like bit rate and refresh rate. But in general, resolution refers to image quality, while frame rate refers to smoothness of motion. Now, remember, our goal is to make one video file show something different based on which setting you have selected. And these two factors are pretty much the only way to distinguish that. More specifically, the video file gets uploaded to YouTube at the highest possible quality and gets compressed the further down you go. Normally, video compression removes detail from the video. But our goal is to somehow make it so that new information is added when the quality is reduced. To get an idea for how something like this could work, let's look at the simplest example, which focuses on just the frame rate for now. As mentioned before, a clip that started at 60 frames per second gets cut down to just 30 at lower quality settings.
YouTube does this by simply dropping every other frame. Normally, this just means that the video playback gets a bit choppier. But what if we stored our video in only the dropped frames? This way, at 60 fps, you can still see the original clip, albeit with some flickering, while at 30 fps, it disappears entirely.
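As a rough illustration of the idea above, here is a minimal Python sketch that models a video as a plain list of frames. It assumes the 30 fps stream keeps the even-indexed frames and drops the odd-indexed ones; the function names and the string placeholders standing in for images are hypothetical, and this is not the script used for the video.

```python
def interleave(kept_frame, dropped_frame, n_pairs):
    """Build a 60 fps frame list that alternates between two images."""
    frames = []
    for _ in range(n_pairs):
        frames.append(kept_frame)     # even index: survives at 30 fps (assumed)
        frames.append(dropped_frame)  # odd index: discarded at 30 fps (assumed)
    return frames


def drop_every_other(frames):
    """Assumed 60 -> 30 fps reduction: keep only the even-indexed frames."""
    return frames[::2]


# Strings stand in for real images to keep the sketch self-contained.
stream_60 = interleave("blank", "clip", 3)
stream_30 = drop_every_other(stream_60)
```

In this model the clip occupies every other frame of `stream_60`, which is where the flicker comes from, and it is absent from `stream_30` altogether.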
Creating a video file with such specific conditions is also not a straightforward process. After all, if I were to simply weave together individual frames in a video editor, it would take me hours to cover just a minute of footage.
Naturally, the solution was to write some code. In this case, that's a Python script that takes two images as an input and produces an interleaved video as the output. Scripts like these were very important throughout this project. And frankly, every other project on this channel has relied heavily on programming, which is something that you too can master with the sponsor of this video, boot.dev. I've always said that the best way to learn how to code is to build real projects. And that's exactly what boot.dev offers. It's not just theory. It's not just boring lectures and tutorials. You get to do the dirty work hands-on. They've got courses in anything from JavaScript and Python for when you need quick and dirty solutions to memory-constrained C, like the kind I use to make the world's smallest Minecraft server. Every course is handcrafted by human experts. And there's a Discord community full of real humans ready to help if you ever get stuck. Or you can also just look at the example solution on every challenge.
After all, the point isn't to memorize what characters to write. What's important is learning how to actually build software, and that can only be done through trial by fire. This isn't a boot camp with one simple trick. It's real learning with real results. And look, you don't have to take my word for it. Boot.dev is free to try. Pretty much all of their actual content is readily available. It's just the interactive stuff that's behind a paid membership.
If you do decide that you want to get serious about this, then you can use code Portalrunner or the QR code on screen to get 25% off for an entire year.
All of this should hopefully give you an idea of how something like this is even possible in the first place, but it doesn't quite get us to the solution just yet. If all we're doing is interleaving frames, then not only does the flicker get kind of annoying, but it also doesn't explain how we get a custom message at 30fps. If we try drawing something on those other frames, the 60fps playback would now blend and blur the two shapes together. We could try to work around this by subtracting the unwanted frame from the desired output, creating a sort of optical illusion that cancels out certain parts of the image.
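One way to picture the subtraction is the sketch below. It assumes, as a simplification, that the eye perceives two rapidly alternating frames as their per-pixel average; under that assumption the compensating frame is twice the desired image minus the frame that survives at 30 fps. The function names and the tiny four-pixel grayscale "frames" are hypothetical.

```python
def compensating_frame(desired, kept):
    """Frame to alternate with `kept` so that their average approximates `desired`.

    Values are clamped to the 0..255 range, so the cancellation is only exact
    where 2 * desired - kept stays inside that range.
    """
    return [min(255, max(0, 2 * d - k)) for d, k in zip(desired, kept)]


def perceived(frame_a, frame_b):
    """Assumed model of 60 fps viewing: the per-pixel average of both frames."""
    return [(a + b) / 2 for a, b in zip(frame_a, frame_b)]


kept = [100, 160, 100, 160]      # message shown on the frames kept at 30 fps
desired = [128, 128, 128, 128]   # what 60 fps viewers should perceive
other = compensating_frame(desired, kept)
blend = perceived(kept, other)
```

Here `other` is an inverted copy of the message, so each individual frame still shows it. That matches the two drawbacks of this approach: the alternation is visible as flicker, and a paused frame reveals the image.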
But this makes the flicker even worse and pausing the video at any point could expose the image that we're trying to hide. So clearly this isn't the full solution. And in fact, if you go back to the demonstration at the beginning of the video and step through it frame by frame, you'll notice that there's definitely a lot more going on, especially at 1080p. To understand this, we'll have to take a look at the other factor that is resolution.
Scaling down a video might seem like a fairly simple task. Just reduce the number of pixels in the frame, but what's not so simple is figuring out which pixels to keep. If you're going from 1920x1080 to 640x360, then that's three times fewer pixels per axis, or 9 times fewer total. This means that a 3x3 grid of pixels would get condensed down to just one pixel. You could do this by arbitrarily picking one of the pixels in the grid and drawing that. But then you're effectively discarding most of the information, which might misrepresent the actual contents of that area. Instead, the most common way to do this is to average the color value of the whole area. This seems to be the approach that YouTube uses, and it's often the best balance between speed and accuracy. But regardless of how you do it, this is a destructive process with an unavoidable caveat. One output can have multiple inputs. For example, a perfectly monotone kernel will obviously come out to a pixel of the same color, but a noisier area with a distinguishable pattern could still average out to that same color.
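To make the "one output, multiple inputs" point concrete, here is a small Python sketch of area averaging on grayscale values. It is only a model of the averaging described above, not YouTube's actual scaler, and the function name and pixel values are made up for the example.

```python
def box_downscale(image, k=3):
    """Average every k x k block of a grayscale image (a list of rows)."""
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height - height % k, k):
        row = []
        for left in range(0, width - width % k, k):
            block = [image[y][x]
                     for y in range(top, top + k)
                     for x in range(left, left + k)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# 1920x1080 -> 640x360 is a factor of 3 per axis, so 9 pixels become 1.
scale = 1920 // 640

flat = [[90, 90, 90] for _ in range(3)]       # uniform kernel
striped = [[0, 135, 135] for _ in range(3)]   # vertical stripes, same mean

flat_result = box_downscale(flat, scale)
striped_result = box_downscale(striped, scale)
```

Both kernels collapse to the same value of 90, even though the striped one is clearly distinguishable from a flat background at full resolution.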
Mathematically, this makes sense. But if you surround the noisy pattern with a solid background, then even though they average to the same color, you can clearly spot the pattern at full resolution. It's also worth noting that some patterns are more visible than others. A checkerboard or grid might feel very noisy and contrasting close up, but when you zoom out, your eyes start to blur it together. However, regular patterns like straight lines are easier to see even at small scales because your brain is less likely to filter that information as noise.
And the choice between horizontal and vertical lines also matters. On certain displays, changes in viewing angle can impact which pixels you see. This effect is most noticeable if the offset axis is perpendicular to the lines. In other words, horizontal lines permit horizontal view offset and vertical lines permit vertical view offset. I figured that it's probably far less likely that you're watching this with your screen turned to the side, unless there are multiple people gathered around the same screen, but I'll take my chances. It is far more likely, however, that you're watching this on a phone, a laptop, or a TV that isn't exactly at eye level. And because of this, I chose the vertical stripe pattern.
So then to draw an image using this trick, we can take a mask of the intended shape and use it as a reference for the placement of the 3x3 kernels. So again, if you're watching this at anything other than 1080p, then the pattern probably just disappeared.
But if you are watching at 1080p, then you should see a reasonably clear image of that same mask. Now 3 pixels is actually a very small quantity. And it's possible that even if you're watching this at 1080p, other factors like the size of your browser window or the orientation of your phone or even the settings on your TV might impact the visibility of the pattern. To address that, I decided to combine this trick with the previously mentioned frame interleaving trick, placing the original mask on every odd frame to make it show up only at 60fps. This also reduces the flicker because this time a part of the image is always visible, but it still doesn't solve our earlier problem of getting an image to show up exclusively for lower resolutions.
Right now, we can get a pattern to show up for 1080p, but at any other resolution, nothing shows up. That's because, again, downscaling is a destructive process. We're actively losing information at lower settings. So to make a custom image show up at a lower quality, we'd have to find a way to somehow reveal new patterns by dropping information. At first, I tried creating a field of high-contrast white noise where some kernels contained one white pixel more than the rest. The idea was that when this is scaled down, the noise would average out and the result would be a clean image. And while yes, that did work, you can also clearly see the pattern at 1080p. The brain is just very good at finding such patterns even though the difference is just one pixel per kernel.
So okay, let's take a step back and approach the problem from a different angle. What characteristics are unique to low-quality images? Sure, you can talk about pixel count and whatnot, but more generally high-quality images are sharper than their low-quality counterparts. We can express this more formally as the frequency of details decreasing as you go down.
For example, think back to the times that you've seen a weird-looking cloud in the sky and thought, "Hey, if I squint hard enough, it kind of looks like a puppy." And if you only had the blurred version of that picture, you might just as well associate it with a child's painting of a dog.
Whereas at full quality, it's obvious that these are clouds. The point is that this is an example of a case where lower image quality actually reveals more information. Applying this to our problem, you can imagine that if our pattern had some blurry text in the background, it would be harder to make it out than the much more clearly visible foreground pattern. And the brain is lazy, so it'll almost always choose to interpret the thing that it sees first. But if you remove the high-frequency foreground pattern, it has no choice but to focus on the low-frequency background. Of course, this optical illusion isn't as strong as the rest, so you do have to be a bit clever in how you use it. The word lower was chosen deliberately because it aligns perfectly with the characters in the foreground text. L is very similar in shape to one, just with an extra leg. O is the same shape as zero. W fits in the lower half of the eight. E can be rounded down to fit mostly within the second zero. And R is again the same shape as P, just with an extra leg. When layered on top of the line pattern on only even frames, it becomes nearly invisible at 1080p. But at 360p, it is the only visible thing remaining.
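The layering can be sketched with a toy model in Python. It assumes the stripes are drawn so that each 3x3 kernel keeps the average of the background level underneath it; the source does not confirm that exact construction, and the function names, gray levels, and stripe offsets below are all made up for illustration.

```python
K = 3  # kernel size for 1080p -> 360p (1920 / 640 = 3)


def render(background, striped_mask):
    """Expand per-kernel gray levels into a full-resolution frame.

    background:   low-frequency layer, one gray level per kernel
    striped_mask: True where the high-frequency stripe layer is drawn
    """
    frame = []
    for levels, flags in zip(background, striped_mask):
        line = []
        for level, striped in zip(levels, flags):
            if striped:
                # One dark column and two lighter ones; their mean is `level`.
                line += [level - 60, level + 30, level + 30]
            else:
                line += [level] * K
        # Every pixel row inside a kernel row is identical (vertical stripes).
        frame += [list(line) for _ in range(K)]
    return frame


def box_downscale(frame, k=K):
    """Average every k x k block, as a model of the downscaling step."""
    out = []
    for top in range(0, len(frame), k):
        row = []
        for left in range(0, len(frame[0]), k):
            block = [frame[y][x]
                     for y in range(top, top + k)
                     for x in range(left, left + k)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


background = [[90, 110], [110, 90]]            # faint, low-contrast layer
striped_mask = [[True, False], [True, True]]   # bold foreground shape

full_res = render(background, striped_mask)
low_res = box_downscale(full_res)
```

At full resolution the stripes swing by 90 gray levels while the background layer differs by only 20, so the foreground dominates. After averaging, the stripes cancel out and only the background levels remain.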
Now, all that's left to discuss is how I got the timeline to show something that's completely different from any frame in the video. While the techniques we've talked about so far have been rooted in optical illusions and steganography, this next part involves an exploit hidden in the video file itself. You see, while putting together this video, I ran a lot of tests on an alt account to determine how exactly YouTube handles different kinds of video files. Specifically, I was looking for a way to make the 60 and 30 fps streams capture a completely different set of frames. For example, if you uploaded a 90fps video, there would technically be enough room to assemble two distinct 60 and 30fps clips with no overlap. So, I was basically hoping to find some way to trick YouTube into doing this.
Unfortunately, what I found indicates that YouTube first transcodes the video to 60fps or below and then takes every other frame from the result of that.
This guarantees that the low-quality and high-quality video streams both contain the same content. And it's also probably much cheaper for YouTube to reprocess their compressed version instead of the original material. But there is one place where they do seem to be using the original clip, the timeline.
You see, the average MP4 file doesn't just contain screenshots of every video frame. No, that would take up far too much space. So instead, videos consist of three types of frames. I frames, P frames, and B frames. I frames are what you would traditionally consider a full frame or a key frame. They contain full image data for the entire screen and are thus the least common type found in the average video file. P frames, on the other hand, contain only the differences from the last frame. This naturally enables far better compression, but also means that they cannot stand on their own. And finally, B frames can contain information from both past and future frames. These compress even better, but are a lot more complicated to decode.
If you're a YouTube engineer looking to extract a representative timeline preview as cheaply as possible, it might be tempting to simply extract the I frames without doing any actual video processing. But you also shouldn't trust these I frames fully. It is entirely possible to create a video that consists of just B frames with only a single I frame at the very beginning. So if it seems like the I frames are particularly unreasonable, it makes sense to re-encode the video first. Of course, I can't know for certain that this is the process that YouTube uses, but my testing indicates that it's likely pretty close. So, as you might suspect, the key to exploiting this is to create a high frame rate video and insert unique I frames among the frames that are going to be dropped when it gets re-encoded to 60fps. If the new frames are sensible enough, YouTube may choose to extract them directly before they get completely removed from the final video.
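The following Python sketch is a toy model of that exploit, not a description of YouTube's real pipeline. It assumes a 90 fps upload, a resampler that picks the latest source frame at or before each 60 fps timestamp, and a timeline extractor that simply collects I frames from the original file. The `Frame` class and every function name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int
    kind: str   # "I", "P" or "B"
    label: str  # what the frame depicts


def kept_indices(n_src, src_fps, dst_fps):
    """Source indices that survive the assumed resampling step."""
    n_dst = n_src * dst_fps // src_fps
    return [j * src_fps // dst_fps for j in range(n_dst)]


def build_source(n_src, src_fps=90, dst_fps=60):
    """Put unique I frames exactly where the resampler will drop them."""
    kept = set(kept_indices(n_src, src_fps, dst_fps))
    frames = []
    for i in range(n_src):
        if i == 0:
            frames.append(Frame(i, "I", "video"))     # the one regular key frame
        elif i in kept:
            frames.append(Frame(i, "P", "video"))     # ordinary content
        else:
            frames.append(Frame(i, "I", "timeline"))  # hidden preview image
    return frames


def resample(frames, src_fps=90, dst_fps=60):
    """Model of transcoding the upload down to 60 fps."""
    return [frames[i] for i in kept_indices(len(frames), src_fps, dst_fps)]


def timeline_previews(frames):
    """Model of a cheap extractor that trusts the original I frames."""
    return [f for f in frames if f.kind == "I"]


source = build_source(9)
stream_60 = resample(source)
stream_30 = stream_60[::2]  # every other frame of the 60 fps result
previews = timeline_previews(source)
```

In this model both playback streams contain only regular video frames, and the 30 fps stream is a subset of the 60 fps one, while three of the four preview frames come from images that never appear during playback.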
Now, if I'm making this process sound fragile, well, that's because it is. If any of these tricks fails over time, I'm willing to bet that this will be the first one to go. All it takes is for YouTube to re-encode the video once.
Even just using YouTube's built-in editing tool might ruin it. So, here's hoping I don't have to blur or cut anything out of this video. Anyway, now you know how I set up that demonstration at the beginning, but to be clear, this isn't the only way to do it.
Technically, there is a way to show two completely different video streams, but it comes with a few caveats. Let's come back to that frame interleaving method from earlier. The problem with this one was that the 60fps stream would invariably contain frames from both clips. However, assuming a 60 Hz display, the viewer can choose to filter out half of the frames by doubling the playback speed. But there are two major issues with this.
First, that 60 Hz clause is a pretty big assumption. Many of you are likely watching this on way faster displays.
So, doubling the playback rate wouldn't change which frames you see. But perhaps more importantly, it also doesn't actually remove any frames from the video. All it does is double the time step between frames. So, not only is it impossible to guarantee that you're starting on the correct frame, it's also possible for the time step to desync if the playback ever stutters, thus randomly changing which clip you see. Here, I'll let you see for yourself in just a moment. If you're watching at 30 fps, you'll see a 30-second speedrun of the map Ceiling Catapult from Portal 2. But if you're watching at 60fps and 60 hertz at two times playback rate, you might see the same run sped up, or you might see a 15-second run of Pit Flings, or some combination of the two. And if you're watching at any other configuration, you'll likely see a flickering mess. You've been warned.
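Both failure modes can be seen in a small Python model of which frame a display shows on each refresh. This is a simplification with assumed behavior: the player is taken to show whichever frame index the playback position has reached, and a stutter is modeled as the position slipping by one extra frame. The function and parameter names are hypothetical.

```python
def shown_frames(n_refreshes, rate=2, display_hz=60, video_fps=60,
                 start=0, stutter_at=None):
    """Indices of the video frames visible on successive display refreshes."""
    shown = []
    slip = 0
    for refresh in range(n_refreshes):
        if stutter_at is not None and refresh == stutter_at:
            slip += 1  # assumed effect of a stutter: position shifts by one frame
        position = (refresh * video_fps * rate) // display_hz
        shown.append(start + slip + position)
    return shown


ideal = shown_frames(4)                       # 60 Hz display, 2x speed
fast_display = shown_frames(4, display_hz=120)
wrong_start = shown_frames(4, start=1)
stuttered = shown_frames(5, stutter_at=2)
```

On a 60 Hz display at double speed only every other frame is shown, so one clip is isolated. A 120 Hz display shows every frame again, a different starting frame selects the other clip, and a single stutter flips from even to odd frames partway through.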
Thanks to Burger 40 for the gameplay clips. Now, obviously, this approach is a lot closer to what you might expect from this kind of project, but the general inconsistency and hardware assumptions are the main reasons why I ended up going for an otherwise far more complicated and less flashy demonstration for the intro. I also tried uploading at 4K, at high frame rates, and a bunch of other stuff. So, I'm pretty confident that I haven't missed anything major. If you think I have, let me know in the comments. Oh, but do check the pinned comment first. I always try to post the answers to the most common questions there. So, if you found any of this interesting, please consider subscribing. This is exactly the kind of stuff that we like to do around here. And if you happen to be looking for a tutor or consultant on your personal projects, you can now pay me money to book my time at learn.pr3.com.
Or if you instead want to learn how to code with expertly crafted hands-on courses, check out boot.dev by using the link in the description.