A brilliant dissection of how a minor cryptographic oversight translates into total system compromise via page cache corruption. It proves that in kernel security, the most obscure architectural flaws often provide the most direct path to root.
Deep Dive
Copy Fail Explained Simply
I'm pretty sure you've heard of this massive Linux vulnerability. It gives you instant root access without any password.
In this video, we will understand how this vulnerability and the exploit work under the hood. In a normal Linux box, switching to root requires a password.
If you input the wrong credentials, then the kernel will reject the access.
Let's see what happens if we run the exploit script. This will fetch a small Python program from the following URL.
Then we will pipe it to Python to run it. After that, we will attempt to run su again.
Running that leads us directly to root without asking for a password. If we go to another terminal and run su again, the same thing happens.
The exploit corrupted a cached copy of the su binary allowing anyone to log in directly as root. That is the gist of the vulnerability.
To revert to the previous state of the system, we need to drop the cache.
If we run su again, it no longer allows us to log in as root without entering a password.
In the next section, we will understand how this is possible. In Linux, a page cache is a location in memory that is shared amongst different programs in the system. When a program runs, the kernel will copy it to the page cache. Then the next time the program is run again, instead of getting it from disk, the in-memory copy will be used instead.
This is done to make subsequent access faster. The vulnerability happens inside the page cache. An attacker is able to corrupt the in-memory copy of a program. An example would be replacing the whole program with something that leads us directly to root, bypassing the credentials check, or injecting a tiny piece of shellcode. The next question is: how is the attacker able to do that?
Let's go a bit deeper. There are many ways an attacker can manipulate data inside the page cache. In the Copy Fail vulnerability, a socket is used to do that. There are different types of sockets in Linux. Network sockets are used to connect to remote systems. Unix sockets are used so that programs running on the same system can communicate with each other. Then there is a special type of socket used for encryption and decryption. This is the one used to manipulate and corrupt the page cache. This type of socket is AF_ALG.
It exposes the kernel crypto subsystem so that programs can leverage it to gain several benefits. It can perform hardware acceleration if your CPU supports it. For example, the kernel can hand over the data to the CPU via a special type of lane, making the process faster. Another thing is having a uniform interface. It allows different programs to access different crypto algorithms in a uniform way. Lastly, it also performs optimizations. Even if your CPU is not fast enough, it can do tricks such as copying data within the kernel space, bypassing the user space.
This is related to the vulnerability, so please take note of it for later. The way we use the AF_ALG socket type is similar to other sockets.
We first open a socket, then we wait for and accept connections, and we can also send messages to the socket. When opening the socket, we specify the address family, which is AF_ALG.
After opening it, we bind to it and pass two parameters. The first one is AEAD, which means authenticated encryption with associated data. This is a form of encryption that not only hides data but also ensures the data is not tampered with along the way. Under AEAD, there are different types. To specify the type we want to use, we put it in the second parameter. Here this is authenticated encryption with extended sequence numbers. This is where the logic flaw happens, which allows attackers to manipulate the page cache. We will discuss this more shortly. After opening the socket, we can perform several operations such as accepting connections or sending data for encryption. Now let's understand how the logic flaw happens in this cryptographic template.
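The open, bind, accept, and send sequence can be tried safely with an ordinary hash. This is only a minimal sketch: it binds to the harmless `hash` type with `sha256` instead of the AEAD type from the video, the function name `kernel_sha256` is made up for illustration, and it assumes a Linux kernel that exposes the userspace crypto API (it returns `None` otherwise).

```python
import socket


def kernel_sha256(data: bytes):
    """Hash data through the kernel crypto API, or return None if unavailable."""
    if not hasattr(socket, "AF_ALG"):
        return None  # not Linux, or Python built without AF_ALG support
    try:
        # Step 1: open a socket with the AF_ALG address family.
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            # Step 2: bind with two parameters, the type and the algorithm name.
            alg.bind(("hash", "sha256"))
            # Step 3: accept() returns the operation socket we talk to.
            op, _ = alg.accept()
            with op:
                # Step 4: send the data, then read the result back.
                op.sendall(data)
                return op.recv(32)  # a SHA-256 digest is 32 bytes
    except OSError:
        return None  # kernel crypto API not reachable in this environment
```

If the call succeeds, the result should match `hashlib.sha256(data).digest()`, which shows that the kernel did the work on our behalf.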
This is a type of AEAD that works on messages of the following format. The first part is the AAD, or additional authenticated data. This doesn't need to be encrypted.
Examples are public information such as IP headers, packet metadata, and sequence numbers. Later we will see how this part plays a crucial role for attackers to inject the shellcode. The middle part is the ciphertext. As we all know, ciphertext is the encrypted form of the data. Good examples would be TLS data, credentials, and other confidential information. Compared to the AAD, this part is not that crucial, so we will skip most of the details. The last part is the message tag. This is the checksum of the combined AAD and ciphertext. It ensures those previous two parts are not tampered with during transit.
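To make the three-part layout concrete, here is a rough sketch of building and checking such a message. It is an assumption-heavy illustration: it uses HMAC-SHA256 as the tag, the names `build_message` and `split_and_verify` are hypothetical, and it does not reproduce the kernel's actual template.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size, chosen only for this illustration


def build_message(key: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    """Lay out a message as AAD | ciphertext | tag."""
    tag = hmac.new(key, aad + ciphertext, hashlib.sha256).digest()
    return aad + ciphertext + tag


def split_and_verify(key: bytes, message: bytes, aad_len: int):
    """Split a message into its three parts and check the tag."""
    aad = message[:aad_len]
    ciphertext = message[aad_len:-TAG_LEN]
    tag = message[-TAG_LEN:]
    expected = hmac.new(key, aad + ciphertext, hashlib.sha256).digest()
    # compare_digest avoids leaking where the first mismatch occurs
    return aad, ciphertext, hmac.compare_digest(tag, expected)
```

Changing even one byte of the AAD or the ciphertext makes the final check return `False`, which is the tamper detection the tag is there for.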
Like the AAD, this is also important in the execution of the attack, which we will see shortly. The attack leverages the decryption mechanism, so let's understand it. Any kind of decryption process involves an input that contains the encrypted data. It also involves an output, which is where the original plaintext will be sent or where the decryption will be carried out. In the context of this cryptographic template, the output serves as a scratch space.
The input and output are represented as SGLs. SGL is short for scatter-gather list. Its purpose is to locate data scattered in memory and represent it as one contiguous block.
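The idea of a scatter-gather list can be sketched in a few lines. This is a toy model in user space, not the kernel structure; the class name `ScatterGatherList` and its `read` method are invented for illustration.

```python
class ScatterGatherList:
    """Toy model: several separate buffers presented as one logical block."""

    def __init__(self, segments):
        # Keep references to the original buffers; nothing is copied here.
        self.segments = [memoryview(s) for s in segments]

    def __len__(self):
        return sum(len(s) for s in self.segments)

    def read(self, offset: int, length: int) -> bytes:
        """Read across segment boundaries as if the data were contiguous."""
        out = bytearray()
        for seg in self.segments:
            if length == 0:
                break
            if offset >= len(seg):
                offset -= len(seg)  # skip segments before the start offset
                continue
            chunk = seg[offset:offset + length]
            out += chunk
            length -= len(chunk)
            offset = 0  # later segments are read from their beginning
        return bytes(out)
```

For example, with the segments `b"AAD8byte"`, `b"cipher"`, and `b"TAG!"`, a read of six bytes at offset six crosses from the first buffer into the second.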
The input SGL contains the AAD, the ciphertext, and the message tag. Both the AAD and the ciphertext are in memory buffers because they are the active data being worked on by the cryptographic process. On the other hand, the message tag is inside the page cache. That's because a message tag, being a checksum, is just a static value that doesn't need to be changed. It is assumed that any future process will just read it. When the decryption kicks off, it will copy the AAD and ciphertext to the output SGL, but it will only reference the message tag. This is an optimization in the logic to conserve memory.
Now let's see how the logic flaw happens. Let's look at the decryption process from a different view, that is, which parts are read-only and which are writable.
The input SGL is read-only, but the output is writable since it needs to perform writes during decryption. What happens is that since the message tag is only referenced from the input SGL, that area becomes writable as well.
Since this is the page cache, the kernel will not do the necessary checks on whether you can write to a page or not. This is the actual logic flaw. Due to this optimization, a region of memory that is assumed to be read-only becomes writable.
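The difference between copying and referencing can be shown with a loose user-space analogy. Nothing here touches the kernel: `page_cache` is just a `bytearray` standing in for a cached page, and `build_output` and `demo` are hypothetical names.

```python
def build_output(aad, ciphertext, cached_tag, copy_tag: bool):
    """Build a writable output list from the input parts."""
    out = [bytearray(aad), bytearray(ciphertext)]  # always copied
    # Copying the tag is safe; referencing it shares the underlying memory.
    out.append(bytearray(cached_tag) if copy_tag else cached_tag)
    return out


def demo():
    """Return the stand-in cache contents after a safe and a flawed write."""
    page_cache = bytearray(b"TAG!")  # stand-in for a page-cache page

    safe = build_output(b"aad", b"ct", page_cache, copy_tag=True)
    safe[2][:] = b"XXXX"  # write lands in the private copy
    after_safe = bytes(page_cache)

    flawed = build_output(b"aad", b"ct", page_cache, copy_tag=False)
    flawed[2][:] = b"EVIL"  # write lands in the shared "cache"
    after_flawed = bytes(page_cache)

    return after_safe, after_flawed
```

In the flawed case the write to the output silently changes the shared buffer, which is the same shape of mistake as the one described above.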
This is the point where attackers can inject crafted shellcode. Aside from decrypting the ciphertext, the process also needs to compute the checksum and see if it matches the one from the input SGL.
The HMAC computation involves arranging several pieces in the right order. The AAD is divided into two parts, the high and low bits of the sequence number, each four bytes in size. Sequence numbers come into the picture because this cryptographic template is mainly used for IPsec traffic. The right order is the high bits first, followed by the message tag, then the low bits. It is during this rearrangement process that attackers can inject shellcode into the page cache.
At a high level, there are four steps to how an attacker injects the shellcode into the page cache. The first step is to load an SUID binary into memory. This is the program that the attacker will try to corrupt. The major requirement is that the binary must have the SUID bit set, so that when the malicious shellcode is injected, running it under the same process will give you root privileges.
And of course, all users must be able to read and execute it. So a typical candidate would be binaries like su or /bin/passwd. The next step is to prepare the shellcode. This is a very small program that runs a shell.
Later, when we look at the exploit, we will see the technical details of this shellcode. Once the shellcode is prepared, the attacker will then use the cryptographic template to copy it into the target destination. As we saw in the previous slide, this includes opening a socket of type AF_ALG, binding to the socket, and sending the payload to that socket. During that process, the HMAC validation will kick in during decryption, which will copy the shellcode to the target destination.
We'll look at the actual code shortly.
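Before getting to the exploit, the requirement from the first step (SUID bit set, readable and executable by all users) can be expressed as a simple mode check. This is a small sketch using only the standard `stat` module; the function name `is_world_usable_suid` is made up for this example.

```python
import stat


def is_world_usable_suid(mode: int) -> bool:
    """True if a file mode is a regular SUID file readable and executable by all."""
    has_suid = bool(mode & stat.S_ISUID)    # set-user-ID bit
    world_read = bool(mode & stat.S_IROTH)  # "others" may read
    world_exec = bool(mode & stat.S_IXOTH)  # "others" may execute
    return stat.S_ISREG(mode) and has_suid and world_read and world_exec
```

It could be fed with something like `os.stat(path).st_mode`, which is also a handy way for an administrator to audit which binaries on a system carry the SUID bit.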
It is also important to note that not all of the program's code in memory is overwritten. Typically, only the first 100 bytes or so will be corrupted. The remainder of the program code will be untouched. This is enough to trigger a privilege escalation. Once the kernel parses the initial headers, the shellcode placed there will be read and execution will jump to a specific address that leads to a shell. Lastly, once the program's code is modified in the page cache, an attacker can run it, which will lead to privilege escalation.
Now that we have a high-level understanding of the vulnerability and how to exploit it, let's see what the actual exploit looks like. What you are seeing here is the original exploit from the researchers. There is not much to see other than the exploit script and some information about the affected versions. If we open it, we see that they packed it into a very small 732-byte script. This is very hard to understand and not suitable for learning purposes, so I asked AI to beautify the script for us. Most of the comments here are automatically generated, but I added a few more at the bottom. Within the script, there is a small helper function that converts a hex string to bytes.
There is a function that performs the actual process of injecting the shellcode into the page cache using the cryptographic template, and a function that contains the actual shellcode.
The first function, used for byte conversion, is self-explanatory, so let's not dive into too much detail. Let's focus first on the run exploit function, which contains the shellcode.
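For completeness, a hex-to-bytes helper like the one mentioned can be as short as the sketch below. The name `hex_to_bytes` is an assumption; the script's own helper may be named and written differently.

```python
def hex_to_bytes(hex_string: str) -> bytes:
    """Convert a hex string such as "7f454c46" into the bytes it encodes."""
    # bytes.fromhex ignores ASCII whitespace between byte pairs.
    return bytes.fromhex(hex_string)
```

For instance, `hex_to_bytes("7f454c46")` gives the four bytes that start every ELF file.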
As we just discussed in the last slide, the first step is to load a copy of an SUID binary into memory. In this case, we use /usr/bin/su. Next is to construct the shellcode. This is in hex but in compressed format, so we use the zlib module to decompress it. If we just convert it to bytes, it will look like this, which is hard to identify. But if we decompress it, we start to see more information, such as the ELF header. At the end, we also see the shell used when dropping to root. Later, we will find a way to decode this properly and understand what is inside. After that decompression task, we have a loop that sends the payload to the page cache. Notice also that this sends the payload in four-byte chunks. There are a few reasons for this. First, if you recall, the low bits of the AAD are only four bytes, and in general the crypto operation works on four bytes at a time.
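The decompress-then-chunk pattern can be sketched with harmless placeholder data. Everything here is illustrative: `unpack` and `four_byte_chunks` are hypothetical names, the zero padding of a short final chunk is an assumption, and the blob is a stand-in rather than the real payload.

```python
import zlib

CHUNK = 4  # the walkthrough sends four bytes per iteration


def unpack(hex_blob: str) -> bytes:
    """Turn a hex string holding zlib-compressed data back into raw bytes."""
    return zlib.decompress(bytes.fromhex(hex_blob))


def four_byte_chunks(data: bytes):
    """Return (offset, chunk) pairs, zero-padding the last chunk if needed."""
    padded = data + b"\x00" * (-len(data) % CHUNK)  # padding is an assumption
    return [(i, padded[i:i + CHUNK]) for i in range(0, len(padded), CHUNK)]


if __name__ == "__main__":
    # Placeholder data, compressed and hex-encoded the same way.
    blob = zlib.compress(b"\x7fELF" + b"placeholder").hex()
    for offset, chunk in four_byte_chunks(unpack(blob)):
        print(offset, chunk)
```

The offset in each pair plays the role of the index parameter that starts at zero and advances with every chunk.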
The actual function that sends the payload accepts several parameters. The first one is a file handle returned when opening the su binary.
We need the file handle so that we can determine the exact location of the binary in the page cache. The next parameter is an index, which is initially set to zero. Lastly, we send a four-byte part of the payload on each iteration.
Now let's understand the code of the exploit step function. This is the main driver for corrupting the page cache of a file. First we open a socket of type AF_ALG. Then we bind to that socket, specifying the vulnerable cryptographic template. Before moving further, we need to set some options. First, we need to set a fake key for the decryption process. We don't care whether we decrypt data successfully or not. What we care about is to overflow the part past the message tag so we can inject the shellcode through the low bits of the AAD. So this key can be anything we want. We also need to tell it to use a scratch space of four bytes. After setting the options, we are ready to accept connections.
We have some variables here that make sure we are always performing operations on four-byte chunks, and a variable for null byte conversion. To send data to the socket, we use a method similar to what we do with network sockets, which is calling sendmsg. We pass three arguments: the actual data, a set of control messages, and a flag. In the first argument, which is the data buffer, we construct the AAD, but we don't care about the high bits, so we will just send random data. The low bits are what we want, so this is where we set the four-byte chunk of the payload. The second argument is the set of control messages. Here we are setting a blank initialization vector, decryption as the mode, and an AAD length of 8 bytes. For the last argument, we are setting a special number that tells the algorithm not to decrypt the data yet, as we will still send more messages. Sending a four-byte payload chunk to the socket is not enough.
We need a way to find the location of the su binary in the page cache and point the socket to write to it. This is the purpose of the next three lines. We first create a pipe.
This will produce two ends, one for writing data and another one for consuming data from the pipe. The splice function allows us to connect the file handle to the AF_ALG socket. So the sequence would be sending the su file handle to the write end of the pipe.
Then we go to the other end, consume it, and point it to the socket. During this process, no data is copied.
Everything is linked through the pipe, and we are just pointing to the page cache where the su binary is located. Lastly, we trigger the decryption by receiving the message.
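The pipe-and-splice mechanism on its own is a general Linux feature and can be tried on any ordinary file. The sketch below is a generic illustration that moves file data into a pipe and reads it back out; the function name `splice_file_to_pipe` is hypothetical, `os.splice` is assumed to be available (Linux, Python 3.10 or later), and a plain copy is used as a fallback where it is not.

```python
import os
import tempfile


def splice_file_to_pipe(path: str, count: int) -> bytes:
    """Move up to count bytes of a file into a pipe, then read them back."""
    read_end, write_end = os.pipe()  # two ends: one to consume, one to write
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # Link the file's cached pages into the pipe without a user-space copy.
            moved = os.splice(fd, write_end, count, offset_src=0)
        except (AttributeError, OSError):
            # Fallback for platforms without splice: an ordinary copy.
            moved = os.write(write_end, os.pread(fd, count, 0))
        return os.read(read_end, moved)
    finally:
        for descriptor in (fd, read_end, write_end):
            os.close(descriptor)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello from the page cache")
    try:
        print(splice_file_to_pipe(tmp.name, 5))
    finally:
        os.unlink(tmp.name)
```

The read end of the pipe is where a consumer picks the data up; in this sketch the consumer is simply our own `os.read` call.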
This will loop through all chunks of the payload. Once that is done, the copy of SU is now corrupted and we can now run it to achieve root access without entering any credentials. Before ending the video, let's have a look at what is inside the payload. We have here the payload in hex.
Let's again decompress it to make sure we see the ELF header.
Since the output looks fine, let's put it inside a variable.
Then let's write that into a file.
After writing, we also see here that the entire payload is 160 bytes in size, which is relatively small. In another terminal pane, let's confirm if the file is here.
Since this is an ELF binary, we can use the readelf command to confirm the headers are intact. This binary contains a small piece of code that will drop us to a shell.
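A quick way to confirm that a blob starts with a valid ELF identification, without reaching for readelf, is to look at its first few bytes. This is a minimal sketch that only inspects the magic number, class, and byte order; `describe_elf_ident` is a made-up name.

```python
ELF_MAGIC = b"\x7fELF"


def describe_elf_ident(data: bytes):
    """Return (word size, byte order) for ELF data, or None if not ELF."""
    if len(data) < 6 or data[:4] != ELF_MAGIC:
        return None
    # Byte 4 is EI_CLASS and byte 5 is EI_DATA in the ELF identification.
    bits = {1: "32-bit", 2: "64-bit"}.get(data[4], "unknown")
    endian = {1: "little-endian", 2: "big-endian"}.get(data[5], "unknown")
    return bits, endian
```

Passing the first bytes of a file read in binary mode is enough; anything that does not start with the ELF magic comes back as `None`.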
Since the setuid bit is not set on this file, we just drop to our own user and not root.
Now to see the actual instructions inside, we can run the following command. Credits to 0xdf for providing this.
We will not dive into too many details of this assembly code, but I will give a high-level overview. It calls setuid by moving 0x69 into the AL register and invoking the syscall instruction. The instruction after that loads the address of /bin/sh.
There is another syscall here, which runs /bin/sh through the execve function. And the last system call is exit. If you want to see videos about assembly programming, feel free to let me know in the comments below.
As we saw in this video, Copy Fail is a very sophisticated vulnerability that affects any Linux distribution released after 2017. The immediate fix is to prevent the AEAD module from being loaded or to upgrade your kernel. You can see more details about the instructions in the technical analysis page I included in the description of this video. I hope you learned something today. If you find my content valuable, please support me by liking this video and subscribing to my channel. See you in the next one.