A masterful display of architectural depth that exposes the superficiality of modern high-level abstractions. It is a sharp reminder that true engineering begins where the standard library ends.
Deep Dive
Hello world in 300 lines of code to piss off vibe coders
The year is 2016, and this is you, a software engineer who spent years learning data structures and algorithms, and spent hours and hours practicing problem-solving to land a decent job at one of the FAANG companies. Fast forward 10 years, and with the rise of LLMs, this guy shows up. A vibe coder who claims that he can do your job very easily with the help of LLMs and AI agents. He claims that he uses AI agents to build production-ready apps with just a few prompts. So, you go ahead and approach him and ask him to write a hello world program in C for Windows. He goes ahead and prompts the AI to write a hello world program in C for Windows, and gives this back to you. You smile and whisper in his ear, "Watch how it's actually done."
Unlike the countless AI slop videos that flooded YouTube in the past couple of years, this video will be more of an anti-vibe coding video. We will take a simple program like this hello world program and turn it into a proper hello world program that vibe coders cannot generate.
To do this, we first need to discuss some Windows internals. First, you need to understand that on Windows, in order to print something to standard output, you have to use the WriteFile Win API. This Win API exists in the kernel32.dll library.
On modern Windows versions, the kernel32.dll library acts as a forwarder that forwards most of the Windows APIs to another DLL called kernelbase.dll. Inside this DLL, you find the actual logic of the WriteFile Win API. This logic is responsible for validating the function parameters and performing error checks. And once this is done, kernelbase.dll will forward the call to another DLL called ntdll.dll.
This DLL is the last layer before the call is passed to the Windows kernel. In this DLL, you find most of the function names prefixed with Nt, Rtl, and Zw. In our case here, the call from the WriteFile API in kernelbase.dll is forwarded to an API called NtWriteFile in the ntdll library. Now, before the call is passed to the Windows kernel, the NtWriteFile API will first move the system service number, or the system call number, of NtWriteFile to the EAX register. After that, it will check the system call flag in a read-only structure called KUSER_SHARED_DATA.
This flag indicates whether the system supports or should use the modern syscall instruction or the old int 2E instruction. 99% of the time you'll find it set to zero, which indicates that the system uses the modern syscall instruction.
And after the syscall instruction is executed, the kernel will take the syscall number, look up its entry in the system service descriptor table in kernel space, and execute the actual implementation of the NtWriteFile function. And once it's done, the kernel will pass the execution back to user space and resume execution of the program. Now, since the printf function is not an actual Windows function or API, it has to pass the call to an actual Windows API, which is the WriteFile API. For example, if we compile this program and run it in a debugger like x64dbg, we can step into the printf call in the main function. And in here, we can see that we have three function calls. We're only interested in this __stdio_common_vfprintf function.
This function is basically responsible for setting up the arguments for the WriteFile Win API. And once it's done doing that, it will forward the call straight to kernelbase.dll to invoke the WriteFile Win API, bypassing the WriteFile call from kernel32.dll.
We can confirm that by setting a breakpoint on the WriteFile API in kernel32.dll and another one on the WriteFile API in kernelbase.dll.
And once we run, we can see that the breakpoint from kernelbase.dll is triggered. This is done purely for the sake of performance. And as we discussed earlier, the WriteFile API in kernelbase.dll will forward the call to NtWriteFile in ntdll.dll, which moves the syscall number of the NtWriteFile syscall to the EAX register and invokes the system call with the syscall instruction.
So now, here's what we're going to do.
We will create a hello world program that invokes the NtWriteFile system call, bypassing the kernel32 and kernelbase DLLs. And we won't be using any other Win APIs or functions that make the process easier for us. Just pure Windows internals vibes. To get there, we first need to get the base address of the ntdll library in the process address space at runtime.
Then we will loop through the functions in this DLL until we find the address of the NtWriteFile function. Then we will pass the necessary arguments for this function and then we invoke it.
Let's start by getting the address of ntdll in memory. To do this, you first need to understand the PEB, or the process environment block. The PEB is a data structure associated with each running process. It contains stuff like a pointer to the process launch parameters, a pointer to the process heap, etc. We're only interested in the Ldr member, which contains information about the loaded modules or DLLs. This member has three circular doubly linked lists. These three lists contain information about the same DLLs. The only difference is how they order or sort these DLLs. In the first list, the DLLs are sorted based on the loading order, or the order in which the DLLs were loaded in memory. In the second list, DLLs are sorted based on their virtual addresses, from lowest to highest. In the third list, DLLs are sorted based on the execution sequence, or which DLL was initialized first. Each node in these linked lists has two pointers, a Flink and a Blink, or a forward link and a backward link.
The Flink points to the next node in the linked list and the Blink points to the previous one. And they're called circular linked lists because the last node in each list has a pointer to the first node and the first node has a pointer to the last node. So they keep circling around. Now, to get the address of the ntdll library, we need to pick any one of those three linked lists and loop through its nodes. Let's go with the InMemoryOrderModuleList, for example.
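To make the Flink/Blink idea concrete, here is a minimal, portable sketch of a circular doubly linked list. It does not touch the real PEB: `list_entry`, `list_push_back`, `list_count`, and `demo_circular_list` are hypothetical names made up for this illustration, and only the two-pointer layout mirrors the Windows `LIST_ENTRY`.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the Windows LIST_ENTRY: a forward and a backward link. */
typedef struct list_entry {
    struct list_entry *flink; /* next node */
    struct list_entry *blink; /* previous node */
} list_entry;

/* An empty circular list is a head that points at itself. */
static void list_init(list_entry *head) {
    head->flink = head;
    head->blink = head;
}

/* Insert a node just before the head, i.e. at the tail of the list. */
static void list_push_back(list_entry *head, list_entry *node) {
    node->flink = head;
    node->blink = head->blink;
    head->blink->flink = node;
    head->blink = node;
}

/* Follow flink until we arrive back at the head we started from. */
static size_t list_count(const list_entry *head) {
    size_t n = 0;
    for (const list_entry *cur = head->flink; cur != head; cur = cur->flink) {
        n++;
    }
    return n;
}

/* Hypothetical demo: build a three-node list and walk it once. */
size_t demo_circular_list(void) {
    list_entry head, a, b, c;
    list_init(&head);
    list_push_back(&head, &a);
    list_push_back(&head, &b);
    list_push_back(&head, &c);

    assert(head.flink == &a); /* first node */
    assert(head.blink == &c); /* last node */
    assert(c.flink == &head); /* the list wraps around */
    return list_count(&head);
}
```

The loop condition `cur != head` is the "did we come back to where we started" check that any walk over one of these lists needs, since there is no NULL terminator to stop on.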
Now, each entry or node in this linked list has information about a specific module or DLL. What we need to do is to check the FullDllName member of each node and see if it matches ntdll.dll.
If so, we return the base address of the DLL stored in the DllBase member. And this is what everything looks like in code. We first read the PEB, or the process environment block, using the GS segment register. In 64-bit processes, the GS segment register contains a pointer to a data structure called the TEB, or the thread environment block. The pointer to the PEB is located at offset 0x60 from this address. In 32-bit processes, the FS segment register is used instead and the PEB pointer is located at offset 0x30. So, back here, we use the __readgsqword function to access the address in the segment register and read the PEB at offset 0x60. Then we access the Ldr member to get the address of the first node in the InMemoryOrderModuleList linked list.
Then we loop through the linked list and check if the current entry is the first entry that we started with. If so, we break the loop. Otherwise, we keep iterating through it. Now, since the address returned by the Flink points at offset 0x10, or 16, inside an LDR_DATA_TABLE_ENTRY structure, we need to move this pointer back 16 bytes so that it points to the start of the structure and we can read the data properly. And after we do that, we read the full path to the DLL file in the FullDllName member. And then we extract only the name after the last backslash. Then we remove the backslash and check if the DLL name matches ntdll.dll. If we have a match, we return the base address of the DLL in the DllBase member.
Now that we have the base address of the ntdll library, the next thing we need to do is to get the address of the NtWriteFile function. The functions of any DLL exist in a specific location inside this DLL called the export directory. And in order to access this directory, we first need to walk through the DLL structure.
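Going back to the module list walk for a moment, the "move the pointer back 16 bytes, take the name after the last backslash, compare" steps can be sketched on synthetic data. This is not the code from the video: `module_entry` is a hypothetical, trimmed stand-in for LDR_DATA_TABLE_ENTRY, the paths and base addresses are made up, and `wchar_t` is used instead of the real UNICODE_STRING so the sketch compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>
#include <wctype.h>

typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Hypothetical stand-in for LDR_DATA_TABLE_ENTRY (only the fields we use). */
typedef struct module_entry {
    list_entry load_links;    /* links for the first list */
    list_entry memory_links;  /* links for the list we walk */
    void *dll_base;
    const wchar_t *full_name; /* full path of the module */
} module_entry;

/* The link we follow sits inside the struct, so step back to its start.
   On a 64-bit build this offset is 16 bytes, as in the transcript. */
#define ENTRY_FROM_MEMORY_LINK(p) \
    ((module_entry *)((char *)(p) - offsetof(module_entry, memory_links)))

/* Case-insensitive comparison of two wide strings. */
static int wide_equal_nocase(const wchar_t *a, const wchar_t *b) {
    while (*a && *b) {
        if (towlower((wint_t)*a) != towlower((wint_t)*b)) return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Walk the list once and return the base of the module named `wanted`. */
static void *find_module(list_entry *head, const wchar_t *wanted) {
    for (list_entry *cur = head->flink; cur != head; cur = cur->flink) {
        module_entry *m = ENTRY_FROM_MEMORY_LINK(cur);
        const wchar_t *slash = wcsrchr(m->full_name, L'\\');
        const wchar_t *name = slash ? slash + 1 : m->full_name;
        if (wide_equal_nocase(name, wanted)) return m->dll_base;
    }
    return NULL; /* back at the head without a match */
}

/* Hypothetical demo with two fake modules and made-up base addresses. */
uintptr_t demo_find_module(void) {
    list_entry head;
    module_entry exe = { {NULL, NULL}, {NULL, NULL}, (void *)0x1000,
                         L"C:\\demo\\hello.exe" };
    module_entry nt  = { {NULL, NULL}, {NULL, NULL}, (void *)0x2000,
                         L"C:\\Windows\\SYSTEM32\\ntdll.dll" };

    head.flink = &exe.memory_links;
    exe.memory_links.flink = &nt.memory_links;
    nt.memory_links.flink = &head;
    head.blink = &nt.memory_links;
    nt.memory_links.blink = &exe.memory_links;
    exe.memory_links.blink = &head;

    return (uintptr_t)find_module(&head, L"NTDLL.DLL");
}
```

Comparing without regard to case is a choice of this sketch rather than something the transcript states; it avoids depending on how the loader happened to capitalise the path.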
Since we already obtained the base address of the ntdll.dll, we can use it to access the export directory. We first read the DOS header to obtain the e_lfanew member, which points to the start of the NT headers.
Inside the NT headers lies the optional header, which contains an array called the data directories array. The first entry or element inside this array is the address of the export directory.
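The header hops described above can be sketched against a fake image held in a byte buffer. The offsets are the standard ones for a 64-bit (PE32+) image: `e_lfanew` at 0x3C in the DOS header, the optional header 24 bytes into the NT headers, and the first data directory 112 bytes into the optional header. The function and constant names, and the fake image in the demo, are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    E_LFANEW_OFFSET        = 0x3C,   /* e_lfanew inside the DOS header */
    OPTIONAL_HEADER_OFFSET = 4 + 20, /* PE signature + file header */
    DATA_DIRECTORY_OFFSET  = 112,    /* DataDirectory[0] in a PE32+ optional header */
    PE32PLUS_MAGIC         = 0x20B
};

/* PE fields are little-endian, so assemble them byte by byte. */
static uint16_t read_u16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t read_u32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
static void write_u32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Returns the RVA of the export directory, or 0 if the image looks malformed. */
uint32_t export_dir_rva(const uint8_t *image, size_t size) {
    const size_t need = OPTIONAL_HEADER_OFFSET + DATA_DIRECTORY_OFFSET + 4;
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z') return 0;

    size_t nt = read_u32(image + E_LFANEW_OFFSET); /* DOS header -> NT headers */
    if (nt > size || size - nt < need) return 0;

    const uint8_t *nt_headers = image + nt;
    if (memcmp(nt_headers, "PE\0\0", 4) != 0) return 0;
    if (read_u16(nt_headers + OPTIONAL_HEADER_OFFSET) != PE32PLUS_MAGIC) return 0;

    /* First data directory entry: the export directory's RVA. */
    return read_u32(nt_headers + OPTIONAL_HEADER_OFFSET + DATA_DIRECTORY_OFFSET);
}

/* Hypothetical demo: a zero-filled fake image with just enough fields set. */
uint32_t demo_export_dir_rva(void) {
    uint8_t image[0x200] = {0};
    image[0] = 'M';
    image[1] = 'Z';
    write_u32(image + E_LFANEW_OFFSET, 0x80);
    memcpy(image + 0x80, "PE\0\0", 4);
    image[0x80 + OPTIONAL_HEADER_OFFSET] = 0x0B; /* magic 0x20B, low byte first */
    image[0x80 + OPTIONAL_HEADER_OFFSET + 1] = 0x02;
    write_u32(image + 0x80 + OPTIONAL_HEADER_OFFSET + DATA_DIRECTORY_OFFSET, 0x12340);
    return export_dir_rva(image, sizeof image);
}
```

On Windows the same walk would normally go through the `IMAGE_DOS_HEADER` and `IMAGE_NT_HEADERS64` structures from the SDK headers; raw offsets are used here only to keep the sketch portable.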
Inside the export directory, we have three important arrays: AddressOfFunctions, AddressOfNames, and AddressOfNameOrdinals. Those three arrays are mapped or synchronized together. For example, let's say we want to obtain the address of function five. We will first loop through the AddressOfNames array and check if the function name matches function five. If so, we will take the index of this name and use it to read the matching entry from the AddressOfNameOrdinals array. Ordinals are just another way to look up function addresses instead of their names. Once we obtain that entry from the AddressOfNameOrdinals array, we will use it to look up the address of function five in the AddressOfFunctions array. This will return the RVA, or the relative virtual address, of the function. So, we have to add that to the base address of the DLL to get the full virtual address of the function in memory.
Let's go through a practical example so you can fully understand this. If we look up the NtWriteFile function in a debugger like x64dbg, we can see here that the NtWriteFile function is located at this address and its ordinal is 688. Now, how did x64dbg figure out the address of this function on the fly like this? It's actually pretty simple. We first take the base address of ntdll.dll in memory and add to it the RVA of the AddressOfFunctions array. And then we take the function ordinal, which is 688, and subtract the value of the Base member from it, which is eight. The Base is essentially the number from which the ordinals start. For example, if the value stored for a function in the AddressOfNameOrdinals array is 10, its actual ordinal is 18, which is that value plus the Base. So, now, after we subtract the Base from the ordinal, we end up with 680, the index of the NtWriteFile function in the AddressOfFunctions array. Now, each entry in this array is 4 bytes in size.
So, to get the address of entry 680, we need to multiply the entry number or index by four and add the result to the DLL base address and the RVA of the AddressOfFunctions array. The resulting address will be a pointer to the NtWriteFile function RVA, or the function's relative virtual address.
So, the last thing that we need to do is to add this RVA to the base address of ntdll.dll to obtain the full virtual address of the NtWriteFile function in memory. Simple as that.
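The name-to-address lookup just described can be sketched with three tiny, made-up arrays. In a real image the export directory stores RVAs to these arrays and to the name strings; the `export_table` struct below keeps plain pointers instead so the example stays short. All names, RVAs, and the image base here are hypothetical. Only the ordinal base of 8 and the ordinal 688 come from the walkthrough above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical view of an export directory. */
typedef struct export_table {
    uint32_t base;                 /* number the ordinals start from */
    uint32_t number_of_names;
    const uint32_t *functions;     /* AddressOfFunctions: function RVAs */
    const char *const *names;      /* AddressOfNames: parallel to name_ordinals */
    const uint16_t *name_ordinals; /* AddressOfNameOrdinals: indexes into functions */
} export_table;

/* Byte offset of a function's 4-byte slot inside AddressOfFunctions. */
uint32_t function_slot_offset(uint32_t ordinal, uint32_t base) {
    return (ordinal - base) * 4u;
}

/* Name -> index -> RVA -> virtual address. Returns 0 if the name is absent. */
uint64_t resolve_export(uint64_t image_base, const export_table *t,
                        const char *wanted, uint32_t *ordinal_out) {
    for (uint32_t i = 0; i < t->number_of_names; i++) {
        if (strcmp(t->names[i], wanted) == 0) {
            uint16_t index = t->name_ordinals[i];     /* same position as the name */
            if (ordinal_out) *ordinal_out = t->base + index;
            return image_base + t->functions[index];  /* RVA + base = address */
        }
    }
    return 0;
}

/* Hypothetical demo table with three exports and made-up RVAs. */
uint64_t demo_resolve_export(void) {
    static const uint32_t functions[] = { 0x1000, 0x1040, 0x1080 };
    static const char *const names[] = { "NtClose", "NtOpenFile", "NtWriteFile" };
    static const uint16_t name_ordinals[] = { 2, 0, 1 };
    const export_table t = { 8, 3, functions, names, name_ordinals };

    uint32_t ordinal = 0;
    uint64_t address = resolve_export(0x10000000u, &t, "NtWriteFile", &ordinal);
    assert(ordinal == 9); /* index 1 plus the base of 8 */
    return address;
}
```

With the numbers from the debugger example, `function_slot_offset(688, 8)` is 680 times 4, or 2720 bytes into the AddressOfFunctions array.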
And here's the code that does all of that. We first read the DOS header of the ntdll library to extract the e_lfanew member, which contains a pointer to the NT headers. From the NT headers, we access the optional header to read the address of the export directory from the data directory array. Using this address, we read the export directory, which contains the three arrays we discussed earlier. We then loop through the function names inside the AddressOfNames array until we find the NtWriteFile function. And once we do, we get the function index from the AddressOfNameOrdinals array and calculate the function address the same way we did earlier. Now that we have the address of the NtWriteFile function, we need to pass to it the necessary arguments before we invoke it. However, since Microsoft doesn't want anyone to mess around with their operating system, they completely avoid documenting NT APIs like NtWriteFile. But, luckily for us, nerds have already done the heavy lifting and documented most of the NT API function definitions. Here, for example, I have the function definition for the NtWriteFile API. So, we'll copy this into our code.
The next thing that we need to do is to get the system call number of the NtWriteFile API. This can be done in two ways. We can simply hardcode the syscall number in our code, but the issue is that this number can change from one Windows build to another. So, instead, we'll extract this number dynamically at runtime from the ntdll library. We just need the virtual address of the NtWriteFile API in memory, and we already obtained that here. This address literally points to the first byte of the NtWriteFile API stub.
Most of the NT APIs have the same identical stub. The only difference is the syscall number. So, here we make sure the first four bytes are equal to 0x4C, 0x8B, 0xD1, and 0xB8 respectively. If that's the case, we move the pointer four bytes forward and make it point to the fifth byte in the NtWriteFile stub. Now, since this syscall number takes up four bytes, or a DWORD, we need to typecast the pointer to a pointer to a DWORD and then dereference it here to obtain the actual syscall number. If we don't do that, we will end up reading only one byte of the four. By the way, if you actually read only one byte, it will still work because the syscall number for NtWriteFile is eight. But, if you are trying to extract the syscall number of another API and that number is greater than 255, it will fail because the maximum number that can be represented by one byte is 255. So, the best practice is to read the whole four bytes just to be safe.
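The stub check and the four-byte read can be tried out on plain byte arrays. This is a sketch on synthetic data, not the code from the video: `extract_ssn` and the `demo_stub_*` arrays are hypothetical names, the first array uses the NtWriteFile number quoted above (eight), and the second uses a made-up number above 255 to show why a one-byte read is not enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 4C 8B D1 = mov r10, rcx ; B8 imm32 = mov eax, <syscall number> */
const uint8_t demo_stub_small[8] = { 0x4C, 0x8B, 0xD1, 0xB8, 0x08, 0x00, 0x00, 0x00 };
/* Made-up number 0x1A4 (420), stored low byte first. */
const uint8_t demo_stub_large[8] = { 0x4C, 0x8B, 0xD1, 0xB8, 0xA4, 0x01, 0x00, 0x00 };
/* Bytes that do not begin with the expected instructions. */
const uint8_t demo_stub_other[8] = { 0x90, 0x90, 0x90, 0x90, 0x08, 0x00, 0x00, 0x00 };

/* Returns the syscall number, or -1 if the stub does not look as expected. */
int64_t extract_ssn(const uint8_t *stub, size_t len) {
    static const uint8_t prologue[4] = { 0x4C, 0x8B, 0xD1, 0xB8 };
    if (len < 8 || memcmp(stub, prologue, sizeof prologue) != 0) {
        return -1;
    }
    /* The immediate is a little-endian DWORD starting at the fifth byte. */
    uint32_t ssn = (uint32_t)stub[4] | ((uint32_t)stub[5] << 8) |
                   ((uint32_t)stub[6] << 16) | ((uint32_t)stub[7] << 24);
    return (int64_t)ssn;
}
```

For `demo_stub_large`, reading only the fifth byte would give 0xA4 (164) instead of 420, which is the truncation the transcript warns about.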
The last thing that we need to do is to pass the correct arguments to the NtWriteFile API. It takes in nine parameters. We only provide four and the rest are set to NULL. The first parameter is a file handle to write to. Since we want to write to standard output, we give it the handle to standard output using the GetStdHandle Win API. The fifth parameter is a struct just to hold the result code and the number of bytes written. The sixth and the seventh parameters are a pointer to the message that we want to print and the length of that message.
Now, we need to write some assembly code in order to invoke the syscall. So, in here, we have two functions. A function that moves the syscall number into a DWORD variable and a function that invokes the syscall, which also mimics the stub behavior in the NtWriteFile API. Now, to call both of these functions, we have to reference them using the extern keyword or directive. And down here, we pass the syscall number to the set SSN function, which will move the syscall number to the SSN variable, and then call the NtWriteFile syscall function, which will invoke the syscall and then return to finish execution.
Now, before you compile, if you're using Visual Studio, make sure that in the solution's build customizations the MASM assembler is checked and the assembly file is not excluded from the build. Also make sure the item type is set to Microsoft Macro Assembler.
Now, we can finally compile and build the solution. Once the build is complete, we can run the executable to get back our hello world message. I was also curious about the performance difference between this implementation of a quote-unquote printf and the normal libc printf. So, I measured the CPU cycles they both take during execution.
And it turned out that our implementation takes almost half the CPU cycles to finish execution. This makes sense because, as we discussed earlier, any normal printf call has to go through these layers before invoking the syscall, while our implementation completely bypasses these layers, jumps straight to the ntdll library, and invokes the syscall. Now, can a vibe coder use any LLM to generate such an implementation with one or two prompts?
Well, if you ask an LLM such as ChatGPT, Claude, or Gemini to write a hello world program for you in 200 to 300 lines of code on Windows, it will just generate random garbage for you. To get an LLM to generate a similar implementation for you, you need a prompt like this. And if you already have the knowledge to write a prompt like this, you're not a vibe coder. Case closed. Now, I think it's time for me to touch some grass. See you soon.
>> [music]