This video demonstrates how a use-after-free bug in Way of the Samurai 3 causes crashes when entering player names, showing how developers can debug such issues by analyzing the call stack, understanding string encoding (UTF-16 vs UTF-8), and applying binary patches to fix memory management problems.
Why Typing Your Name Crashes This Game
I feel like being able to enter your name into a game is a pretty standard feature and hard to mess up. And yet, here we are. On my Discord, someone suggested we look at Way of the Samurai 3, as there's sometimes a null pointer dereference when you go to enter your name. Looking at the Steam forum, there seem to be a few posts with known issues and workarounds, but with mixed results.
First things first, let's buy the game.
15 quid for a game that crashes when you enter your name? I don't think I've ever spent this much on a game for a video.
Well, I guess we have no choice.
So, I can't reproduce the crash. It certainly lets me enter my name fine.
Back to the known issues Steam page, and it mentions quitting to the menu, but I can't see that as an option. Okay, I found a guide on how to quit to the menu, which also mentions the crash.
To go from the game back to the menu, you have to press space to open the menu (a Bohemian choice of hotkey), open the map, click world map, and then select leave Amana.
Simple.
Now, after we quit to the menu and try to start a new game and enter a name, we crash. This seems reliable and reproducible. First things first, let's see what we're working with. CFF Explorer says it's a 32-bit game.
Apparently, it's also generated a crash dump, which is nice. No idea where, but it's a nice thought.
Let's run it under a debugger. There is indeed a null pointer dereference here.
The EDX register is zero. Looking at the disassembly, we can see std::string base exception, which is the MSVC call for throwing an out of range exception from a string. So, this function is doing something string related, which makes sense if it crashes when entering a name. So, what does EDX hold on the happy path when we don't crash? Ah, the breakpoint isn't hit, so the logic must diverge somewhere up the call stack.
We're looking at the symptom, not the cause. We need to work our way up the call stack. At each entry, set a breakpoint, then run the game through the happy and unhappy path to see where it diverges. This is going to be tedious. Oh, actually, the calling function is called on both, which means the divergence is within this function itself. That was lucky.
Drilling down, we can see the jump here is taken on the happy path, so ECX is zero. Let's try and understand this function a bit more. This code has a loop, which increments an index by one each iteration. It then checks if that's greater than some value, and if it is, it throws an MSVC range exception. Based on how std::string is usually implemented, I suspect this is an inlined call to std::string::at. From this, we can assume that param1 + 0x40 is the length of the string. If we look in the debugger, it seems like it could be stringy. Okay, let's anchor this back to reality.
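To make the shape of that loop concrete, here is a minimal sketch of a bounds-checked character access, written against the standard library rather than the game's code. The names checked_at and count_chars are hypothetical; the real function is only visible as disassembly, so this shows the pattern, not the original source.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the inlined bounds check: at() compares the
// index with the stored size and throws std::out_of_range when it is too big.
wchar_t checked_at(const std::wstring& s, std::size_t index) {
    if (s.size() <= index) {
        throw std::out_of_range("invalid string position");
    }
    return s[index];
}

// Walks the string the way the decompiled loop appears to: the index grows
// by one per iteration and the range check is repeated every time.
std::size_t count_chars(const std::wstring& s) {
    std::size_t visited = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        (void)checked_at(s, i);
        ++visited;
    }
    return visited;
}
```

When a compiler inlines at(), the size comparison and the call to the throwing helper both end up inside the caller's loop, which is consistent with what the disassembly shows.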
Raymond Chen from the Old New Thing blog gives us the string layout in MSVC. By the way, if you want to fall down a rabbit hole of Win32 internal curios, you should definitely check out his blog. Anyway, if the size check is using param1 + 0x40, then we know the string itself lives at param1 + 0x30, which we get by subtracting the 0x10-byte size of this union. It's probably the member variable of some class. If we resolve the address of the union, then we can see the string value. It's UTF-16. You can tell by all the null bytes. So, technically, it's a std::wstring, not a std::string.
If none of that made any sense, let me attempt to elucidate. Unicode is the standard that attempts to codify every form of written text in human history, from cuneiform to hieroglyphics to emoji. The standard assigns a unique number, or code point, to the individual components of all languages. When you combine these code points, you get the unique graphical units of the language called graphemes.
So, we've got almost 300,000 code points. How do we represent these in a computer? Well, like endianness and line endings, there are multiple ways of doing this, and we've got a miserable fractured system. Arguably, the simplest is to start off representing all the lower values with a single byte, then overflow to two, three, etc. as needed.
This is called UTF-8 and is the default of Unix-like systems. However, Windows is different, and Raymond Chen has the answer for us again.
Windows adopted Unicode before most other operating systems. Windows used UCS-2 as the Unicode encoding.
This was the encoding recommended by the Unicode Consortium because Unicode 1.0 supported only 65,536 characters.
The Unicode Consortium changed their minds 5 years later, but by then it was far too late for Windows, which had already shipped Win32, Windows NT 3.1, Windows NT 3.5, Windows NT 3.51, and Windows 95, all of which used UCS-2.
Windows NT later shipped with UTF-16 support. UTF-16, as the name suggests, uses 16-bit code units. So, when it represents the lower end of the Unicode code points, which also happens to be the ASCII text most Latin-based languages use, you'll get these extra null bytes. In C++ land, std::string is roughly analogous to UTF-8, and std::wstring is roughly analogous to UTF-16. Although, there's more nuance than that, and the language encoding nerds will tell you how wrong I am in the comments.
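The extra null bytes are easy to reproduce. The sketch below serialises a short ASCII-only name both ways; the name "Sam" and the helper functions are made up for illustration and have nothing to do with the game's own code.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serialises UTF-16 code units as little-endian bytes, which is the order
// a debugger's memory view shows on x86.
std::vector<std::uint8_t> utf16le_bytes(const std::u16string& text) {
    std::vector<std::uint8_t> out;
    for (char16_t unit : text) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));         // low byte
        out.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF));  // high byte
    }
    return out;
}

// For pure ASCII text, UTF-8 is exactly one byte per character.
std::vector<std::uint8_t> ascii_utf8_bytes(const std::string& text) {
    return std::vector<std::uint8_t>(text.begin(), text.end());
}
```

For "Sam", utf16le_bytes yields 53 00 61 00 6D 00 while ascii_utf8_bytes yields 53 61 6D. Every second byte being zero is the giveaway in a hex dump.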
One final point in this drawn-out explanation: Microsoft has attempted to rectify this in the Win32 API. When you see a function ending in A, it's the ANSI version and takes a char. When you see a function ending in W, it's the wide version and takes a wchar_t.
Okay, so now that I've tricked you into learning more about text encoding than you ever wanted to know, let's get back to the bug. Oh, wait, one more thing that you'll find interesting.
std::string or std::wstring is effectively a dynamic array, so it's heap allocated. However, statistically, most strings used in a program are quite short, so library implementers use something called the short string optimization or SSO. This allows a small string to be embedded directly into the string object itself and therefore eliding the need for heap allocation.
And that's what this union is doing.
The reason I bring this up is that you can see this code here resolving the correct string address based on the length, i.e. checking if it's in the SSO or not.
But it's checking if it's less than eight, whereas the SSO size is 16. And after staring at this for a bit, I realized that the SSO buffer is always 16 bytes regardless of string type. So there's room for seven wide chars and a null terminator.
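The arithmetic behind that realisation can be sketched as a struct that mirrors the 32-bit layout described above. Everything here is an assumption for illustration: the type and field names are invented, the pointer is stored as a 32-bit integer so the layout is the same on any host, and the sketch tests the capacity field to decide where the text lives, which is an assumption about MSVC's behaviour rather than something read from the game.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the 32-bit wide string object; names are invented.
struct WideStringView32 {
    union {
        char16_t inline_buf[8];   // 16 bytes: 7 characters plus terminator
        std::uint32_t heap_ptr;   // 32-bit address when the text is on the heap
    } storage;
    std::uint32_t size;           // characters in use (param1 + 0x40 in the game)
    std::uint32_t capacity;       // characters that fit without reallocating
};

// The union is 0x10 bytes, so size sits 0x10 past the start of the string,
// matching param1 + 0x30 for the string and param1 + 0x40 for its length.
static_assert(offsetof(WideStringView32, size) == 0x10, "unexpected layout");
static_assert(sizeof(WideStringView32) == 24, "unexpected layout");

// The inline buffer is 16 bytes for every character type, so the number of
// characters it holds depends on the character width.
constexpr std::size_t inline_chars(std::size_t char_size) {
    return 16 / char_size;
}

// Assumed rule: text is inline while the capacity is below the buffer size.
bool uses_inline_buffer(const WideStringView32& s) {
    return s.capacity < inline_chars(sizeof(char16_t));
}
```

With one-byte characters inline_chars gives 16, and with two-byte characters it gives 8, which is where the comparison against eight comes from.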
Anyway, I just wanted to share my mistake with all of you.
The next bit of code just runs through each character in the wide string, and for each one it then loops through some other lookup. If the character is in the lookup, then it writes this offset to another array. Otherwise, it writes the constant 0x18.
In Ghidra, the first global value is zero, but at runtime we can see that it's 0x15. So we end up with an array of either 0x18 or 0x15 depending on whether each character of the string is in some other array.
The if statement that guards all of this is checking if the length of the lookup array is not zero. So when we crash, we've got a length, but the data itself is zero.
In fact, in this instance we have a massive size, but a null pointer data.
This all suggests some sort of use-after-free bug. The size variable is being filled by something else.
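Here is a rough sketch of that lookup logic and of why a stale size is enough to crash it. The Lookup struct, the classify function and the table contents are hypothetical; only the values 0x15 and 0x18 come from the analysis above. The extra null check on the pointer is hardening added by this sketch and is not what the shipped code does.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical shape of the lookup: a pointer plus a separately stored length.
struct Lookup {
    const char16_t* data;
    std::size_t size;
};

constexpr std::uint8_t kFound = 0x15;    // global value observed at runtime
constexpr std::uint8_t kMissing = 0x18;  // constant written otherwise

std::vector<std::uint8_t> classify(const std::u16string& name, const Lookup& lookup) {
    std::vector<std::uint8_t> out;
    // The game only tests the size. With a garbage size and null data that
    // guard passes and the inner loop dereferences a null pointer.
    if (lookup.size == 0 || lookup.data == nullptr) {
        return out;
    }
    for (char16_t ch : name) {
        std::uint8_t code = kMissing;
        for (std::size_t i = 0; i < lookup.size; ++i) {
            if (lookup.data[i] == ch) {
                code = kFound;
                break;
            }
        }
        out.push_back(code);
    }
    return out;
}
```

A Lookup of {nullptr, 0xFFFFFF00} models the crashing state: the size looks valid to a non-zero test, but there is nothing behind the pointer.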
We can patch this to always fail, i.e. not do any of this lookup. It's just flipping a bit in the binary to always jump to the else clause instead of conditionally jumping.
In order to exert some control over the binary, I'm doing something called DLL hijacking. Basically, I can see that the game links to dinput8.dll.
So if I create a DLL the same shape as that and drop it into the game's files, Windows will load it for me first. Now I can load the real DLL from the system files and proxy the calls through. But this does allow me the opportunity to execute any code I want within the game.
So here I patch the game's code.
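A one-byte code patch of this kind can be sketched as follows. The buffer stands in for the game's code section, and the specific bytes are an assumption: 0x74 (jz rel8) and 0xEB (jmp rel8) are real x86 opcodes, but the instruction actually patched in the game is not shown here. In a live process the memory page would also have to be made writable first, for example with VirtualProtect on Windows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kJzShort = 0x74;   // x86: jz rel8 (conditional jump)
constexpr std::uint8_t kJmpShort = 0xEB;  // x86: jmp rel8 (unconditional jump)

// Overwrites one byte, but only if it currently holds the expected value,
// so a different build of the game is left untouched.
bool patch_byte(std::vector<std::uint8_t>& code, std::size_t offset,
                std::uint8_t expected, std::uint8_t replacement) {
    if (offset >= code.size() || code[offset] != expected) {
        return false;
    }
    code[offset] = replacement;
    return true;
}
```

Checking the expected byte before writing is a cheap safeguard: if the game updates and the code moves, the patch refuses to apply instead of corrupting an unrelated instruction.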
Now we're into a new crash, which is possibly more use of the stale pointer.
There's another if statement which guards the crash site, but if we do the same trick as before and just patch it so the if always fails, then we don't crash, but we also don't get the character selection UI, which is less than useful.
There is a size check in the code, but it's using a signed comparison, so it will pass if the text is huge, which it is.
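To see why a signed comparison lets a garbage size through, consider this small sketch. The limit of 16 and the sample value 0xFFFFFF00 are hypothetical; the point is only how the same 32 bits compare under each interpretation.

```cpp
#include <cstdint>

constexpr std::uint32_t kMaxLen = 16;  // hypothetical limit for illustration

// Signed view: a huge bit pattern reads as a negative number, so it is
// "less than" the limit and the check passes.
bool passes_signed_check(std::uint32_t raw_size) {
    return static_cast<std::int32_t>(raw_size) < static_cast<std::int32_t>(kMaxLen);
}

// Unsigned view: the same bits read as a very large positive number,
// so the check rejects them.
bool passes_unsigned_check(std::uint32_t raw_size) {
    return raw_size < kMaxLen;
}
```

On x86 the two comparisons use the same cmp instruction followed by different conditional jump opcodes, for example jl (0x7C) for signed and jb (0x72) for unsigned.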
We can use another satisfying one-byte patch to make the conditional an unsigned check, which means large values are always treated as large positive values, not large negative values. And it seems to work. I bundled this up and sent it over to the chap on Discord. And it doesn't work.
He pointed out that switching between upper and lower case causes a crash, which is fair and something I did not check.
It's in a new place, and again, like before, I patched it to just jump over the problematic area, but then the grid isn't rendered correctly.
We need to go a bit deeper.
If we log the parameters for the crashing function, we can see it's a heap address, and it changes when you select between upper and lower case.
However, for a given run, it's always the same two values. The first four bytes of each allocation are a pointer to an array of characters, upper and lower case respectively.
However, if we stare at this long enough, we can see something interesting. The crash only happens when we click the done button, and when you click done, you get a blank UI element.
We also get a different pointer in the logs when we click done, so the blank UI element is a third state along with the upper and lower case states.
This has an invalid size when it crashes. Again, probably a use-after-free issue. The data here is always a null pointer. It's the size that's the issue. I developed an ASM patch that handles this, and we seem to be onto a winner. I sent this over to Discord, and they seem to agree.
Now, the correct fix for this is to track down the source of the use-after-free and properly zero out the state when the original data gets released. However, in order to get people enjoying their games again, this will do for now. That being said, if you want to see how I fixed a bug in a game that had a soft lock, then check out this next video.