This video provides a clear and practical breakdown of complex CFG transformations that are usually buried in dense textbooks. It’s an essential watch for anyone looking to understand the elegant logic behind modern compiler optimizations.
Inmersión profunda
Prerrequisito
- No hay datos disponibles.
Próximos pasos
- No hay datos disponibles.
Inmersión profunda
reducing the irreducible (CFG)Añadido:
Yeah, in real code it's not very common because spaghetti code spaghetti gotos have to be in essence yeah I'm sorry about that but once the idea got in my head I just I just had to try it. Hello internet and welcome to the groundup coder. Welcome to the groundup coder. Today we're going to be talking about irreducible control flow graphs. So originally we this was going to be the stackifier video. In fact, I was just going to have one stackifier video, but as I've been learning more and more, I've realized that we need uh both preparation videos and also the stackifier video itself needs to be split up, right? So, why is this not the stack of fire video? Stackifier is easier to implement when you're working with a reducible control flow graph.
Strictly speaking, there are variants of it which can handle irreducible control flow graphs while you're working on it.
But for our purposes, when you know your control flow graph is reducible, it gives you a lot of it gives you more flexibility regardless of whether we do this before the stack of fire or within the stack of fire, we need to pick a strategy of how to handle irreducible control flow. If if we were trying to translate a programming language where irreducible control flow was essentially impossible, this wouldn't be so much of an issue. But what we want to support, I want to support go to switch and Duff's device and all that goody stuff. And that that does require dealing with irreducible control flow graphs. So that begs the question like what exactly is irreducible control flow. So it's reducible if every loop has exactly one entrance. Let's look at a concrete example. So you have some code here that turns into this control flow graph. Now if you follow the arrows you'll actually see that these three blocks they form a strongly connected component. So from the outside the only way to get in actually is here the while head block.
So you can get from this block to this block to this block and then this block unconditionally jumps back to this block. But this is kind of the classic demonstration of what you might see as a connected component. The thing that makes this this one reducible is that there's only one way to get into this set of blocks from the outside. You can you can have multiple arrows coming into this block but there can be only one block in which you enter from the outside. Intuitively what reducible gives you is that you can pick one of the blocks to be kind of the starting point of your loop. If you have multiple places coming in you kind of have to pick okay well how do you how do you start entering the loop? Uh so let's look at an irreducible graph. So yeah, so like let's look at the actual loop in here as well, right? Just just to to clarify what's what this on without looking at the control flow graph itself, let's first look at the the source, right? This is essentially a dowh loop. So from the source side, a reducible control flow means that there's more than one entry into the loop. This forces you to have some sort of control flow that will jump into the middle. So let's see what that looks like in the actual control flow graph.
You can actually see this loop here be reflected in these three nodes. The section starting label A is this block here. Section starting label B is this block here. If the condition is true, you branch into label B, which is this jump right here. If the statement was not true, then you you jump into label A, which is up here. This loop has two different ways in which you can enter it. If you're if you want to structure it with while loops, the while loop requires that you put something at the top. There's some point at which you enter the Y loop naturally top to bottom. But if you have two different entry points, you have to pick one. So if you have an irreducible control flow graph, how do you turn that into a reducible one? There are two general strategies. One is to clone some nodes also called node splitting and the second approach is to add a dispatcher node. Let me explain approach one first.
So what does it mean to clone nodes? So the problem fundamentally is that you have these two blocks say A and B and there are two different paths into the block, right? That that was kind this is kind of our fundamental tension here.
You have a loop and then there are two different ways you can enter it. the the way that we solve it by cloning it is we pick one of these entry point nodes as the canonical one and say okay when you're entering this loop you always enter through this this one and then how do you solve how do you solve the the the cases that fall through the other one well you you make a clone of that block and then it turns out you might need to make a clone of a few others as well and this set of blocks instead of looping it within itself will redirect all the cases where you'll fall into a to block A essentially you you kind of need to clone enough of the graph so that it will bring you back to A. In a simple case, you might only need to clone B. And because A will never point back to B prime, B prime is not part of any loop. The tricky thing about this is that what what do you do about the edges where B points into C because if if you don't clone C and if you just have B point into C, then you created the problem of of having a second entry point again, right? So you actually need to clone C as well and then have that map back into A. So the nice thing about this is that if you can do it, it gives you practically no runtime cost except for the fact that your code bloat. In the worst case, you actually can have an exponential explosion in the number of nodes. So there's a there's a bit of a simpler approach approach if you're willing to pay a little bit of a runtime cost. And the the idea is to have just instead of having a a cloning a multiple collection of nodes, you actually create only one extra node that serves as the entry point into the loop and dispatches back into all of the blocks that used to be the entry points. So what does this cost us? Well, you need one extra variable whenever you jump into one of the entry points. any any place you used to jump into the entry points, you actually need to update a single variable that indicates which one you want and then you jump into the dispatcher and the dispatcher will look at that local variable and and pick out which which block to jump into. So what does that look like in the graph? So here's here's our previously irreducible graph here. We had this this loop here.
Um we had two different entry points, label A and label B. Once we do the transformation, you'll see that we only have one extra node here. Everything else essentially looks the same, right?
Everything before the loop, everything after inside and after the loop looks very similar except that you have this one extra block that we just created.
This is the solution I ended up going with for for this one. So in a real production compiler, they actually do switch they use both. If the cost of node splitting is cheap enough, that's what they'll do. If the cost of node splitting looks like it's going to explode, then you'll go with the dispatcher approach. For now, I've decided to just go with a dispatcher approach and then come back and add node splitting if it turns out that's something that would significantly improve the the performance. And even with the dispatcher approach, there's there's some subtleties here. There are actually two different sets of edges that come into the entry points. One are the ones that come edges that come from outside the the the loop or the the strongly connected component. Every every edge that comes from the outside has to come into this this block, right?
So so here you see the then end if, right? But then there's a question like, well, okay, like the whole point of this is to funnel all the outside uh edges into the loop. Why not have all the other guys still keep the loop, right?
Because if you jump into the dispatcher block, you pay a price. Like you have to pass in dummy variables uh dummy values for the the branch that you don't use and you have to do a switch a switch on which which block to enter, which is, you know, you have to determine runtime which one you're entering. That's that's going to cost something, right? But you you you can see that like you have to have at least one edge up here, right?
The whole point is that you want to have a strongly connected component with with only one entry point. Well, if you don't make this block a part of the a strongly connected component, then you you're coming back to the same problem. All the old entry points are still the entry points, right? So, you need to you need to have some some of the edges be part of it. However, it turns out you don't need to have every edge go into it, right? So, like there there are some cases where you can avoid paying that extra price by not having some of the edges go back in. It turns out for some of the internal edges you can avoid them. There's an there's also even with a dispatcher approach there's an optim optimization we can do where we can pick out which edges actually do have to be routed through and which edges don't.
But it turns out to actually optimally figure that out turns out to be an MP hard problem. So even in production compilers you only do back edges even if that's not the optimal set. Okay. So what does this actually look like in code? The top level set of code actually is pretty straightforward to follow at Tarjan. This is a standard algorithm for finding the strongly connected components. You loop through the strongly connected components to find one that might be irreducible. If the size is length one, a single block uh strongly connected component can't be irreducible anyway. A single block is always reducible. So we skip those. But if we find one turn out to have more than one entries and entry is pretty straightforward to to check as well. You look through all of the the blocks. You check the predecessors of the blocks and then see if the predecessors are included or not. If you have a if a block has a predecessor that is not included inside the strongly component strongly connected component itself, then that block is an entry, right? So you can you can list all the entries fair in a fairly simple way. If you have more than one entry point, then then that's irreducible. Uh you have to do something about it, right? And then the thing you do is we insert the dispatcher there. And then we also have to we we have to make sure that we get all of them. Part of the tricky thing is that like if you create a new new block that changes the shape of the graph, fixing one irreducible sec might reveal another sec that is irreducible that you have to fix. There are some smarter ways to do this. In this case, I just wanted to keep things simple. We we we just go back and just make sure that as long as we see any irreducible components, we keep on we keep on repeating until we we get rid of them. So, so the hidden complexity here in this uh nice looking set of of lines that fit in one slide is this one here. Uh and it's not hard code per se like so in inserting that dispatcher node the actual code where you create the block itself and then at the very end you do something like this where like you need to take a snapshot of the previous entries and then you need to redirect their their edges. So conceptually it's it's not it's not hard per se. It's just a little bit messy and you need to kind of do it carefully, right? So what is the result? I guess this is going back to the the previous slide. The whole process essentially boils down to okay, you detect an irreducible graph. You add an extra uh uh block here. Make this block the sole way to get into the the nodes that used to be the entries and then now your your graph is reducible. Okay. So when will we see the stack of fire? Before I I kind of said that that this video was going to be the stack of fire video. But as I'm learning more, I'm as I'm studying the stack of fire more and more, I'm coming to realize there are just a lot of different pieces that you have to put together. It would be useful to have one video that conceptually explains what it is and what we're trying to achieve. And then there's uh the the tricky part of ordering the blocks, which I think deserves its own video. Um and then also something I kind of missed uh doing earlier which I think would have been useful uh because this is not a a standard C feature but many other languages have it which I think will be nice to to use when describing what the source looks like is is a concept of wild labeled while loops right so like if you've seen like Java code JavaScript has this too I think when you have like nested while loops and sometimes you want to break outside the outer loop that feature turns out to be very useful in order to describe what it looks like cuz that's essentially the web assembly looping primitives allow you to do. I don't know if the labeled while loops are going to be a separate video. Probably not, but I think it it does it does need to be covered at some point in one of the videos as well. So yeah. Um yeah, I hope that's better.
We'll see. We'll see. We'll see how it goes. Yep. Until next time.
Videos Relacionados
Agentforce NOW AMA: Build with React and Salesforce Multi-Framework
SalesforceDevs
490 views•2026-05-28
How agent o11y differs from traditional o11y — Phil Hetzel, Braintrust
aiDotEngineer
450 views•2026-05-28
Re: 🗣️📍theprophedu📍2026 GST 103 CLASS (E-EXAM REVISION)
theprophedu
636 views•2026-06-04
WEB TECHNOLOGIES UNIT-2 | Degree 4th sem BCOM Computers web technologies unit-2 full explanation💯✅
LearnwithSahera
1K views•2026-05-29
More tests are always better? How to use AI to identify tests that bring little value
Alliance4Qualification
335 views•2026-05-29
Search Algorithms Explained in 60 Seconds! 🤖💨
samarthtuliofficial
218 views•2026-06-01
People of Game of Thrones using JavaScript DOM
AltCampus
296 views•2026-05-30
Instagram accounts got PWNed
EricParker
13K views•2026-06-03











