Casey’s breakdown exposes the superficiality of modern post-mortems by distinguishing between a mere trigger and the actual architectural flaw. It is a vital reminder that true engineering requires understanding how a system fails at its core, not just describing the symptoms.
Casey breaksdown AWS outage (Whiteboard Edition) | Standup #40
This episode of the standup is going to be extra special because Casey is going to do the intro.
Casey, what are we talking about today?
Hello everyone and welcome to the standup. The number 40 five six best tech podcast on Spotify according to the most recent uh something.
True.
Uh anyways, I'm sorry. Today on the standup, I wanted to cover something.
I'm going to talk about the AWS outage that happened in October.
But I'm doing so because I kind of wanted to talk about a bigger thing, which is the idea of actually understanding something versus saying you understand something.
So like one of the things that happens a lot, especially I think to people who are earlier in their programming career, like if you're a junior programmer or something coming in, and I know this was certainly true of me,
is you want to seem like you know stuff. Right? Like you don't want to seem like you don't understand what's going on. So there's a lot of like external pressure, whether it's really there or not. You feel like you should kind of say that you understood something or pretend to understand something even if it's like a little bit hazy or you didn't quite get it.
And even if it wasn't your fault. Like even if the thing wasn't explained properly or didn't include like important information, you're still incentivized to basically act like you knew what it was, right? Cuz it just makes you seem smarter or something or at least doesn't make you seem junior, right?
And so one of the things that at least I've found as I got older and had more programming experience and things like that is nowadays I like almost over ask for things to be explained. Like I don't care about looking dumb at all. I'm like, wait a minute, go back.
Like I didn't understand that part. Like what do you mean by this or like what's that term mean or whatever.
Uh because now I just don't really care about that. Like I'm not as worried. And I want to actually know because I've had so much experience programming where I thought I knew something or I pretended I knew something and it came back to bite me. I'm like, I want to actually know. Like I want to be sure that when I have an explanation of a bug or I think I know the reason of a performance slow down, I always in the back of my head I'm like, if I haven't really gotten to the bottom of this, it could be something else. It could be It could be that the real thing is still hiding in there and I just don't know because I haven't really looked at it all the way. I'm just I'm moving on cuz it's convenient or whatever.
And so uh the reason that I wanted to talk about the DynamoDB outage is because recently there's been kind of a a string of high profile outages. So there was like a big one that took down Google and it turned out it was a thing where like they they didn't handle a field being empty, right? So that their programming, the way they were programming, they were like, okay, we have this thing we we load some JSON and if there's nothing in the JSON, it's just we like we deref a null pointer or something, right? It was like literally that, right?
Um and then there was one with CrowdStrike where they were like they took down the entire world with blue screens and that was they gave a very good It was like a really good explanation of it. They were like, we had Here's we do this certain array sizing thing and we had too many rules so it like overflowed the array, right?
And so these were like pretty good when they when they gave what they call RCA's or root cause analysis, right? When they said like, here's why we went down. When I read them, I didn't feel like there were a lot of unanswered questions in my mind. Like maybe I didn't know like literally the line of code that because they maybe didn't publish literally the piece of code, but they gave me enough that I was like, okay, I understand how someone wrote this code and I understand the stupid thing that that they did, right? That like, okay, don't do that thing. I understand and I'm totally like, okay.
With the DynamoDB one, because it came up on this podcast, right? We talked about it when that that dude at the Guitar Center, right? Was like, I overheard someone talking.
>> Yes.
>> [laughter] >> At the pub, right?
>> Yes. Incredible.
Here we see the elusive programmer, a simple creature that spends most of its time working alone, often in darkness.
But what's this? Someone being wrong on the internet. [music] Our coder springs into action, reaching top speeds of 120 words per minute before flash. A light mode website, the natural enemy of these code lovers, stuns our friend. The chase is called off. We'll have to get them next time.
When not on their computers, [music] they can spend hours drawing crude symbols on something they call whiteboards.
Researchers [music] have discovered thousands of dialects, often with more than a dozen used in a single office.
However, no linguist [music] has yet deciphered what their purpose is.
Vain creatures, their bodies have evolved over a millennia to be able to sit in unusual postures while looking at themselves on monitors. This will often last for many hours [music] using the excuse they're waiting for code review.
When pressed to why they're so inactive, and finally, after a long day [music] of accomplishing very little, our keyboard warrior's ready for bed.
Quick read and it's lights out.
Good night, little coder.
So I was kind of a little more motivated about that one to go like, okay, let me go see like what how much information they have posted.
Uh and I had read I had already kind of read afterward they had a summary where they posted an RCA and it was very vague. Like the RCA just did not really explain very much.
I then noticed that they posted a a full presentation like at re:Invent in December. They or I guess I don't know if re:Invent was in December, but the video went up in December of the re:Invent presentation where they covered this outage. So I went and watched all of that.
And after having read the entire RCA and watched the entire presentation, I still was left going, I don't see an actual explanation of the bug here, right? Like I'm trying to figure out what the actual bug was and it just wasn't ever explained. And so what I kind of wanted to do was just talk about that, go through why I don't think they explained what the bug was and just use that as an example of like, I don't think people should just go, oh okay, I get what the bug was. Cuz people have like replied to me and gone, oh, let me explain to you what the bug was. And then they just explained the same thing as the presenter. I'm like, that's not the bug, right? So everyone is like incentivized to go like, I understand it. It's like, no, if you can't tell me what the actual bug was, then we're not done here, right?
Like we should have that fuller explanation. So does that all make reasonable sense like what I'm saying?
Yeah, first off I just want to say I knew exactly what you WERE SAYING, CASEY.
>> [laughter] >> LIKE RIGHT FROM THE START, RIGHT, TJ?
Like right away you you were like, okay, I know I know what the >> exactly what you're saying. No questions on my end. No blockers. Thanks, everybody. I'm I'm great. I'll see you guys tomorrow. You know, no problem. I just want to say I really like listening to Casey talk. On the podcast when I listen on Spotify, but also just right now. Like I could listen to you talk for an hour. Great shoutout too for the Spotify podcast.
>> I was just going to say I was going to say like especially when you listen on Spotify.
Your choice. The quality is incredible.
>> [laughter] >> You also get the bonus extras, right?
You get all the banter before and after the actual main show.
>> Oh, really? Yeah.
>> [laughter] >> We started posting longer longer versions on Spotify that are like more of the extra yap time. Less of the on topic It's not on topic stuff, but a little more yappenings Yes. Because the live audience gets the yappening. They get to come in here. They get to hear about Trash and his Pokémon addiction, which you probably don't even know about because you were you were listening to this on YouTube, right? You don't you you don't get to hear all the fun stuff.
There are but that's kind of That's kind of a hard sell for the first 10 minutes of a YouTube video. It is a very [laughter] hard sell for a YouTube video. Be like, I'm going to watch four guys talk about something I don't even understand and it's called DynamoDB.
>> [laughter] >> Yep.
>> Since we're starting the podcast, maybe we should introduce Adam. Oh yeah.
That's a very good point. We haven't done introductions at all.
>> Hello.
Tell us a little bit about why we brought you on to the podcast today. Cuz I'm at [clears throat] TJ's house.
And [laughter] then you you have to That is number one. That is number one reason.
Why you are not for this today. TJ requires all people who visit his house to be on the podcast. It's been >> [laughter] >> awkward at a couple times. Yeah.
Yep. Okay, but Adam, who are you really other than an AWS hero? I'm not even that. I was an AWS hero. RIP. Too soon.
You get kicked out of the superhero group? Like how's that work? You just don't you don't get renewed. I was a one term hero.
And they decided >> Oh, you Is it like a paid up thing? You pay to be a hero?
>> No. No, I just did I didn't really care about AWS anymore or talk about it ever.
So they were like, maybe he's not a hero anymore. That's fair. Now he's a villain. Woah.
Casey looks like he's part of like some murder mystery. He's standing there.
Yeah. Oh, dude, we're we're about to get uh like the What is it? Nick Hill?
What's the person that does all the like drawing on the board and then it shows up? Casey Muratori. That's what you're thinking of. Muratori. Is [laughter] it Muratori or is it Moratori? Oh my god, you're about to do visuals, aren't you?
Yes. I know. This is the best podcast.
It's literally this is the best one to be a part of.
>> [laughter] >> Uh it's pronounced Meeratory by my family, like almost like there was a Y there, like Meeratory, but >> it doesn't really make any sense because if in Italian, it's an Italian name, and in Italian it'd be Meeratory. Meeratory.
No, no, no. It's Meeratory in Italian.
It's Yeah.
>> [laughter] >> Doesn't make any So, why how it got Meer, I have no idea. That was some Italian-American like immigrant thing that happened, I guess.
>> I don't know.
So, here's effectively what they said.
They have these things called API endpoints, they call them, right? And these are the domain address, like if you look up in DNS, it's the name that you're going to look for to know who you're supposed to send like your DynamoDB requests to. And these things, I guess, look like this.
And Adam can probably confirm this because he is or was a uh hero.
They look like what Oh, it's behind. Yeah, we're we're a few seconds behind cuz our video disappeared on Riverside.
>> yeah. Okay.
Oh, there we go. So, they look like dynamodb.us-east-1.api.aws or something like this. And I guess it depends whether you're using IPv6 or IPv4, like they have different names depending on things, or whether you're using a specific one, like they talked about how governments use a different one or whatever. So, these names are names that you effectively hardcode, I guess, into your application where you're like, "When I need to do something with DynamoDB, I'm going to ask for this." Does this make sense? And does that sound right, Adam? Cuz I don't use AWS stuff, right?
Yeah, yeah. That's all right. So, you know, you you ask for something like this, and you're going to send >> perfectly.
>> [laughter] >> What? That's just I do. Sorry. I mean, I know what he's saying. Yeah. Yeah.
So, that then is going to redirect you somewhere because obviously there isn't like one machine that's going to handle all the DynamoDB traffic in the entire universe, uh even if you subdivide it by region, which you can see here, you're kind of supposed to pick a region. I guess you don't You don't send it to some main address, you send it to a regional address, or maybe you there is a main address you can use that will figure it out, I don't know. But anyway, at some point you're talking to this, and this needs to point to effectively like a load balancing scheme.
So, this thing is supposed to point to effectively what they called a DNS tree, although they never really explained the tree nature of it at all. It sounded more just like a like a a weighted array, if you will, where you just said, "Here's a bunch of machines, and you're going to pick those machines based on weights that we set so that we can load balance, right? So, if a machine gets behind, maybe we set its weight to lower, and if a machine seems kind of empty, we set its weight to higher."
And so, they called it a tree, so I'm assuming it's a tree. They never explained what the tree part of it was.
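As a rough illustration of the "weighted array" reading described above, here is a minimal Python sketch. The host names, the weights, and the `pick_host` helper are all hypothetical; the presentation did not show real values or explain how the tree is actually laid out.

```python
import random

# Hypothetical hosts and weights. A weight of 0 takes a host out of rotation,
# a larger weight sends it proportionally more traffic.
weights = {"host-a": 5, "host-b": 3, "host-c": 0}

def pick_host(weights, rng=random):
    """Pick one host at random, in proportion to its weight."""
    hosts = [h for h, w in weights.items() if w > 0]
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]
```

Lowering a weight for a machine that is falling behind, or raising it for one that is idle, is then just an update to the dictionary.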
But this name is supposed to point >> for one quick second? Uh by the way, someone did get their L6 promotion based on that tree. So, I do think next time you should find out what that tree is cuz that meant a lot to somebody, okay?
There was a package, and engineers happened.
>> [laughter] >> Okay. I do agree the tree is probably important. It's just not important for the bug. And even the That So, that I will say there was no need for them to to explain the tree, so I'm okay that they skipped out on what the tree is doing.
Um but but >> question as well. Yes. Is it called a tree because it's a root cause analysis or no? Yeah, perhaps. [laughter] Yeah.
Okay, no more jokes, we're too off topic. I'm sorry. I'm sorry, guys. I'm sorry.
So, anyway, this is supposed to point to that. And uh that that sort of uh uh this this load balancing scheme basically of DNS entries.
And the way that they described this in like their presentation is they would use a thing like, we'll say plan145.
uh dynamodb, like ddb.aws, right?
Now, this is the root of that tree, I guess. Not root cause analysis, but this is the top-level record of a bunch of records that allow it to do its load balancing. And I assume Route 53 kind of has this load balancing capability. I'm reading between the lines of the presentation, they didn't say that outright. But I'm assuming Route 53, which they're doing all this through, you know, which is their own DNS thing, allows that load balancing to happen by you just setting stuff up in here that says how the load balancing should sort of be working right now, and then it will pick the correct machine based on like some kind of randomization in the weights or whatever.
Now, what they said was this >> [clears throat] >> name, which really does exist, and apparently there's a tree or something like this, this name is one that they just kind of used for the presentation. They never actually used a human-readable name for this plan like 145 that I've written here or whatever.
It was really a hash of something. So, it would really be like, you know, 0afe12, you know, 9a or something like that, right? Is actually what would be there.
So, if you went and looked, you would not see a human-readable name, or at least at that time you wouldn't, I guess. You wouldn't see like plan 145, you'd just see that.
And so, the idea was, "Okay, a user goes to use it. They query this name. Route 53 will direct them like to here, and this thing is some kind of a load balancing tree that Route 53 can use that will allow you to get where you need to go, right? To They'll give you an actual machine you can send traffic to eventually."
Again, I they did not describe any of that, so I have no idea how any of that works. I've never touched or used Route 53, so I have no idea. But we'll just assume that that happens because it doesn't matter for this bug. Uh we do have an AWS hero. So, if you do if you are confused, you can always uh ask Adam, and he may have further insights.
I mean Yeah, go for it. Well, Route 53 does have a lot of different ways you can like split the traffic. So, yes, weighted is one of them, and that sounds like what they described. So, somehow they've set up these records with that, and they just didn't say how, but something Something in a tree format >> [laughter] >> did [clears throat] that. My guess is there's like a weight like the tree has like weighted like there's a couple weights at the top that branch out to more weights or something like that because that's easier for it to deal with cuz if there's a lot of them or something, who knows? Anyway, I have no idea. Point being, this is what's supposed to be happening normally.
Now, the reason that this is called plan 145 here, even though it actually would have been some hash code, but they referred to it as like plan 145, is the load balancing, >> [clears throat] >> as you might imagine, has to be kind of continuous because the DynamoDB machines are like doing stuff all the time.
They're becoming more overloaded, there's machines are going down or crashing or who knows what, right? Could be happening, being taken offline.
>> [clears throat] >> New capacity can be added. And so, this stuff has to be updated constantly, like all the time.
So, this main API endpoint that you connect to, it constantly has to have that tree that it's pointing to be adjusted.
And so, the way that they do that is they create another tree, the tree that they're going to move to, right? They create like, you know, plan 146 or something.
And they make the whole tree here, and then when they're ready, like when this tree is done, they take this, you know, this record here, and instead of it pointing to that one, they point to this one, right? They make the new one, and they move over to it by just changing that name.
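The build-then-repoint step described here can be sketched in a few lines of Python. This is only a model under stated assumptions: the `records` dictionary stands in for a DNS zone, the `planNNN.ddb.aws` names are the presentation's placeholder style (the real names were hashes), and `enact` and `resolve` are hypothetical helpers.

```python
ENDPOINT = "dynamodb.us-east-1.api.aws"

def plan_name(n):
    # Placeholder naming; the real records were named by a hash.
    return f"plan{n}.ddb.aws"

def enact(records, n, weights):
    # Step 1: build the complete new tree under its own name.
    records[plan_name(n)] = dict(weights)
    # Step 2: only once it is complete, repoint the endpoint at it.
    records[ENDPOINT] = plan_name(n)

def resolve(records, name):
    # Follow name -> name pointers until we reach a plan's data (or nothing).
    target = records.get(name)
    while isinstance(target, str):
        target = records.get(target)
    return target
```

Because clients only ever follow the endpoint name, they see either the old tree or the new one, never a half-built tree.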
Now, for some reason, and this reason is not really explained, the way that they've set up that process is they split it into two pieces.
There's something called a planner, which figures out what the new tree should look like, basically. So, you can imagine there's some machine called a planner, and I don't know if it's an actual machine or if it's just a process running on some machine that's running other things, who knows.
But there's something called a planner, and as far as I could tell, there's only one.
Meaning there's just a planner that sits there and figures out what should the new plan look like that we're going to switch to.
And it's constantly doing this. So, it generates plan 145, then it generates plan 146, then it generates 147, 48, 9, you know, blah blah blah blah blah, right?
And it just keeps putting out plans for all of eternity cuz that's its job.
Now, it never actually creates them, apparently. Its job is not to ever make them in Route 53. It's just to figure out what they would be if someone were to put it in Route 53.
Then, they have three enactors.
These enactors get the plan from the planner, and they put it into Route 53.
Does this make sense? Now, one planner, as far as I understand the presentation, three enactors.
There was no explanation for why this would be the case. They said the reason there are three enactors is because it's supposed to be fault tolerant, like if one of them goes down or something, but they never explained why you wouldn't then need three planners because if the planner went down, then the enactors have nothing to enact, so it didn't really make any sense. So, there wasn't an explanation in the thing about why the structure looks the way it does.
It's not really that important to the bug that it looks this way, although it kind of is, as we'll see later, so I was a little weirded out by the fact that they didn't justify this, but that's fine.
So, hopefully that makes sense. We have a planner, we have three enactors, the enactors are all trying to enact this plan, right?
Now, what happens here, again for reasons where the only thing they said in the presentation was that it makes it easier to reason about.
This is the >> [laughter] >> only information we got. They said it makes it easier to reason about.
Because it makes it easier to reason about, these enactors use serialization.
So, instead of them just trying to create records, and if the records are already there, just not creating them or something. In other words, I have three people running. We all want to create, you know, let's say this top-level record plan146.ddb.aws, right?
We all are trying to do that. One of us does it first, the next person tries to do it and it's already there or something, right?
We're all trying to create the same record, so in theory we could just have three people randomly hammering on whatever part of the plan they're trying to hammer on, and in theory it should kind of all work, right?
And I sort of got the sense, although he didn't come out and say it, I sort of got the sense from the presenter that he would agree with what I just said.
Meaning that they could have just had them run arbitrarily and it would or should be okay.
But, he said, they use serialization to make it easier to reason about. What that means is instead of these enactors just hammering on it like that, what they do instead is they attempt to acquire a lock for whatever the endpoint is that they're trying to update.
So, in other words, if this if this person is trying to update one of these things, and I I got the sense that it was if you're trying to update this one, but it could have been if you're trying to update this one, or it could have been on both. I They never really 100% said, if I remember correctly, exactly where the locking was occurring.
But, the locking occurs by them going, "Okay, I'm going to create a a lock that is a DNS record."
And by using the fact that Route 53 has the idea of an atomic change, which is, you know, I can do two things and if they wouldn't both succeed, then it won't do either of them.
They basically made a locking system that locks via Route 53. So, Route 53's DNS records are actually the lock record, if that makes sense. Can I Can I ask a quick question? Yes.
>> You said it does this through serialization?
I don't quite understand what that means because I thought serialization is just converting from one memory representation to a different memory representation of something, and I'm struggling on that part. >> Different serialization. So, yes, that is serialization.
>> Oh.
Say it again. That's serial. Okay, yeah.
In this case, we literally mean temporal serialization, meaning they wanted these enactors to have some kind of a way in which they would organize their behavior into an order rather than just being arbitrary. And the way that they did that was locking. Okay. So, what will happen is instead of this person just doing whatever it is they're going to do, like, "Okay, I finished this. I'm going to point this guy at plan 146 now." Instead of doing that, it attempts to acquire a lock on like this, right?
And if it doesn't get the lock, it won't make the change.
So, only one of these enactors can be in the process of updating this at any given time.
Does that make sense?
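To make the "lock that is a DNS record" idea concrete, here is a small in-memory Python model. It is a sketch under assumptions, not how AWS implements it: the `Zone` class, the action names, and the `LOCK` record name are all hypothetical, and the only property borrowed from the discussion is that a batch of changes applies entirely or not at all.

```python
class Zone:
    """In-memory stand-in for a DNS zone with all-or-nothing change batches."""

    def __init__(self):
        self.records = {}

    def change_batch(self, changes):
        """Apply every (action, name, value) change, or none of them."""
        staged = dict(self.records)
        for action, name, value in changes:
            if action == "CREATE":
                if name in staged:      # creating an existing record fails
                    return False
                staged[name] = value
            elif action == "UPSERT":
                staged[name] = value
            elif action == "DELETE":
                if name not in staged:  # deleting a missing record fails
                    return False
                del staged[name]
            else:
                raise ValueError(f"unknown action: {action}")
        self.records = staged           # commit only if every change passed
        return True

LOCK = "lock.dynamodb.us-east-1.api.aws"  # hypothetical lock record name

def try_acquire(zone, owner):
    # Whoever creates the lock record first holds the lock.
    return zone.change_batch([("CREATE", LOCK, owner)])

def release(zone, owner):
    # Only the current holder may remove the lock record.
    if zone.records.get(LOCK) == owner:
        zone.change_batch([("DELETE", LOCK, None)])
```

Note that this model has no timeout, so a holder that crashes keeps the lock forever, which is exactly the kind of question the presentation left open.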
Mhm. Now, again, exactly what they were trying to do with that was never explained. They just said it makes it easier to reason about and left it there. So, I don't know why they thought this was an improvement. And amusingly, it's what ends up uncovering the bug, so it wasn't an improvement. If anything, it was probably bad. But, So, Casey, are you saying they don't have a good reason for, they're saying we're going to make the enactors run almost like one at a time. Why do they have three enactors?
I don't under Like, why do they not just have one? They just don't say that. We don't know why. And they didn't quite explain, like I didn't really hear an explanation for how you have three concurrent enactors, you expect them to be able to go down, which is why you have three, Right. but they're taking a lock. So, what happens if this guy takes the lock and then goes down? Like, I didn't hear an explanation for that either. So, this was all very confusing to me. Like, I'm not complaining about it as part of what we're talking about here because it's not important for the cause to me, but as a presentation, I had so many questions. Like, I was like, "I don't understand why you did any of this." To be completely honest, right?
Um And maybe that's again part of it could just be that I don't use AWS services.
It might be that some of these things would be obvious if you are someone who regularly uses Route 53 or something.
You'd be like, "Oh, it's because locks can be set to time out," or I mean, I don't know, right? Um But, anyway, So, yeah. So, they're doing that.
And what ends up happening, the thing that uncovers the bug, is that these enactors, when they don't get the lock, they just do like a back off, right?
They'll basically just do like, "Okay, let me wait and I'll try again."
So, this enactor tries to get the lock, but somebody else already has the lock, so he just waits a little while, and he tries to get the lock again.
That's what will happen, right?
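A back-off loop of the kind described might look like the following Python sketch. The function name, the exponential-with-jitter delay, and the attempt limit are assumptions for illustration; the presentation only said the enactors wait and try again.

```python
import random
import time

def acquire_with_backoff(try_acquire, max_attempts=5, base_delay=0.1,
                         sleep=time.sleep, rng=random):
    """Call try_acquire() until it succeeds or attempts run out.

    Returns the number of attempts used, or None if the lock was never won.
    """
    for attempt in range(max_attempts):
        if try_acquire():
            return attempt + 1
        # Wait a random time that grows with each failed attempt.
        sleep(rng.uniform(0, base_delay * 2 ** attempt))
    return None
```

Nothing in a loop like this looks at how much time has passed since the caller picked up its plan, so whatever plan it was holding when it started is the plan it will apply when it finally wins.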
And what they said happened was they hit a pathological case, quote unquote, where one of the enactors, you know, has enacted some plan, and that plan, let's say, was pretty old. I think 110 was an example that they used. So, it enacted plan 110.
And it wants to point, you know, it's like, "I got to set the API to point to my 110."
Tries to get the lock to update dynamodb.us-east-1 or whatever, and fails because someone else is enacting plan 111 or something like that, right?
Or plan 109. Could have been a previous plan.
So, the other enactors are doing it. It can't do it. It backs off, right? And remember, this enactor here, we're on 110. It really wants to enact it. It tries again, someone else has the lock now. It tries again, still locked. This person sitting on 110, desperately trying to enact it, can't do it. Apparently, this just happened so many times, and the planner is just churning out new plans this whole time, right? The other enactors, they get up to like 145 or something and 146. They're enacting plans that are like way ahead of 110, right? And this guy's still stalled because he just unluckily never gets the lock, right?
Finally, at some point after like plan 145 has already been enacted and pointed to by some other enactor and all that stuff, plan 110, this enactor, it's still trying to do it, finally gets the lock.
And he's like, "Yeah."
And so then he says, "Okay, we're pointing to 110 now. Yes, right?"
So, now it's on a super old stale plan.
But, this really shouldn't be a problem, right? Because eventually, the next time some enactor has something, it's going to be a much later plan. They'll just enact plan, you know, 146 or seven or eight or whatever, and we'll repoint it back to this, and we're back to a fresh plan. So, everyone will just have bad load balancing for like a few minutes, but then it'll be fine, right? They did have bad load balancing for at least a few minutes, right?
>> Yes, true.
Uh well, it's a lot worse than that.
>> [laughter] >> That's what was supposed to happen, right? Meaning that's how they would expect this to work, too. Mhm.
Okay. The problem is they also didn't want Route 53 to become clogged with all of these records. Because if they just left them around, eventually, after, you know, 3 months, you have like 8 billion records that you stuffed into Route 53, because every, you know, couple minutes you're putting in this big tree of weights and stuff.
They were like, "Okay, at some point we should just clean up these plans."
So, enactors also look for plans that are older than a certain amount. And if they are older than a certain amount, they'll delete them.
So, what happened was they pointed to plan 110. This enactor finally gets the lock. It points to 110. Another enactor is like, "Oh, wow, 110. Man, that is old. We should get rid of that." And deletes it.
So, now [laughter] dynamodb.us-east-1.api.aws is pointing at a record that can't be resolved, right?
It's just something, again, it wouldn't actually look like plan 110. It would look like 0afe129a, some hash. Right? Dot ddb.aws.
Mhm. But, it's pointing at that name, and if you ask for that name, you get nothing.
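The sequence just described can be replayed in a tiny Python model. Everything here is a simplification under assumptions: plan numbers stand in for the hashed names, "older than a certain amount" is modeled as "more than `keep` plans behind the newest", and `apply_plan`, `cleanup`, and `resolve` are hypothetical helpers.

```python
ENDPOINT = "dynamodb.us-east-1.api.aws"

def plan_name(n):
    return f"plan{n}.ddb.aws"  # placeholder; the real names were hashes

def apply_plan(records, n):
    # The delayed enactor repoints the endpoint without checking plan age.
    records[ENDPOINT] = plan_name(n)

def cleanup(records, newest, keep=10):
    # Another enactor deletes old plans without checking what is in use.
    for n in range(newest - keep):
        records.pop(plan_name(n), None)

def resolve(records):
    # Follow the endpoint to its plan; None means "no records found".
    return records.get(records.get(ENDPOINT))

records = {plan_name(n): {"host-a": 1} for n in range(110, 147)}
records[ENDPOINT] = plan_name(145)   # healthy state

apply_plan(records, 110)             # stale enactor finally wins the lock
cleanup(records, newest=146)         # cleanup removes plan 110 right after
print(resolve(records))              # the endpoint now dangles
```

The endpoint record still exists and still names plan 110, but the plan behind it is gone, so the lookup comes back empty.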
So, what would happen at that point is everyone who was trying to get an endpoint to send stuff to would get back an unresolvable name, basically, right?
And I don't really know what happens in Route 53 when that occurs, but you would basically be getting back something that you either couldn't use or just got gobbledygook for an IP, who knows? But, whatever it was, if you attempted to actually use it, you weren't going to get a response, >> Interesting. Is this because AWS doesn't use enough Rust? Because that's obviously a use-after-free bug. Yes.
>> I think Rust would have solved that, right?
>> If you rewrote Route 53 entirely in Rust, obviously, all of these problems are not there. Uh No, to be specific, I do think in the presentation they did say, not about Rust, but they did say what would happen specifically, which is I think when you asked for either this thing or this thing, I don't know which one they were referring to cuz I can't quite remember, you would just get a response back, I think, that says no records found.
So, that's That's the end game of what would happen, whether it was from asking for this or asking for that, I'm not sure, but they'll just get back no records found.
That's what That's what you would have received when your when you were trying to call that API. So, whatever like whatever like library you were using to like use this to use DynamoDB, it would just be like, "Hey, no records found, bro. Sorry." Right?
So, this, if you ask anyone on the internet, right? Um they're all like, "Yes, they explained the bug." That That's the bug.
Like, the bug is that there was this race condition, right? Everyone cuz everyone as soon as you say race condition, everyone's brain shuts off.
They're like, "Oh, okay. Well, it was a race condition. Done. Like, nothing to see here." Right?
Um so, they're like, "It's a race condition. They explained it." It's like, "No, they didn't explain it."
Because, if you think about what would happen here, immediately after this, while everyone's getting this, a new enactor will just enact a new one, right?
And so, the bug, right? Is why didn't that occur? That's the actual The actual RCA that I wanted to see is why didn't the next enactor come and fix it? Can I Can I throw out something else? Wouldn't Wouldn't also be a bug like why write a record so old that it should be deleted immediately?
Well, it wasn't. It was cuz it was it was this guy had written it quite a long time ago, and it was it the way it Well, I mean, if you're asking why didn't they write enactors with better code? Yeah, that's a very good Okay. Okay. Cuz that's >> [laughter] >> Okay, fair. Cuz it seems like if you're updating to something that should be deleted immediately, isn't like that's like that feels like the problem right there. You've done something wrong long before. Yeah, even though it doesn't really fix the theoretical structure of this thing, a simple check in this guy when after he finished backing off on the lock, he should maybe check to see whether he's about to set this to something that he would delete if he was running his deletion [laughter] code is probably a good safety measure. But, yeah, so 100% agree with that. Okay.
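The safety measure suggested here, checking before the write whether the plan is one the cleanup code would delete, could be sketched like this in Python. The `is_stale` rule (more than `keep` plans behind the newest) and the helper names are assumptions; the real age criterion was not described.

```python
ENDPOINT = "dynamodb.us-east-1.api.aws"

def plan_name(n):
    return f"plan{n}.ddb.aws"  # placeholder; the real names were hashes

def is_stale(n, newest, keep=10):
    # One shared rule, used by both the cleanup code and the apply path.
    return n < newest - keep

def apply_plan_checked(records, n, newest, keep=10):
    """Repoint the endpoint only if plan n exists and cleanup would keep it."""
    if is_stale(n, newest, keep) or plan_name(n) not in records:
        return False  # refuse to point the endpoint at a doomed plan
    records[ENDPOINT] = plan_name(n)
    return True
```

As noted in the conversation, this guards the endpoint without explaining why a later enactor did not simply repair it, so it is a safety net rather than the root cause.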
Okay. That enactor worked really, really hard to get that record written.
>> [laughter] >> YEAH. IT'S BEEN WAITING A LONG TIME, AND IT'S GOING TO have its Pokémon cards.
Anyone ever waited? So, just let him write the record, okay?
So, so I want to hear about that.
Unfortunately, if you look at the presentation, and you look at the RCA, it's nowhere to be found. The presentation at least has one 12-second little tiny chunk where it does say where the bug roughly would be. And so, let me explain what that is.
So, here's what apparently occurs alongside this. When you take DynamoDB US East 1 and point that at your plan, you also do another operation at the same time.
And that operation is to set rollback.
Uh I think it's uh dd Is it ddb.aws?
I don't remember exactly what it is here. There is a rollback record.
It sets that record to whatever the old plan was. So, if we were here pointing at 145, and we're now going to point at 110, right? This old enactor is like, "I'm moving to the 110."
It attempts to take whatever this name currently was, which would have been plan 145, and move that so that the rollback address points at the old plan.
Right? And this is just for debugging, or, you know, it's basically just for operator ease, right? If they want to roll back to the previous plan or something like that, or if you just want to know what the previous plan was, you can see it here, right?
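As a rough sketch of the two writes described here, assume the DNS records are a plain dictionary. The record names `ENDPOINT` and `ROLLBACK` and the plan names are placeholders for illustration; the real record names are not known from the discussion.

```python
ENDPOINT = "endpoint"  # placeholder for the live DynamoDB name
ROLLBACK = "rollback"  # placeholder for the rollback record


def enact(records: dict, new_plan: str) -> None:
    """Point the endpoint at new_plan and remember the plan it replaced."""
    previous = records.get(ENDPOINT)  # whatever the endpoint pointed at before
    records[ROLLBACK] = previous
    records[ENDPOINT] = new_plan


records = {ENDPOINT: "plan-145", ROLLBACK: None}
enact(records, "plan-110")
print(records)  # {'endpoint': 'plan-110', 'rollback': 'plan-145'}
```

Because there is a single rollback slot, each call overwrites the previous value, so only one step of history is ever kept.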
Mhm.
That's part one of what they said about the failure. One thing I would want to point out here is that this also didn't make any sense to me, because I was like, "Okay, you're telling me that these things update every minute or something.
What good is it to have one of those?
By the time you even logged in, it's been updated from the one that you wanted to roll back to, to some new thing that's actually the plan you don't want because everything went down, right?"
Like, you don't want this. You just want these names in a list so you can be like, "What was it at 12:30?" Like, that one, right?
>> [snorts] >> So, this made no sense to me. I have literally no idea why this would ever be good, right? It did not sound like it would do the thing you actually want, which is to be able to mark a point in time and go, "We need to go back to 1:00 p.m. because everything went to crap after that." Right? Anyway, so that didn't make any sense to me, but again, it's not exactly specific to the bug, so I didn't ask why. I'm just saying, "Okay, that's the thing it had to do."
So, it can only roll back one version, is what you're saying.
Yeah, even though the other trees do exist. So you easily could, just by knowing what the name was.
So all this is doing is putting a human-readable name on something you almost certainly don't care about. Right?
But they can't really store that much stuff, Casey. I don't know. Adam, they don't have a lot of scale there, right? Like, they can't store 400 terabytes. God, that's a lot of bytes.
If it were me, I would have just made this a timestamp, if that's what you wanted, right? I would have said, "When did this person point to this thing?" When you got the lock, you change this name to the timestamp and update this in one atomic. So then, if I want to roll back to 1:00 p.m., I just look for whichever had the earliest timestamp not after that time, and that's what we were running at that time. That's what I would have done, right? But I don't know. I have no idea why they did this. They did what they did. Maybe it makes perfect sense. Again, I have no knowledge of their system. All these things may make perfect sense; I'm just saying I don't understand them. They might not be bad ideas, right? They might be good ideas if you understood the rest of the system.
So, anyway, what they say, and this is all we get, is this operation, meaning setting the rollback to point to the old plan, which in this case would have actually been newer in some cases, right? So it's not really the old plan, it's the previously pointed-to plan, which may be older or may be newer.
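The timestamp idea can be sketched as a sorted history plus a binary search. This reads the lookup as "the most recent plan enacted at or before the target time", which is what was running at that moment. The function name, the plan names, and the use of minutes since midnight as timestamps are all assumptions for the example.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def plan_active_at(history: List[Tuple[int, str]], when: int) -> Optional[str]:
    """Return the plan that was live at `when`, or None if nothing was yet.

    `history` holds (timestamp, plan_name) pairs sorted by timestamp.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, when)  # count of entries enacted at or before `when`
    return history[i - 1][1] if i else None


# Hypothetical history; timestamps are minutes since midnight.
history = [(12 * 60, "plan-a"), (12 * 60 + 30, "plan-b"), (13 * 60 + 5, "plan-c")]

print(plan_active_at(history, 13 * 60))  # plan-b
```

With a history like this, "go back to 1:00 p.m." becomes a lookup instead of depending on whatever happens to be in a single rollback slot.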
Doing that activity, if that plan no longer existed, meaning it had been deleted like this, then the enactor stops permanently.
So, once you get into that state, right? We do the whole sequence of steps that we said here. This plan gets deleted. So now this is pointing at an invalid, unresolvable name. We cannot resolve plan 110, which is actually some hex code, but whatever that was, we can't resolve it anymore.
Once that state is true, then the next time an enactor comes and tries to make it point to a new plan, whatever that new plan is, when it actually gets this far and tries to set the rollback, that will crash it permanently.
Therefore, all three enactors will now stop, because eventually all three will try to enact a new plan. They will try to set the rollback first to point to whatever the old plan was, find that there's no plan there, and that apparently is just a hard crash.
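A toy model of that failure mode, going only on the behavior described in the presentation: once the endpoint points at a plan that no longer exists, every enactor that tries to enact stops for good. The `Enactor` class, the `state` layout, and the plan names are invented for illustration and say nothing about how the real code is written.

```python
class Enactor:
    def __init__(self, name):
        self.name = name
        self.running = True

    def enact(self, state, new_plan):
        if not self.running:
            return
        old_plan = state["endpoint"]
        if old_plan not in state["plans"]:
            # Modeled as fatal: per the presentation, the enactor stops
            # permanently when the old plan cannot be found.
            self.running = False
            return
        state["rollback"] = old_plan
        state["endpoint"] = new_plan


# The endpoint already points at a deleted plan. How it got that way (race
# condition, operator typo, accidental delete) does not matter to the model.
state = {
    "plans": {"plan-146", "plan-147", "plan-148"},
    "endpoint": "plan-110",
    "rollback": "plan-145",
}
enactors = [Enactor("a"), Enactor("b"), Enactor("c")]
for enactor, plan in zip(enactors, ["plan-146", "plan-147", "plan-148"]):
    enactor.enact(state, plan)

print([e.running for e in enactors])  # [False, False, False]
print(state["endpoint"])              # plan-110
```

Three enactors give no redundancy here, because all three run the same step against the same bad state.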
>> [laughter] >> Now, that's crazy. I thought the three enactors was supposed to make it so that it had redundancy.
>> [laughter] >> Now, again, this is why I get grumpy with people online who are like replying, they're like, "It was a race condition."
It wasn't a race condition. The race condition is not necessary for this. The race condition is just why you ended up with this name being unresolvable.
But, if you didn't have whatever code did this badly, it would have just worked. You never would have known. You would have had a momentary like, you know, minute outage of DynamoDB or something. But, I'm guessing there are minute outages of DynamoDB from time to time, right? Like, that's not global news. What's global news is taking it down permanently, which is what happened here. And until an actual human goes and figures this out, resets it, gets these enactors going again, it's just gone, right? It's just out permanently. So, hours potentially, right?
And it was long enough, I guess, in this case to then have cascading failures. You would never have had that if it's just a momentary outage, like if some people momentarily got an unresolvable name or no records, right?
Then they would just try again.
With DNS, that's like your phone when you went through a tunnel, right? That's all that would have been.
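The "they would just try again" behavior looks roughly like this on the client side. This is a generic retry-with-backoff sketch, not any particular SDK's logic; `resolve` stands in for whatever lookup function the caller uses, and the host name is made up.

```python
import time


def resolve_with_retry(resolve, name, attempts=5, delay=0.1):
    """Call resolve(name) until it returns an address or attempts run out.

    `resolve` is any callable that returns an address string, or None when
    the name has no records at the moment.
    """
    for attempt in range(attempts):
        address = resolve(name)
        if address is not None:
            return address
        if attempt < attempts - 1:
            time.sleep(delay * (2 ** attempt))  # exponential backoff
    return None


# A stand-in resolver that fails twice and then recovers.
answers = iter([None, None, "10.0.0.1"])
print(resolve_with_retry(lambda name: next(answers), "db.example.test", delay=0))
# 10.0.0.1
```

Retries like this absorb a gap of seconds. They cannot absorb an outage that lasts until a human restarts the enactors.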
So, I want to know: what did the code look like here? How did you write something that crashes if this wasn't a valid name, which it wouldn't even be on startup? Meaning, if you were starting this system and an operator hadn't preconfigured it, it wouldn't be pointing to anything, right? That's the default case that you would think you'd start with.
So if you're going to do this, you would think you would just handle that case, because the rollback address could just not point to anything, right? Just take whatever this is; if it's nothing, set the rollback address to nothing. Done, right?
So, there's something really weird about the way they wrote this code. And that is what should have been in the RCA.
That's the whole bug to me. This is just set dressing for how we ended up having this this this thing point to nothing.
But, the same bug would have occurred if someone had accidentally deleted this record. Like, some operator was just like, "Oops, crap. I set it to nothing."
This the same bug would have happened according to the presentation, right?
So, the root cause is not the race condition. The race condition is an aside. Does that make sense?
Quick question.
Yes.
>> So, I'm legitimately thinking through this. That means the thing that sets the rollback probably assumes some sort of struct with a bunch of memory or something has been passed in, does some sort of access, and it explodes, or...
Yeah, maybe.
>> Do you think this is the same style of bug, which is the one line that took down Cloudflare, which is they just assume it's there and unwrap it?
It's in Rust.
>> it in Rust.
Rust. Unwraps it, explodes it.
I really don't know. My My guess, like in my head, I was like, what is the thing that I see people do a lot of times where I'm always like, why would you ever do this? But it's just because that's the way they learned to program.
And I was thinking like, if you were writing in one of these languages that likes to throw exceptions for error conditions, this would be a great example of that.
So, say you had a thing where you were like, oh, I went to go get the DNS record that this thing points to. Normally, in a sane programming environment, no one is throwing an exception there. If they get back nothing, they just return nothing, right? And then when the person goes to set the rollback record, they just set it to nothing, which is the correct behavior.
Like, literally the value nothing flows correctly through this flow. So, since it is a core foundation service, assuming you were trying to write something that was fault tolerant, you would never do something like throw an exception. So, in my brain, I'm thinking, I bet what happens in here is when you ask for this record, they just use some library call or something that throws an exception when the record doesn't exist. And it just threw an exception and the enactor was down.
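To make the guess concrete, here is a sketch of the two lookup styles being contrasted. It illustrates the hypothesis voiced in the discussion and nothing more; the function names and the `plans` dictionary are invented, and the RCA does not say what the real code does.

```python
def lookup_raising(plans, name):
    """Lookup style that throws when the record is missing."""
    return plans[name]  # raises KeyError if the plan is gone


def lookup_optional(plans, name):
    """Lookup style that returns None when the record is missing."""
    return plans.get(name)


def rollback_value(plans, endpoint_target, lookup):
    """Work out what the rollback record should hold."""
    return lookup(plans, endpoint_target)


plans = {"plan-145": ["10.0.0.1"]}  # plan-110 has already been deleted

# Optional style: nothing in, nothing out, and the caller keeps going.
print(rollback_value(plans, "plan-110", lookup_optional))  # None

# Raising style: the same situation escapes as an exception. If nothing
# catches it, the process running this code dies.
try:
    rollback_value(plans, "plan-110", lookup_raising)
except KeyError as err:
    print("unhandled in a real enactor:", err)
```

The point of the contrast is that a missing record is an ordinary state at startup, so treating it as a value rather than an error keeps the ordinary path and the failure path the same.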
That's my guess, right? And I could be very wrong about that because it's just a wild guess, right? But this is why I want to see the RCA. What was it? It could be exactly the stuff that Trash was talking about. I mean, it could be stuff that Prime was talking about.
Could be the stuff that I just said.
Could be anything. And I want to know because that's where the actual education would be here. Avoiding this race condition is completely unimportant. This race condition could have lived there. And while it was important eventually to fix it to avoid those once a year weird outages for 5 seconds or something, it is not actually the thing that we most want to learn. What we most want to learn is don't write this thing. And we don't know what this thing even was, so how do we not write it? This is why I think it was a bad RCA. Does that make sense?
Yes.
>> Yes.
Mhm. All right. What is most of AWS written in, Adam?
It was Java when I >> [laughter] >> when I used to...
I was about to say, someone from the chat said Scala. They said they worked at AWS for 7 years, and they said most of it's written in Scala.
Well, that's technically Java with extra steps. So, >> [laughter] >> and that will anger all of them endlessly.
So, that's really it for me. This was the thing where I was like, I don't feel like I saw the explanation, and I actually feel like it's important to hear because there was a bad programming practice at the bottom of this somewhere, and I want to know what it was.
Especially because it helps people like me when I, you know, I don't really do a lot of architecture education right now, but at some point I probably would like to do some of that because I think there's a lot of bad architecture out there. And so, I kind of try to pay attention to these things like, what are the kinds of architectural mistakes that people are making? And I bet this was one of them, right? Uh and so, I'd like to know I'd like to know.
Yeah, I think like what I would expect is like at least like one simple reproducible example of like why it blew up, like a whole like little code snippet. So, like And that this is something you brought up earlier is like kind of like how we approach these type of things. Like, if I'm like reviewing someone's code and I see something that looks weird, I will always do my best to make my own little sandbox and like prove my theory out, and then like actually show them the code like, this is why this is probably wrong.
Here's like a small simple reproducible step. So, I would expect something like that. And that also helps me like truly understand cuz a lot of people, like you said, they'll they'll see something like that looks funny, but I don't know why it looks funny. But I I I can't stop there. I got to like actually like build it out and then like understand. So, that's what I would expect.
And you know, like I said, the CrowdStrike and the Google outages I thought were better at just telling you that. They were like, look, it was a null pointer deref in here, or it was an out-of-bounds array because we thought there's only going to be 20 and we put 21 in the config file, right? I'm like, okay, I know exactly what kind of code is causing that kind of problem, right? And furthermore, to an earlier comment, as far as I know, literally everyone who programs in Rust only does it so that occasionally, when they see something like this, they can say, well, if they'd written it in Rust, it wouldn't have happened. They were not given enough information to even make that comment. They probably made it anyway, to be fair, but they were not given it. So one rule that should be followed in RCAs is you have to give Rustaceans enough information to, if they so chose, correctly say that it would have been prevented in Rust.
True.
And here we do not have that. We do not know whether this would have been prevented in Rust. We have no idea.
Uh it probably wouldn't have, but we don't know.
Well, Casey, we do have a pretty good chance because it's like probably would have never shipped. So, would have prevented it.
>> [laughter] >> True.
We would have zero enactors because we would have designed instead of enactors.
Yeah. Uh I I will say something I think Cloudflare does really good job at this as well. They like go in and show like a lot of lines of code and say like, this is exactly what's going on. This is, you know, even though the problem's up here, this is the line that exploded due to all these previous conditions. That was me making fun of Rust with the unwrap, which it actually wasn't truly the problem. Uh but you know, it's just like all these things kind of happen. So, they they do a really good job. I'm surprised at how poor of a job AWS has done for this one.
Well, and the other thing, too, is it was one of those things where it makes me unnecessarily suspicious of you, right? When I read this, I'm like, are you hiding something? Did you not really figure out what the bug was?
Like, you talked all about this race condition, but even from your own presentation, I can tell the race condition really wasn't important. That was just what led to the record having been set to nothing, but who cares, right? That's something that's nice to put in the RCA as an explanation of why this bug occurred now as opposed to some other time, but it's not the bug. So when I see an RCA that doesn't talk about the bug, now I'm suspicious, right? And unnecessarily so, because if you actually did find it, then just tell me, and now I know you found it, right? So I think it's also a confidence boost for the people who are looking from the outside and want to know whether they can trust this DynamoDB thing. If it looks like you actually found the bug, I have a little more confidence in you. If it looks like you have no idea what the bug was, or don't seem to understand what the bug was, then I'm more concerned. And so I think that's also another reason to do this in your RCA.
It provides confidence to your customers. I'll give it to... Maybe that's why they fired Adam as an AWS hero, too.
Maybe it's all connected. Could be. They didn't want him exposing these dirty secrets.
>> [laughter] >> Yeah. He knew too much. Hey, could you give a quick, like, 3-minute summary of the guitar shop, what that was revealing? Because I'm trying to remember what it was, because it involved a single point of failure guy who was out here for this failure as well.
So, I don't know how to reconcile the two things. Uh and of course, we have no idea We have no idea if either are telling us the truth now, right? Because this was such a bad RCA, I have no idea if it's correct or not.
What was the password? Uh yes, the password was Wishbone12, I think. There you go. I was trying to >> [laughter] >> killing me. That's my recollection, anyway.
So, yeah, that story was that there was a thing that was designed to copy configurations, and that thing had kind of gone rogue and could not be stopped. It was just copying configurations totally incorrectly, and it needed to be fixed or repaired or something, and we don't have any more information because it was an overheard conversation, right? And so, does that comport with this? Well, a little bit, because those enactors do sound like the kind of thing that would be running a configuration copy, but on the other hand, it's not like a configuration for machines. A DNS entry is a DNS entry. It's not really a configuration. So, I would say the two stories don't line up that well. And so, that's another reason why I was kind of hoping that this RCA was a little bit more believable, because I wanted to know for sure if the story was false, and I still don't really know, based on how bad this RCA is.
What if the tool that the guy wrote to copy the configs is just literally the enactor? Like, they just productionized it and they haven't changed it in 7 years. That was kind of my, I don't know, connecting the dots there. He's like, guys, I wrote that as a way for me to test stuff in my local environment, and you guys decided to make three enactors and put them next to each other in prod. How did this happen?
I do I have a ton of questions. Yeah.
Alternatively, is it the rollback because that's the one that did the copying of like, hey, here's the previous one, right? And so, I'm going to copy the previous one, then it gets like this null issue going on, and it just like the script never encountered a null, it just goes rogue and starts writing over and over and over and over again to where you can't you can't do anything.
I don't know.
All I know is that, as far as I can tell from their explanation, going only on what they were providing, I still just don't think the race condition's even relevant, because, again, literally an accidental update to the Route 53 endpoint would have taken down all three enactors immediately, since according to them, all that's required to stop them is if the endpoint points at an unresolvable name. That's all you need. And so if that's really true, literally an operator typo could have taken all this down. No race condition necessary, right? And so again, the RCA just does not do a good job convincing me that you've talked about what the real bug was, because I can think of so many ways that you could have triggered this exact same thing that don't involve this race condition that you spent the entire RCA telling me was the bug, but I don't think it is.
So.
>> [clears throat] >> Well, we'd like to extend a formal invitation to Jeff Bezos. Do you want to come in here [laughter] and explain yourself? Uh it's I believe it's Andy Jassy.
>> now.
>> Yeah, Andy Jassy is the man you're looking for. Or we can take the... I'm going straight to the top. Jassy is previous as well. He's not in AWS anymore. He's head of Amazon. Yeah, that's what we want. I mean, do we have the head of Amazon?
>> We want the president. Bezos was previous head. Now he's just chairman.
So he's no longer in day-to-day. You know, he's gallivanting. So we want someone more Just give me a minute. Just I want a real chairman analysis, though.
Let's go to both. I want the That's an RCA The Sunday chairman >> [laughter] >> analysis. What's that?
The real chairman analysis. Armchair the armchairman analysis. Again, the armchair armchairman. [laughter] That's not what I'm looking for. The armchair Dude, why isn't armchairman a phrase?
The armchairman. Like I'm armchairman of the board. I'm pretty [laughter] sure that's the guy that does the chips, right? The armchairman. The armchairman.
I don't get the joke. I'm sure ARM ARM ARM >> [laughter] >> ARM ARM RISC machine. Yeah, ARM.
Not potato chips. I know Trash is on the pod, but >> [laughter] >> Oh my god, dude. Wait a minute. Wait a minute. Wait a minute. Wait a minute.
This is our sponsor opportunity.
Potato chip companies make a line of potato chips where the bags are labeled by actual chip.
So it's like this is a 9950X3D chip.
Like the bag just has that on it.
And it's specifically potato chips for developers. Done.
>> Casey, I have been trying to convince SunChips to sponsor us forever because think about it.
Sun Sun microchips. Yes, [snorts] that's what I've been saying for Oh, I was not I was not going down that PATH IN MY >> [laughter] >> WELL, TRASH IS LIKE GARDEN SALSA.
>> [laughter] >> OF COURSE, PROGRAMMER'S garden salsa.
>> [laughter] >> Yes, microchips. Sun microchips. Yeah, yes, yes, yes, yes, yes, yes, yes, [laughter] yes, yes, yes, yes.
That's the one that we need. If we can get a SunChips sponsorship, that's the perfect one. It's a match made in heaven. It really is. Okay. I've been but Casey, I've been tweeting at them for like 5 years.
>> [laughter] >> They don't care. They're just like, who is this guy?
And why does he keep tweeting at us?
Let me see if I can find a SunChips my oldest SunChips tweet here. This is this is May 28, 2001. Okay, we're really digging back. 2021. Wow, dude, you are really working hard to bag this chip sponsor. Well, I was writing a lot of Lua at the time, so I thought that it would go nicely together. Lua like moon, sun, right? It's like moon does the sun.
We're echoing back to each other here.
So that was a classic hashtag computer chips. There you go. SunChips.
>> [laughter] >> That's a new new terminal line. There you go. No, they were not at the time.
That was not the plan. No, that was part of that was part of the meme.
>> [laughter] >> They did not reply. They've never replied to me. They've never interacted with me.
So disappointing. I know. So we're I'm still waiting. Still waiting. So disappointing. I can't believe You like SunChips, though, right? Uh harvest cheddar specifically. But you're not a garden salsa fan? What do you mean a harvest che- Are you kidding me? Oh my gosh. Hold on. Hold on. Hold on. Hold on.
>> [laughter] >> I am I am a I am a fan When we were when we were at the the tower all those garden salsas, threw them to the side. Disgusting. Oh. Oh, my Nope. Yeah, Trash, you can't be a part of this sponsorship. I thought we had the same taste in snacks. Well, I ate all the harvest cheddars that were in that tower.
>> That's actually great cuz then we make a good team. See, that's what I'm saying.
Uh I I actually enjoy French onion also.
I don't think What? Is there a French onion? I don't think I've tried it.
>> Green packs green packs French onion.
Blue packs original, which I'm not a fan of.
>> Uh it is a it is a a Original is not that good.
Adam, are SunChips vegan? I have never had a SunChip in my life, I don't think.
I think he has.
>> [laughter] >> Please leave his house.
They might not have them in the Ozarks.
They may not be a vegan. They don't have them in the Ozarks.
Oh my goodness. Okay, well, I think this is probably actually a reasonable time to end, at this point.
>> Yeah. I can't believe Casey wrote backwards that whole time like that.
>> That's what I was going to ask. Yeah. It takes a while to But you get used to it. Adam, it goes like this. His right hand, he writes forwards. He trained his left hand to write backwards. That is not how you are not doing it that way. Yes, so it's like right hand, he did it the he does it the right way. He cannot write his right hand backwards, but you can watch. He does his left hand, he does it backwards.
>> Are you guys trolling?
No. How you just saw that happen?
Casey's nodding yes. I spent most of the time You just use the left hand. That's how you do it. You just Because if you if you are So if you're ambidextrous, this would be hard. But if you're used to writing with your right hand, then when you train your left hand to write backwards, that just seems normal, right? Cuz you don't have there's not another thing. Yeah. But you are trolling. You don't actually write backwards. Yeah, [laughter] we're Yeah, we're totally trolling.
You What? DO YOU THINK HE WOULD ACTUALLY write back- ACTUALLY, I CAN KICK WITH both feet. SO MANY PEOPLE ASK ME THAT.
It's weird.
>> watching so closely your hand and it I couldn't figure out if it cuz he flips.
[laughter] It was like, no, I think that's the right way. Like I'm trying to like stand where you would be standing and like >> [laughter] >> I could not get that question out of my head. Adam tries writing on his fake And >> [laughter] >> on his fake whiteboard for writing backwards for 6 months and come out of a >> [laughter] >> coma and figured out how to do it.
I don't know how you do it, Casey, so it just seems magical to me.
It's just a mirror camera, right? Thank you, Trash. Yes. Well, no, no, but how do I SEE THROUGH IT?
YEAH.
BUT WAIT, WAIT, WAIT. GLASS? WHAT DO YOU MEAN? It's backwards through the glass, right? Because if I'm writing on one side, you see it backwards. So if you just mirror it, does it actually come out correctly? I haven't I haven't done the math to know. Right, I'M [laughter] CUTTING THE PODCAST. YEAH, WE'RE CASEY, THIS IS A RAPID FIRE.
>> [laughter] >> WAIT. NO. WRONG. FALSE. What did we say at the start of this thing?
If you're actually want to know, you ask questions. You don't just freaking pretend and Prime is doing the exact thing he should be doing, which is saying, I'm not sure. Does that actually work? Yeah, cuz my only question is how many times you reverse it, right?
Because you're going to If I draw like this [laughter] Right, IF I DRAW LIKE THIS, BUT THEN IT GOES THROUGH if you're looking at it from the other perspective. For you, yours it's going to be flipped, right?
Uh there we go. Flip this thing. It's going to look like it's going to Oh, that's vertical. Like the glass was clearly between the camera and him.
Prime, yeah, the glass is clear, so it's That's backwards looking through it. And so then you reflip it again. Cuz that would work out. Okay, maybe. Yeah, sure.
All right, I buy it. Yeah. But it Think of it this way, Prime. If you were if you were like you you got that right now. You wrote the letter D, right?
Yeah. Of course, unfortunately, that could be letter B. So you probably want to write a letter that's only goes one [laughter] way around.
UPPER CASE. UPPER CASE.
YEAH, UH THERE WE GO.
SO YOU WRITE THE LETTER Z, RIGHT? And you're writing it this way. You're looking at it. So think of yourself as standing and you're looking at the board. You write the letter Z, right?
Now if you were to walk around behind your screen and look back at it, you'd see Z flipped. Yeah.
>> Yeah. That's what you see. Okay. So all you have to do is just flip it one more time and it's correct. All right. Can I show you guys one thing that gets people always really riled up? Sure. You want to see how I draw my S's? Yeah.
What? [laughter] Bottom-up S? Are you kidding me right now? But then look how beautiful they are. I mean That's not a thing.
>> No. It looks like a serpent. No. It's a serpent. You can tell when it's done that it's wrong. What? Your Z? Do you actually write your Z like that, too? I write my Z that way. Yeah, I sorry, Matt. Uh sorry, Trash. We did math. So we cross our sevens. Yeah. I don't Okay, I do I do my I do my sevens like that. I don't do my It's called zed, by the way, but sure, whatever.
Oh. All right, got it. Got it. That's fine. Uh also, can I can I can I show you guys something? Mhm. I have a very magical ability. Are you ready for this one? Yes.
I drew an ampersand first try without fumbling it. It's considered pretty impossible by many people's standards when they first draw an ampersand. It's very difficult. Also, can you zoom in on the line? It looks pretty good from where I'm at. Almost a perfect line right there, too.
Now you can see that I I goofed it right here and I goofed it right here. You can see the two spots that I goofed.
Enhance. Computer enhance.
That's why you got to go to computer enhance. The way you write your S, that has to slow you down so much. Uh I just decided to write really nice instead.
I don't care I write I don't care that I write well. Why does it slow you Can you explain that to me?
I don't know.
It just to me it was just slowing me [laughter] down. I don't know.
Uh Trash, is that how you hold your pencil?
Like this? Can you do that? Uh no, no, no. How do you write? Draw a line. Okay, well, that's There's your problem. You hold it like a weirdo.
Dude, we were just talking about drum line. What? Do you hold it like this?
Yeah, like a proper Yeah.
Neither of those No. That's not how I hold it, either. What are pencils in 2024? too many fingers involved.
Right right on top. Boom. That's how you hold it. Can you guys do this? Can you guys do this?
>> [clears throat] >> I CAN DO THAT. [laughter] OKAY. BOOM. YOU SEE THAT? PLEASE MAKE that be the opening. Can you guys do this and then close it and it goes down and then you cut to the start of the podcast.
Trash, you know what that reminds me of?
I was I was at my I was at my friend's house. We were all sledding together with us and our kids.
His His son goes down the hill and he goes, "And this is how a 13-year-old sleds."
He jumps down the hill, immediately gets the sled caught down, falls face first in the snow, and just biffs into the snow.
>> [laughter] >> He starts crying. So now every time I see him at church, I'm like, "And this is how A 31-YEAR-OLD RUNS."
>> [laughter] >> THAT IS SO MEAN.
He thinks it's hilarious. He cracks up every time.
>> [laughter] >> All right, Trash. I'm going to do one more little trick for you, okay? Are you ready?
>> Yeah, okay. All right, you got to hold the pencil like this, all right? And then you without taking your hands off the pencil, okay?
>> We did this one.
>> Yeah, you got to end like that.
I KNEW THE ANSWER AT ONE POINT.
>> [laughter] >> OKAY, YOU JUST YOU just just don't take your hands off of it. I can't.
Dude, that that >> Do it again. Do it again. Do it again.
Do it again. Dude, that's the one that I do Okay, it's very simple. Okay, so you put it right here, right? And then you got to turn it and then you just put your hands the other way.
I still It's like three pixels on my side. Yeah, it's pretty bad. The video quality It just doesn't even look like it is a pencil.
>> Whatever. Can you do this? Boom.
>> [laughter] >> But you can't do that. Can you Can you spell blood with your hand? OKAY. OH, TRASH. BRO, YOU'RE GOING TO GET THIS PODCAST >> [laughter] >> ALL RIGHT, TRASH. All right, Trash.
>> Can you Fortnite dance? Probably not.
Yeah, okay.
No, please don't Trash. Trash, you don't [laughter] want to do this. You don't want to do this, Trash. I don't want to do it. I DON'T WANT TO DO IT.
Trash is just baiting me. No, no, no, no, no. I'm good. I'm good. If you hit an orange justice on this podcast, you will literally never stop orange justicing from here until the universe ends.
No, we're good. End it. Full plug. Thank you everybody for watching. I'm terribly sorry about this episode.
>> Not 50 tech podcast.
>> Yeah, I should have stopped him.
We got out of this one. We didn't even have to show a nipple, so that was a good episode. [laughter] Can you believe I told him to go to Spotify for this? All right, yeah.
Ending. Booted up day.
Five [music] code errors [singing] on my screen.
Terminal coffee.
>> [laughter] >> And head.
Living [music] the dream.