
Bay Area Rust June Meetup @Convex

Added: 2026-06-19

214 views | 71:48:21 | convex-dev | Original Release: 2026-06-18

Conflict-free Replicated Data Types (CRDTs) solve the fundamental problem of concurrent edits in distributed systems by assigning unique IDs to each character and using actor IDs to resolve conflicts, ensuring all users converge to the same state regardless of operation order. This approach enables real-time collaboration without central servers, addressing the common issue of lost work in software development tools.

[00:00:50]All right, everybody, let's settle down.

[00:00:53]Okay. Um, if uh folks want to come closer so they can see the screen, that'll be good. And I think we have the majority of people who signed up already here. Um, all right. So then let's start.

[00:01:10]Hello and welcome to the Bay Area.

[00:01:14]I will be your host tonight.

[00:01:17]Um, we're very happy to have you here today. First of all, let's uh have a round of applause for our generous sponsors, Zed Industries and Convex.

[00:01:29]>> [applause] >> So, if you haven't done it already, um you're welcome to join our Discord channel where you can hang out with other uh Rust enthusiasts.

[00:01:41]So, not a lot of people are pulling out their phones. So, I guess the majority of you are actually repeat customers.

[00:01:48]That's good to know.

[00:01:51]>> All right. Um tonight we will have uh four amazing speakers telling you about how to use Rust and after that we'll have time to hang out in this space and mingle. Okay. And uh to start, let's hear from Wayne representing our sponsor Convex, who are generously hosting us tonight.

[00:02:14][applause] >> Hi everyone.

[00:02:20]I've seen some familiar faces. Who's first time here? First time. Oh, there's a few of y'all. I was I was gonna say swag on us, but I think you already got it, right? All right, good. Well, welcome to Convex. If you don't know Convex building blocks, a database platform post that real time, but if who's not building on agents these days, they need a backend-as-a-service platform. And most important, not most importantly, but also we're hiring. So, we have six positions. This is the email address: software, API, platform, infrastructure, product manager. So we love Rust, you all love Rust. Don't leave Zed, but [laughter] but if you're looking for a job we are in that conference. Hope you enjoy the evening. Pass back over to our host.

[00:03:24]Okay. Um, our first speaker is um, sorry, our first speaker is Conrad, a software engineer at Zed. Let's welcome Conrad.

[00:03:52]Okay, we don't have any slides to uh think back in the recent past to a time where you lost time or you lost work due to like a mistake. Uh maybe you were debugging a test and you you keep adding logging and it's not showing up and then you realize, oh, I didn't save the file. Oops.

[00:04:17]Or maybe you you get to the end of the day, you you add everything, you push and you're sitting at home in the evening watching TV and you get a Slack message and it says uh your branch doesn't compile. Did you forget to add a file? How does it make you feel? It's kind of irritating, particularly if you do it in front of your colleagues. It's kind of embarrassing, particularly when you realize you've been using these tools for decades. There's no real excuse. Um, this happened to me. It happened to me really badly in my first job. Um, I'd been given a task. I was kind of new on the team and this was a JavaScript shop.

[00:04:55]Wanted to build a date picker. And as you all know, you've seen date pickers.

[00:04:59]They have these big rectangles and the month things and then it shows you clicky things. Everyone hates them, right? They're slow. They're annoying.

[00:05:10]They're hard to use. So, luckily, we were designing for people who like to use the keyboard. And so, we had this idea. Let's not show you any of this stuff. Let's give you a box and you can.

[00:05:20]And so, my job as the new guy on the team was to build the rules for how what you type in the box gives you a date. So, some of it's easy. 3rd of June, 3rd of June. But it gets interesting. If someone types 3D, it's probably three days like Saturday. And if someone types 3M, is that three minutes or three months or the third of May? There's no real way of knowing. And so you have to have a bunch of heuristics. And so I spent like two weeks on this, like getting all the regexes right so I could parse all the things, getting all the heuristics right, so it was just like perfect. Um, and we get to the end of Friday and we're doing the company demos and I show it off and everyone is so excited to see this because it just works. Whatever you type in, it does what you want. Um, and my boss at the time said, "This looks great. Can you push it up? I'd love to take a look at the code." Very reasonable request. So, demos are over. I go back to my laptop.

[00:06:06]I sit down and I don't know exactly what I typed, but it was something like git fetch, git checkout main, git commit -am, popping open, I I got that message like: no changes to commit, working tree clean.

[00:06:22][laughter] And I begin to panic and I start googling and like what do I do?

[00:06:26]And um Google's very helpful. It's like, "Oh, you should look at the reflog." I I go look at the reflog and nothing for the last two weeks. I'm new. I haven't committed anything for the last two weeks. And I don't know what command I ran. Presumably checkout -f or reset or something. But somehow it was gone. The whole thing, all two weeks.

[00:06:46][sighs] That made me sad.

[00:06:48]And so I had to go to the company Slack embarrassed and type, you know, hey, uh mess something up with Git. I can't push it tonight, but I'll I'll get it to you before Monday. Um, and in that moment, I realized that my weekend plans had changed. I was probably not going hiking tomorrow. And I spent the whole weekend frantically trying to type the whole thing out from memory and make it work.

[00:07:08]And that I don't know. I'm never convinced that it was quite as good as that first version. But we got something. We got it merged and it ended up fine. But but boy, what a waste of time.

[00:07:19]And sure, that was 15 years ago, so things have got better, right? We don't use git, right?

[00:07:27]The really interesting thing is we've fixed this for everyone else, right?

[00:07:30]We're software engineers. We see people with a problem, like, we'll fix that. And we see people writing documents and having to save and losing work because they didn't save. And we're like, great, Notion, Google Docs, we'll fix this for you. As you type, everything will be synced to the server. It'll be synced to all of your collaborators' machines.

[00:07:44]You'll be able to edit things together.

[00:07:46]You won't lose work because there's no work to lose. It's all backed up. you won't waste time. The same is true for design. We had Photoshop, you used to have to save all the time or it would crash. And now we have Figma. It just backs it up. It makes it work as you do it. And if you want to work with people, all the changes show up in real time.

[00:08:04]You can't lose work. You can't waste time.

[00:08:08]But somehow in the last 15 years of building all these amazing tools for everyone else, we have never stopped and said, "This is stupid." Right? It's really painful to lose [clears throat] work. It's really painful to mess up and it's very surprising to me that we still use these tools until now.

[00:08:28]I work at Zed. Uh we are well known for building an IDE or a text editor. If you haven't used it, you should. Uh we shipped 1.0 about a month ago, I think.

[00:08:37]Um it's kind of done. I mean, there's bugs, but it's kind of done. It's mostly there. And if you look at Zed, the company, the goal was never build a text editor. The mission actually of the company is we exist to fundamentally improve the way that software is collaborated on and built. To fundamentally improve how software is collaborated upon and built.

[00:09:01]So zed the editor step one. Step two let's fix this mess. Let's make sure that no one coming into the software engineering profession ever again loses work because they mess up a git command.

[00:09:14]So, this new thing, it's called Delta DB, and we're building it to solve this problem. Um, I wanted to zoom in and talk a little bit about kind of how it works under the hood. Um, and then we can kind of zoom out a little bit at the end.

[00:09:30][clears throat] So, why Delta DB? If you look at something like Git, it's based on snapshots. And snapshots are great because you can say at this commit, these were the versions of all the files, and it's like signed and it's good and it's great. And you can tag it and deploy it and ship it around. But if you're trying to collaborate with someone, snapshots are terrible because if you have to send the entire contents of the file every time it changes, it it's just too much. Instead, what you want to do is you want to watch as people type. You want to find the little small changes, or deltas as they're called, and just send those. Um, and this is obviously a distributed systems problem. It's very well solved by the field of CRDTs. Um, and so I wanted to dive in and show you one of those.

[00:10:10]Let me go to my other slide.

[00:10:12][clears throat] So, let's imagine for the sake of argument that we want to work on a file together. We're going to collaborate in real time and I'm going to have to tell you how my changes affect the document.

[00:10:23]This is the very naive approach. This is broken. Hang on. This is supposed to be syntax highlighted.

[00:10:32]Much better. Is that legible at all?

[00:10:34]Great. Um, this is kind of the simplest thing you can do. You say, I'm going to insert this string. It's going to go here. The reason the position is a range is so that you can represent a delete, which is like a non-empty range and then an empty insert.
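
The slide itself is not captured in the transcript, so the following is only a minimal Rust sketch of the naive delta as described: a string plus a range of offsets. The names `EditDelta`, `insert`, `delete`, and `apply` are assumptions made for illustration, and offsets are treated as byte offsets into ASCII text.

```rust
use std::ops::Range;

/// Naive edit: replace whatever is in `range` with `text`.
/// Offsets are byte offsets into the document as it exists right now.
#[derive(Debug, Clone)]
struct EditDelta {
    range: Range<usize>,
    text: String,
}

impl EditDelta {
    /// A pure insert is an empty range plus some text.
    fn insert(at: usize, text: &str) -> Self {
        EditDelta { range: at..at, text: text.to_string() }
    }

    /// A delete is a non-empty range plus an empty insert.
    fn delete(range: Range<usize>) -> Self {
        EditDelta { range, text: String::new() }
    }

    fn apply(&self, doc: &mut String) {
        doc.replace_range(self.range.clone(), &self.text);
    }
}

/// Applies three edits in sequence on a single machine.
fn demo_naive_edit() -> String {
    let mut doc = String::from("tivity");
    EditDelta::insert(0, "c").apply(&mut doc); // "ctivity"
    EditDelta::delete(0..1).apply(&mut doc); // back to "tivity"
    EditDelta::insert(0, "rea").apply(&mut doc); // "reativity"
    doc
}
```

On one machine, where edits arrive in a single order, this representation is enough.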

[00:10:48]But there's a there's kind of a fundamental problem with this. If you're building a text editor locally, it's fine. And it's fine because when you get keystrokes from the keyboard, they come in in order, right? It's just coming over one wire. You know which order the keystrokes go in. And so there's no concurrency.

[00:11:04]If you're building something like a Google Docs, it's kind of the same. You have this central server and you're getting keystrokes and sure they're coming in from many different people, but you on the server can just order them. You don't have the problem of concurrency because you have this central point that can that can kind of order it for you. But we don't really want to rely on a central server because things like git don't. We want distributed collaboration.

[00:11:27]And what happens in distributed systems is there's no defined ordering. And let's see how this plays out in practice with this example.

[00:11:35]So, I'm going to start with this made-up string, "tivity". Uh, and I'm over here.

[00:11:40]I'm going to insert the C at the beginning. So, I insert it on my machine and I send you a message over the network. Meanwhile, on your machine, you inserted Rea and then you send me the instruction. Okay, insert REA at position zero. Just after you send that message, my message arrives. Great.

[00:11:56]Insert C at the beginning and you end up with creativity. Seems plausible. But the problem happens on my machine because if you remember I already sent you that C and by the time I get your REA it's going to say insert REA at position zero and I end up with reactivity.

[00:12:14]Also seems plausible. Um but this obviously isn't going to work. We both have seen the same things going on in the system but we end up in a divergent state. Uh so this is the problem that CRDTs solve, [clears throat] and CRDT stands for conflict-free replicated data type. The F is silent. Um, and they solve this typically in in the following way. The observation is that the problem in this case is that these usizes don't make sense because they change as the document changes. So what we're going to do instead, we're going to change from this edit delta, the old one, to this new one.
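
The divergence in the "tivity" example can be replayed in a few lines. This is a sketch with assumed helper names (`insert_at`, `replay_divergence`); it only demonstrates that position-based inserts applied in different orders give different strings.

```rust
/// Offset-based insert: the position refers to whatever the document looks
/// like on the receiving machine at the moment the edit is applied.
fn insert_at(doc: &mut String, offset: usize, text: &str) {
    doc.insert_str(offset, text);
}

/// Both replicas start from "tivity" and see the same two operations,
/// but in opposite orders. Returns (my document, your document).
fn replay_divergence() -> (String, String) {
    let my_op = (0, "c"); // my edit: insert "c" at position zero
    let your_op = (0, "rea"); // your edit: insert "rea" at position zero

    // My machine: my edit is applied first, then yours arrives.
    let mut mine = String::from("tivity");
    insert_at(&mut mine, my_op.0, my_op.1);
    insert_at(&mut mine, your_op.0, your_op.1);

    // Your machine: your edit is applied first, then mine arrives.
    let mut yours = String::from("tivity");
    insert_at(&mut yours, your_op.0, your_op.1);
    insert_at(&mut yours, my_op.0, my_op.1);

    (mine, yours)
}
```

Here `replay_divergence()` returns `("reactivity", "creativity")`: same operations, different final states.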

[00:12:50]And we're going to give every character an ID and then when we mention the range to replace, we're going to refer to the character IDs because these are stable over time. we can find out where they are in the document. The downside is these character IDs are pretty clunky.

[00:13:04]They tend to have an actor ID. My laptop might have one actor ID. Your laptop might have another, and a usize. So this is maybe like 24 bytes of data on top of the one-byte char we're inserting.
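
One way to arrive at a figure like 24 bytes, purely as an assumption since the talk does not give the layout: a 16-byte actor ID (for example a UUID) plus an 8-byte counter. The struct below is hypothetical and uses `u64` instead of `usize` so the size is the same on every target.

```rust
/// Hypothetical character ID layout; the real one is not shown in the talk.
#[allow(dead_code)]
struct CharId {
    actor: [u8; 16], // assumed: a UUID-sized actor ID
    seq: u64,        // per-actor sequence number that only goes up
}

/// Size in bytes of one character ID under the assumed layout.
fn char_id_overhead() -> usize {
    std::mem::size_of::<CharId>()
}
```

Under these assumptions `char_id_overhead()` is 24, compared with the single byte of an ASCII character being inserted.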

[00:13:15]And then again the range is going to be stable now because it's in the the character ID coordinate space. How does this solve the problem? This is the most complicated of my slides.

[00:13:26]So hopefully it's just about legible. So we'll do the same example again. I'm over here on my machine. I insert a C, but instead of telling you insert it at position zero, I say insert it before character C1. And my character has actor ID B is me and it has number seven because seven is one bigger than six.

[00:13:43]And we always want that sequence number to be going up. Simultaneously over here, you do the same thing. You insert the REA, you give them these character IDs.

[00:13:51]Your actor ID is A. They have 7, 8, 9. And so you tell me insert characters A7 through A9 just before C1.

[00:14:01]My change then arrives and it says insert a C just before C1.

[00:14:07]And here we have to be a bit clever. We can't just do exactly what it says. But what we can see is that in this document we have this A7 fragment here just before C1. And because we know the ID here is B7. We know that when I created that C, I did not know about the R.

[00:14:26]And because we can tell that has happened, we can order things. And we're going to arbitrarily choose the actor IDs as our sort key. That's kind of what most systems do. A comes before B. And so we get reactivity.

[00:14:40]The converse thing happens on the other side. Right? I've [clears throat] inserted my C. I get your REA. It says before C1. I can see there's a B7 just before that position. you've given me an A7. There's a conflict here. I again can order things based on the actor ID. So I put yours just before mine and I end up with reactivity as well.
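
The walkthrough above can be sketched as a toy sequence CRDT. This is not the Zed implementation: the types, the one-letter actor IDs, and the tie-break loop are assumptions that cover only the case on the slide (concurrent inserts before the same anchor character). Deletes, tombstones, and the interleaving edge cases of real list CRDTs are left out.

```rust
/// Stable character ID: the actor that created it plus that actor's counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CharId {
    actor: char, // 'A', 'B', 'C' as on the slide; real IDs are larger
    seq: u64,
}

#[derive(Debug, Clone)]
struct Item {
    id: CharId,
    ch: char,
    /// The character this one was inserted before (None for base text).
    before: Option<CharId>,
}

/// "Insert `text` just before character `before`", with IDs starting at `first_seq`.
#[derive(Debug, Clone)]
struct InsertOp {
    actor: char,
    first_seq: u64,
    text: String,
    before: CharId,
}

#[derive(Debug, Clone)]
struct Doc {
    items: Vec<Item>,
}

impl Doc {
    fn from_base(actor: char, text: &str) -> Self {
        let items = text
            .chars()
            .enumerate()
            .map(|(i, ch)| Item { id: CharId { actor, seq: i as u64 + 1 }, ch, before: None })
            .collect();
        Doc { items }
    }

    fn apply(&mut self, op: &InsertOp) {
        // Find the anchor by ID, not by offset.
        let mut pos = self
            .items
            .iter()
            .position(|it| it.id == op.before)
            .expect("anchor character must exist");
        // Step left past concurrent inserts at the same anchor whose actor
        // sorts after ours, so lower actor IDs always end up first.
        while pos > 0 {
            let prev = &self.items[pos - 1];
            if prev.before == Some(op.before) && prev.id.actor > op.actor {
                pos -= 1;
            } else {
                break;
            }
        }
        for (i, ch) in op.text.chars().enumerate() {
            let id = CharId { actor: op.actor, seq: op.first_seq + i as u64 };
            self.items.insert(pos + i, Item { id, ch, before: Some(op.before) });
        }
    }

    fn text(&self) -> String {
        self.items.iter().map(|it| it.ch).collect()
    }
}

/// Replays the slide: both replicas apply the same two ops in opposite orders.
fn converge() -> (String, String) {
    let base = Doc::from_base('C', "tivity"); // base characters C1..C6
    let c1 = CharId { actor: 'C', seq: 1 };
    let my_op = InsertOp { actor: 'B', first_seq: 7, text: "c".into(), before: c1 };
    let your_op = InsertOp { actor: 'A', first_seq: 7, text: "rea".into(), before: c1 };

    let mut mine = base.clone();
    mine.apply(&my_op);
    mine.apply(&your_op);

    let mut yours = base.clone();
    yours.apply(&your_op);
    yours.apply(&my_op);

    (mine.text(), yours.text())
}
```

Both replicas end at "reactivity" because the insert position is derived from stable IDs and the tie between the two concurrent inserts is broken the same way on each side.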

[00:15:01]And that kind of works. Um but for our use case, it has kind of one downside.

[00:15:06]It has the other downside, which is that the metadata overhead is large. But there's been a lot of research in kind of mitigating that problem. The real downside for us is that we want to integrate with the existing systems. We want to be able to build a CRDT that's not based on just itself but integrates with Git, because even if you're using JJ or another thing, likely it's Git underneath the surface.

[00:15:27]So how can we do that? Um one approach would be just to rewrite the whole Git repository in terms of these these edit deltas and build up character IDs for every file. And we tried that and it's quite slow. So we don't want to do that.

[00:15:41]Instead we're going to rethink this problem a little bit. So the the downside of having the the usizes was that it's impossible to know what they mean over time. But we can fix that with a little trick.

[00:15:54]So in um in Delta DB, the the edit delta looks more like this. We're just inserting a string. We're inserting it at a range. But the key extra piece of information we have here is what was the version at which this was a valid range.

[00:16:08]And the neat trick about this is that we can say the version is either a Git blob OID, like the hash of a file, or it's based on changes that we've both made.

[00:16:20]So how does this work in practice? One more time through this example. Uh we start with "tivity". Um but this time we know this is version Git AA. This is the base commit that we're both sharing. And I'm going to insert my C and tell you to insert it at a position in this version.
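
A rough Rust shape for such a versioned delta, as a sketch only: the real Delta DB types are not shown in the talk, so `DeltaId`, `Version`, `VersionedDelta`, and `applies_directly` are all hypothetical names. The only grounded parts are that a delta carries a string, a range, and a version, and that a version is either a Git blob OID or derived from deltas.

```rust
use std::ops::Range;

/// Hypothetical delta identifier: producing actor plus a counter.
#[derive(Debug, Clone, PartialEq, Eq)]
struct DeltaId {
    actor: u32,
    seq: u64,
}

/// The version at which a delta's range was valid.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Version {
    /// Content exactly as stored in Git, named by blob OID (hex string here).
    GitBlob(String),
    /// Content reached by applying deltas, named by the latest delta IDs seen.
    Deltas(Vec<DeltaId>),
}

#[derive(Debug, Clone)]
#[allow(dead_code)]
struct VersionedDelta {
    version: Version,
    range: Range<usize>,
    text: String,
}

impl VersionedDelta {
    /// The plain offsets can be used as-is only if the receiver is at the
    /// same version; otherwise the range has to be translated first.
    fn applies_directly(&self, local: &Version) -> bool {
        &self.version == local
    }
}

/// Checks one delta against a matching and a non-matching local version.
fn demo_version_check() -> (bool, bool) {
    let base = Version::GitBlob("aa".to_string());
    let delta = VersionedDelta { version: base.clone(), range: 0..0, text: "c".to_string() };
    let after_rea = Version::Deltas(vec![DeltaId { actor: 0, seq: 3 }]);
    (delta.applies_directly(&base), delta.applies_directly(&after_rea))
}
```

The second check failing is the signal that a conflict has to be resolved before the offsets mean anything locally.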

[00:16:36]when you receive it after you've inserted your REA, the version of your document differs.

[00:16:42]The version of your document is at A3 because you've inserted three things.

[00:16:45]And so, you know, there's a conflict that you have to resolve. How are we going to do this? Well, we're going to borrow the technique from the previous slide and we are actually going to give each character its own ID, but that ID stays local. We don't need to send it as part of the deltas. So we go to our fragment index, and the fragment index is just a B-tree that records, for every position in the string, which fragment it came from.

[00:17:07]Um in the case of version A3 it's going to have two fragments. One is the rea that you inserted and one is the the whole rest of the file the tivity.

[00:17:15]And we can tell, all right, we can do the same actor ID disambiguation and know which order the C and the REA need to go based on the actor IDs. Same thing's going to happen on the other side. We're going to go to our fragment index. We're going to find out where this is in the string and make that work.

[00:17:33]Um, one other fun optimization we added, it would obviously be quite inefficient to store a B-tree for every version of every string. That's kind of like an n-squared-shaped problem. Um, but you can observe that mostly changes only change a very very small amount of the string.
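
A fragment index for a single version can be approximated with the standard library's `BTreeMap`, keyed by the start offset of each fragment. This is a sketch under that assumption; `Origin`, `Fragment`, `FragmentIndex`, and `fragment_at` are illustrative names, not Delta DB's.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
enum Origin {
    /// Text that came from the shared Git blob.
    Base,
    /// Text inserted by some actor.
    Inserted { actor: char },
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Fragment {
    origin: Origin,
    len: usize,
}

/// Fragment index for ONE version: start offset -> fragment.
struct FragmentIndex {
    by_start: BTreeMap<usize, Fragment>,
}

impl FragmentIndex {
    /// Finds the fragment covering `offset`, returning its start and the fragment.
    fn fragment_at(&self, offset: usize) -> Option<(usize, &Fragment)> {
        // Last fragment that starts at or before the offset.
        let (start, frag) = self.by_start.range(..=offset).next_back()?;
        if offset < start + frag.len {
            Some((*start, frag))
        } else {
            None
        }
    }
}

/// The index for "reativity": the inserted "rea" followed by the base "tivity".
fn version_a3_index() -> FragmentIndex {
    let mut by_start = BTreeMap::new();
    by_start.insert(0, Fragment { origin: Origin::Inserted { actor: 'A' }, len: 3 });
    by_start.insert(3, Fragment { origin: Origin::Base, len: 6 });
    FragmentIndex { by_start }
}
```

Looking up offset 3 lands on the base fragment, which is how a receiver can tell that an incoming position-zero insert against the base version belongs at the boundary between "rea" and "tivity", where the actor-ID ordering then applies.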

[00:17:50]And so by changing our B-tree per version into one kind of multi-headed, I don't know, Medusa tree kind of a thing, we can structurally share all the nodes that don't change per version, and that flattens out the index so that it only takes kind of order n space.
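
Structural sharing of this kind can be illustrated with a small persistent tree built on `Rc`. It is a binary tree rather than a B-tree, and every name in it is hypothetical; the point is only that a new version copies the path to the changed leaf and reuses everything else.

```rust
use std::rc::Rc;

/// Immutable tree over text chunks; nodes are shared between versions via `Rc`.
#[derive(Debug)]
#[allow(dead_code)]
enum Node {
    Leaf(String),
    Branch(Rc<Node>, Rc<Node>),
}

fn leaf(s: &str) -> Rc<Node> {
    Rc::new(Node::Leaf(s.to_string()))
}

fn leaf_count(node: &Node) -> usize {
    match node {
        Node::Leaf(_) => 1,
        Node::Branch(l, r) => leaf_count(l) + leaf_count(r),
    }
}

/// Returns a new root where leaf number `index` (left to right) is replaced.
/// Subtrees that are not on the path to that leaf are reused, not copied.
fn with_leaf(root: &Rc<Node>, index: usize, text: &str) -> Rc<Node> {
    match &**root {
        Node::Leaf(_) => leaf(text),
        Node::Branch(left, right) => {
            let left_leaves = leaf_count(left);
            if index < left_leaves {
                Rc::new(Node::Branch(with_leaf(left, index, text), Rc::clone(right)))
            } else {
                Rc::new(Node::Branch(
                    Rc::clone(left),
                    with_leaf(right, index - left_leaves, text),
                ))
            }
        }
    }
}

/// Builds version 1, derives version 2 by changing the last leaf, and reports
/// whether the left and right subtrees are the very same allocation.
fn sharing_demo() -> (bool, bool) {
    let left = Rc::new(Node::Branch(leaf("fn "), leaf("main")));
    let right = Rc::new(Node::Branch(leaf("() "), leaf("{}")));
    let v1 = Rc::new(Node::Branch(left, right));
    let v2 = with_leaf(&v1, 3, "{ }");
    match (&*v1, &*v2) {
        (Node::Branch(l1, r1), Node::Branch(l2, r2)) => {
            (Rc::ptr_eq(l1, l2), Rc::ptr_eq(r1, r2))
        }
        _ => unreachable!(),
    }
}
```

`sharing_demo()` gives `(true, false)`: the untouched left half is shared between both versions, and only the path to the edited leaf is new.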

[00:18:05]All right I'm not going to get any more detail than that. Um, so think back, think back a little bit.

[00:18:12]When was the last time that your tooling let you down, that you lost work or you lost time? And hope, just hope that maybe that will be the last time that ever happens. Thank you.

[00:18:25][applause] If you want to be part of the the alpha, you can go to this link and sign up. Um, or you can come talk to me and we can figure it out.

[00:18:36]>> All right. Uh we have some time for questions. So you if you have a question please raise your hand.

[00:18:46]>> Hi Conrad. Big fan.

[00:18:48]>> Um so I wanted to ask, so is Delta DB then like a CRDT layer on top of Git? Um and so so ultimately like you do ultimately run git commit when you're um doing stuff or like is that the idea or >> it is, okay. And we've kind of gone back and forth. In our original conception, we're like, we're gonna own the whole thing, right? Right.

[00:19:09]>> Um but everyone uses Git and like snapshotting is is a good idea and so being able to integrate and merge between the two.

[00:19:16]>> Um one of the other neat things I should say, like if you have the version defined like this, >> you can kind of hide some of the deltas. Like it's common when you're working on stuff to have like a whole bunch of changes you don't care about. In a normal CRDT you have to share them because the history has to be shared. But if you snapshot at these Git revisions, you can like you can hide the fact that locally you have all these deltas because you can build forward on this sharable Git commit and and that kind of helps with things.

[00:19:43]>> So in your ordering um I guess I'm trying to relate this to what seems familiar to me, like a Lamport clock or a vector clock. Um do you need to know the set of participants, your actor IDs, up front, and is it flexible for sort of leaving and joining that set? Does that actually matter or that just doesn't apply here?

[00:20:02]>> Um, you don't need to know up front, like people can keep joining. Um, and there there's two ways of doing it. Um, so you'll see here this this delta, this version is like a Vec of delta IDs. Um, in the worst case you have one delta ID per actor that's been ever visible. Um, but you can, once you get one delta that kind of like encompasses all of those heads, you can kind of collapse it back down. There are reasons to do one or the other that help with conflict. Um I don't remember what we landed on right now. Um but in the, yeah, in the worst case this grows in like the number of people who have made a change to the document.

[00:20:37]>> Um how much extra metadata are you planning on storing alongside the CRDTs?

[00:20:42]Like I know it was talking about like uh like conversations with agents and things or like are you thinking about doing like compiling status and things like that?

[00:20:53]>> Yeah. Uh you've you've clearly read the blog post. So, one of the things that's cool about this is we now know for every character typed who typed it, whether it was a human or an agent. And we know what was going on at the time, like particularly if agents are contributing, you want to know what was the prompt that they had. And so, we can link all that metadata in. And because you can identify these characters by the version and the thing, you can always look up what was going on at that time. So the initial plan is to do kind of code conversation because it's like the the topic du jour is agent collaboration. Um but I think beyond that it'll be getting into things like, I mean, the other thing we've talked about, we do a lot of pair programming, is wouldn't it be cool if there was just like a transcript of like what we were talking about as we were writing this code. Um and I think that would be much more helpful than a git blame of a snapshot of the the codebase. So I don't know, if you have good ideas we should talk about it.

[00:21:49]Uh I assume you're aware of the uh many insertion interleaving problems with various list CRDTs that people like Martin Kleppmann have written about. Um and so I'm wondering like does your approach suffer from any of those? And then also it seems like with an ID-based approach there's probably there are like tombstones if something is deleted, >> right? And then is that like a potentially a storage concern if like I like paste some like megabytes or gigabytes of data and I'm like oops delete and like is that now like forever stored somewhere? >> I was actually having a slight conversation about that just before this. Um, so yeah, one of the things with CRDTs is you can never delete anything because you need to be able to refer to the the positions. This structure is a little bit better than an average one because as I said, like once you get to the skip point, you can throw away all of your old deltas. You don't need them to keep moving forward. So that's one advantage we have. Um, in terms of interleaving, it's on my to-do list to like actually test it. But yeah, I've I've read a lot of Martin's stuff.

[00:22:44]>> Nice. Let's do it. That'd be really fun.

[00:22:45]Thank you.

[00:22:47]>> Yeah. Hi, Conrad. I've been I've been toying with Jujutsu lately, so I'm kind of curious to hear your thoughts on how it compares. Like I know Jujutsu does um snapshots more like when you check the status.

[00:23:00]>> Sorry, I didn't hear the word of the >> uh Jujutsu is Oh, yeah. I was in Yeah.

[00:23:03]Yeah. I was just um curious, [clears throat] you know, you you clearly like know the space of of like all the VCSs. Um, so I'm curious what your thoughts on it are because with Jujutsu you only snapshot once in a while as opposed to what you're doing is continuous, right? Um, right. So, you know, >> so one of the one of the use cases I didn't really talk about um is that I believe that code review is not collaboration. It is, you know, a form of collaborating. You're working with someone, but it it's it's very async.

[00:23:31]It's very slow. And it works for some things. I think it's good for compliance review. It's good for like enforcement.

[00:23:38]What it's not good for is like putting two brains on a problem and having two people think about it at the same time and really working together. And obviously people are getting more and more used to that with working with an agent. But it turns out working with a human is like in many ways better than working with an agent. They're pretty clever. Um and the downside of the snapshotting approach which JJ gets from Git is that you can't get these fine-grained changes. So the thing I like about JJ is it's rounding off all the rough edges of Git. It's trying to make it harder to lose work. They're trying to solve a lot of the the Git problems, but but I struggle to see like the fundamental model shift that I want, which is going from the snapshotting thing to something much finer grained.

[00:24:12]Does that make sense?

[00:24:14]>> Yeah.

[00:24:15]>> Uh I'll ask one last question.

[00:24:18][clears throat] So >> sorry, someone raised a hand, but you go. Um, from the Zed UI, um, is it possible to actually see the actor ID that I'm being assigned and then the actor IDs of the other participants in this file?

[00:24:37]No, with a big asterisk. So, we've been working on this kind of on the side in in private and the reason for that is that we are working on it in a slightly different product surface area. Um in that product it is possible to see this stuff and actually we have like a whole debug view that shows you all the deltas as they go by. Um but it's not part of Zed yet, and one of the goals we have for like, you know, once we have the whole thing working and we've tested it, is to take the the Zed codebase and kind of port it over to using this new system, at which point yes, you'll be able to see that. >> How does this handle rebases? Like, so if I change a commit ID, uh does that lose all my collaboration? >> There is a uh like 5,000-line code function that tries to handle rebases.

[00:25:23]Um so what we say, what we try and do is we take all of the changes based on your current Git version. We record them as changes. We see the reset happen and then we replay those changes on top. And so kind of like Git, it's like a new branch with a new set of histories but like you don't lose all your work, um only through heroic programming effort.

[00:25:42][laughter] Okay, I think this is all the time we have for uh Q&A. Uh let's give a round of applause for Conrad. [applause] Let the magic happen.

[00:26:17]>> Maybe.

[00:26:19]>> No.

[00:26:24][laughter] >> All right.

[00:26:40]All right, our second speaker today is Gorgis Valance from Etched. Let's welcome Gorgis.

[00:26:48][applause] All right. Hello everyone. Um, yeah, I guess let me get straight into it. I feel like I have a lot of content to cover to uh to to give useful background here, but yeah, so my name is Gorgis. I work at a company called Etched. Um, and so so you guys have some quick background on what what this talk is going to be about. Uh, Etched is working on building a chip that is specialized for running transformer models. So with all the AI stuff going around uh you you may be aware that the the model architecture that is, you know, the dominant one these days are are transformer models. Uh and so we are working specifically on building chips that work well with these. Um and [clears throat] yes perfect all right and so where is my mouse? Let's see. Can I Okay, I can't seem to find the screen.

[00:27:59]Um Oh, is it to the bottom? It's to the bottom. Okay. Um so what I'm Okay, actually it's not it's not on mirroring. Um where is the screen right actually display?

[00:28:20]>> Yeah, there we go. So, I'll just go into displays and then a mirror. Perfect.

[00:28:28]All right.

[00:28:30]So, um what what I've been what have I been up to? What do I have to say about this? So, who I am is I work on the inference software stack at Etched.

[00:28:39]So, I've I joined Etched around two years ago. Uh and I've been kind of responsible for figuring out, okay, how are we actually going to program this damn chip? So uh you may be aware, you know, as with any kind of silicon there's a large stack of abstractions that you need to build up to actually get the ability to, you know, run useful workloads on these systems.

[00:29:00]Uh and so I kind of stumbled into this position of uh, you know, having the responsibility to do this. Uh and I think yeah, along along the way we discovered a lot of, worked out a lot of very interesting things. Um and yeah, I hope to share some of these with you today. Um yeah, so the way that I'm going to structure this talk, also excuse the slides. They're slightly AI generated but hopefully they're just there to convey the general shape of things. Um I'm going to give kind of an overview of how things work generally when programming GPUs because those are kind of the the standard way that people approach AI workloads these days. I'm going to talk a little bit about what the challenges are with the approach that GPU programming has taken. Um, and then I'm going to talk about how we decided to approach this problem and how we use Rust obviously um to to actually like to work on this.

[00:29:57]So kind of the concept here, like with the way that a GPU works is you implement your kernel in CUDA most likely, and then this kernel is a program, a static program that you compile, and then when you actually want to run work on the GPU you tell the GPU, okay, execute this kernel with these arguments, and you have a pretty heavy like kernel launch that happens where you tell the GPU, okay, go ahead and do this work, and then the GPU starts doing this work and kind of distributes this kernel over its internal grid of processors and internally schedules things and tries to basically to to complete your request of the computation that you want to run.

[00:30:54]Uh and then you know down here I'm not going to get into GPU details uh but there's there's finer grain structures that are highly sophisticated. Nvidia has done a lot of work on on this. I think we have a speaker from Nvidia later today actually. So uh it'll be interesting. Um but um yeah, this is the general structure of how this work is dispatched, and you know one of the key things that made GPUs powerful for AI is they are very very very good at matrix multiplies, which is what most AI workloads actually are if you really break it down to the very simplest piece. So for example, you have two matrices that have a fixed shape. You know that, like, in this stage of my model I'm going to have this shape of matrix and then this shape of matrix. I need to multiply them together. So I invoke a kernel that can then be very optimized ahead of time, and the scheduler on the GPU can then actually assign these kernels to the individual processors on the GPU and distribute the work.

[00:32:06]But when you're now looking at where the field has evolved, serving an LLM, like a state-of-the-art LLM, is actually a lot more complicated than just, you know, this like very simple square matmul shape that I describe here. And I'm going to briefly describe some examples here. I don't think I would have enough time to actually go into the details, but some of the things that you may have heard about are things like mixture-of-experts models, which is where you run a computation on the chip to then select, for each sort of piece of data, for each token that you have, which matmuls it wants to participate in. And so you have this highly dynamic routing that happens, which produces ragged shapes of work. So you'll have, say, you'll want to run 256 matmuls and some of them will only have a very narrow dimension on one side, others will have very wide dimensions, and this kind of fights this shape of, you know, I have a kernel that I launch with a certain grid size and nicely let the GPU divide up the work. A similar challenge is actually the self-attention computation.
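
To make "ragged shapes" concrete, here is a small sketch that groups tokens by the expert they were routed to. It assumes top-1 routing (one expert per token) and takes the assignments as given; real mixture-of-experts models compute them with a learned router and often pick several experts per token. The function names are illustrative.

```rust
/// Groups token indices by expert. `assignments[t]` is the expert chosen
/// for token `t` (assumed to be already computed by a router).
fn tokens_per_expert(assignments: &[usize], num_experts: usize) -> Vec<Vec<usize>> {
    let mut groups = vec![Vec::new(); num_experts];
    for (token, &expert) in assignments.iter().enumerate() {
        groups[expert].push(token);
    }
    groups
}

/// Row counts of the per-expert matmuls. Each expert multiplies a
/// (rows x hidden) activation block by its own weights, and `rows`
/// differs from expert to expert and from batch to batch.
fn ragged_rows(assignments: &[usize], num_experts: usize) -> Vec<usize> {
    tokens_per_expert(assignments, num_experts)
        .iter()
        .map(Vec::len)
        .collect()
}
```

For eight tokens routed as `[0, 2, 2, 2, 0, 2, 3, 2]` across four experts, the row counts are `[2, 0, 5, 1]`: one wide matmul, two narrow ones, and one expert with no work at all.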

[00:33:27]For those not familiar, this is the key piece of what goes into a transformer. There we have similar challenges: if you're serving production inference workloads, you have a bunch of users that you want to batch together to get good usage out of your chip. These users will have different shapes of workloads because, for example, they'll have talked to the model for different amounts of time. Or they will be in entirely different phases of the workload: either in the prefill phase, where you just submitted a bunch of work to the model and it now needs to compute over the input that you gave it, or in the decode phase, where the model generates outputs in a tight loop.

[00:34:12]So these are the challenges that you face when you program via CUDA or via Torch, which is one of the higher-level libraries built on top of it. One thing that people have tried here, and excuse the bad diagram, is this concept of a mega kernel, where you say: okay, I'm just going to do it myself. I'm going to launch one kernel that occupies my entire processor grid on the GPU, and then I'm going to have a bunch of scheduling code that is also implemented in CUDA itself, or whatever language compiles to the underlying PTX, which is sort of the GPU ISA. Then you basically have persistent tasks that schedule a bunch of work, you dynamically feed them, you unify this entire computation, and you really make use of every bit of raggedness and every free cycle that you have in this computation.

[00:35:20]So going into how we wanted to design our system, it was key that we have the ability to say: a batch comes in, which is sort of the fundamental unit, meaning I want to run a forward pass through my model. The key decision that we made is that in your Rust there is no compilation of instructions ahead of time per se, and no device-specific DSL. Instead, you have Rust code that runs at runtime, when the work comes in, and generates and issues the commands at a very granular level. Imagine that all the individual commands of the ISA are issued at runtime to exactly fit the shape of the workload that you want to run.

[00:36:17]So let me give an example here.

[00:36:20]So the slides are going to be slightly pseudocode, but mostly these are taken from our codebase directly. To walk you through the example here: we have this fn forward, which is an async function, so we tightly built things into the capabilities that Rust provides. You have this context that you carry around and you have an input, say the batch that you want to execute. Let's say we're just programming a single chip here. So we select the chip where we want to issue commands, and then we say, okay, we will run these operations. We're going to invoke, for example, a matmul kernel with these runtime parameters, and then we're going to invoke a nonlinear kernel. Again, these kernels are not precompiled programs that live on the GPU. Instead, they're functions on the host. They're functions in your Rust program that get called and that at some point emit instructions into this context, which are then submitted as the full body of work, either at the end or incrementally, for the accelerator to execute.
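As a rough illustration of the idea described here (kernels as ordinary host functions that emit instructions into a context), the following is a minimal sketch. Every name in it (`Instr`, `Context`, `Operand`, `matmul`, `nonlinear`, `forward`) is hypothetical and not taken from the speaker's codebase, the instruction set is invented, and the function is synchronous for brevity even though the talk describes an async one.

```rust
// Hypothetical instruction set; the real ISA is not described in the talk.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Matmul { m: usize, k: usize, n: usize },
    Nonlinear { len: usize },
}

// Host-side context that collects the instructions destined for one chip.
#[derive(Default)]
struct Context {
    queue: Vec<Instr>,
}

// Runtime description of a tensor: only its shape is tracked on the host.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Operand {
    rows: usize,
    cols: usize,
}

// A "kernel" is a plain host function that emits instructions at runtime.
fn matmul(ctx: &mut Context, a: Operand, b: Operand) -> Operand {
    assert_eq!(a.cols, b.rows, "inner dimensions must match");
    ctx.queue.push(Instr::Matmul { m: a.rows, k: a.cols, n: b.cols });
    Operand { rows: a.rows, cols: b.cols }
}

// Element-wise kernel: one instruction covering every element of the input.
fn nonlinear(ctx: &mut Context, x: Operand) -> Operand {
    ctx.queue.push(Instr::Nonlinear { len: x.rows * x.cols });
    x
}

// Forward pass for one batch: the instruction stream is shaped by the
// runtime dimensions of the batch, not fixed ahead of time.
fn forward(ctx: &mut Context, batch: Operand, weights: Operand) -> Operand {
    let hidden = matmul(ctx, batch, weights);
    nonlinear(ctx, hidden)
}
```

Calling `forward` with a 3x8 batch and 8x16 weights leaves two instructions in `ctx.queue`, sized for exactly that batch; a different batch shape would produce a different stream without any recompilation.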

[00:37:44]So basically you have a bunch of threads on the host that are doing this work, and they're figuring out: okay, I have this batch, I want to do these operations.

[00:37:55]They submit them into a bunch of queues. The chip goes out and fetches them by doing DMA operations over PCI Express, and then it executes them in the firmware that we developed.

[00:38:09]So let me walk through some shapes of how this looks, just so we can get a feel for this. Again, the simplest case is: I have my Rust code and I invoke my matmul kernel. This one is very simple; you could do this one on a GPU in the same shape as well.

[00:38:34]Let me skip this one and go straight to something more complicated. I can have loops. I can have arbitrary control flow. For example, this implements a block-diagonal matmul, which consists of a bunch of smaller matmuls that I want to do. Here I can just have a pure Rust function: I iterate through my weights for this matrix multiply and I invoke the matmul kernel itself. So you can see this calls out to the kernel that we saw previously and keeps track of some state, which can be completely dynamic. We can slice and dice things however we wish, and then we construct our output operand, which is our mechanism for representing the values that flow through, effectively the tensors that you have at runtime that you pass into these other kernels.
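The loop described above can be sketched as follows. This is an assumption-laden illustration, not the speaker's code: `MatmulCmd`, `Context`, `Operand`, and `block_diagonal_matmul` are hypothetical names, and each block is described only by its `(k, n)` weight shape. The point is that an ordinary Rust `for` loop decides at runtime how many matmul commands are emitted and where each one reads and writes.

```rust
// Hypothetical command: multiply an (m x k) input slice starting at column
// `in_col` by a (k x n) weight block, writing at output column `out_col`.
#[derive(Debug, Clone, PartialEq)]
struct MatmulCmd {
    m: usize,
    k: usize,
    n: usize,
    in_col: usize,
    out_col: usize,
}

// Host-side context collecting the emitted commands.
#[derive(Default)]
struct Context {
    queue: Vec<MatmulCmd>,
}

// Shape-only description of a runtime tensor.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Operand {
    rows: usize,
    cols: usize,
}

// Block-diagonal matmul as plain host code: one small matmul per block,
// with running column offsets tracked as ordinary local state.
fn block_diagonal_matmul(ctx: &mut Context, x: Operand, blocks: &[(usize, usize)]) -> Operand {
    let (mut in_col, mut out_col) = (0, 0);
    for &(k, n) in blocks {
        ctx.queue.push(MatmulCmd { m: x.rows, k, n, in_col, out_col });
        in_col += k;
        out_col += n;
    }
    assert_eq!(in_col, x.cols, "blocks must cover every input column");
    // The output operand describes the concatenated result of all blocks.
    Operand { rows: x.rows, cols: out_col }
}
```

With a 4x6 input and blocks `[(2, 3), (4, 5)]`, this emits two commands and returns a 4x8 output operand.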

[00:39:27]Another thing that we do in this same way is collective operations. For example, to implement a reduce-scatter, which is where you have two tensors on two chips and you want to add them up and have the results be distributed between the two, you again invoke these smaller kernels, and they all compose together into one stream of work that is sent to the chip for execution. You can see we use builder patterns here to make it much easier to build up complexity and have more and more features added to these things as time goes on.
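For readers unfamiliar with the collective itself, here is a small sketch of what a two-chip reduce-scatter computes. It models only the arithmetic on plain host vectors; it does not show how the speaker's system emits instructions for it, and the function name and the choice of an even split are assumptions made for illustration.

```rust
// Two-participant reduce-scatter on host vectors: sum element-wise, then
// give the first half of the sum to chip 0 and the second half to chip 1.
fn reduce_scatter(chip0: &[f32], chip1: &[f32]) -> (Vec<f32>, Vec<f32>) {
    assert_eq!(chip0.len(), chip1.len(), "both chips must hold equal-sized tensors");
    let summed: Vec<f32> = chip0.iter().zip(chip1).map(|(a, b)| a + b).collect();
    let mid = summed.len() / 2;
    (summed[..mid].to_vec(), summed[mid..].to_vec())
}
```

After the operation each chip holds a different slice of the summed tensor, which is what "add them up and have the results be distributed between the two" refers to.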

[00:40:07]And then one of the most complicated kernels that we have, and this one is very much pseudocode, is the attention kernel. But fundamentally, again, we're running on the host, we have full access to doing arbitrary work, and we just decide, depending on each request that is in our batch, what kind of work we are doing. We do some dynamic planning work based on this, then tile it dynamically and basically run a full schedule, but entirely running on the host and entirely debuggable. You can just look at what it's doing, and you can reach in deeply and modify it in whatever way you want.

[00:40:55]There's a lot of other stuff here as well that I won't really have time to get into, but maybe just some samples. One thing that we've worked on, for example: I described the system as very dynamic, with all this code running whenever you get a request in. But really the time scales here are single-digit milliseconds: your work comes in, and within a couple of milliseconds the accelerator computes the entire forward pass of your model. So you have to be very, very fast. Your code for generating these instructions has to be very fast.

[00:41:33]That's one piece. You also need to scale this out to much larger domains: say that you have dozens or hundreds of accelerators that you want to program together and have them perform computations. And things like dynamic packing for the mixture-of-experts models that I mentioned earlier are also possible in this model. So yeah, there's a lot of stuff here. I'm sure there are going to be questions, so I'm happy to come back to this and discuss it in more detail.

[00:42:10]So what are the takeaways here for us? The key challenge of programming an accelerator for these AI workloads is: how do I figure out when my work runs where, and how do I get the flexibility to do this programming in a way where I don't need to literally think about every single clock cycle that is happening? The static kernels that have been how people approached this work before are great for very regular shapes of large inputs, but when you go into these highly dynamic workloads, you want a different model. People are trying to break out of this in GPU land, and this is our approach to how we broke out of it for our chip.

[00:43:04]At the end, I'll have a plug for Etched. If you are interested in accelerators, in Rust, in AI models, or any of the things adjacent here, we're always looking for talented Rust people. I'm sure there are good people here who would complement our team very well. So just come talk to me after the talks, email me, or look at our website. It would be great to talk to you.

[00:43:35]>> All right. Thank you. [applause] Anyone has questions for >> Yep.

[00:43:46]>> Is this a compiler? Like, are there optimization phases and things like that?

[00:43:50]>> No. So there's no compiler per se. Instead, it's really runtime Rust code that emits the instructions. This is something where the design of our hardware is a little bit different from a GPU, which allows us to do this.

[00:44:06]First off, the hardware operations themselves are generally higher level. So the number of instructions you need to issue is generally smaller, and they can do more with one instruction. The hardware also has strong support for automatic operator fusion, where things will naturally form pipelines. If you look at, for example, this slide: you take the output of this matmul and you pass it directly into this nonlinear kernel, and the hardware will automatically overlap these, so that as the matmul is happening, the nonlinearity will be executing directly after it. This allows us to make this programming model manageable without having a compiler.

[00:44:58]All right.

[00:45:00]Another question here.

[00:45:06]I think your last answer sort of answered it slightly, but I was curious about the latency here, and maybe that operator fusion solves some of this. I was wondering if the take was that you have enough work concurrently that you can fill that pipeline, so any one individual stream of work might be slower, but that's okay because you've got the parallelism, or whether you are actually trying to keep the latency down.

[00:45:29]>> So, we are trying to, and we do, keep the latency down. There are a couple of ways in which we do this, which are themselves non-trivial. One is, for example, say that you're serving a model and you have an outer loop that is running and accepting work. You have a bunch of users that are running and generating tokens. What you can do is slightly forward-looking work. For example, if during your current forward pass you can predict with high certainty what the shape of the next forward pass is going to be, you can already assemble the tokens of your batch and submit this into the pipeline for execution, and then only backfill the small amounts of dynamic data when you actually need to start execution. This way you can get the latency between finishing one batch of work and proceeding with the next batch to be basically as low as you want.

[00:46:26]And then for the latency of the instruction fetch itself, you're going over PCI Express, so there's really only a couple of microseconds that you need here. When you control the full stack, you get the flexibility to do these highly dynamic and highly intertwined operations.

[00:46:45]Are you Oh, can you hear me?

[00:46:47]>> Yeah.

[00:46:47]>> Yeah. Yeah. So, when you're running the Rust compiler, are you fusing operations at compile time?

[00:46:55]>> No. We do not do anything fancy with the Rust compiler itself. We don't even have a proc macro for any of this. Instead, the way we approach things is that when there is a need to do static planning, that planning is captured as part of the kernel that you implement in your Rust code, and then we rely on the runtime caching that I briefly mentioned, where basically we will cache a pre-planned way to schedule some kernel, or fuse some operations between them if really necessary. This is something we can apply surgically when things don't work well by default, but we don't necessarily need it for everything.

[00:47:48]>> Are you able to model and submit conditional workloads? What you showed so far is a lot of one operation following after the other. What are... Oh, you had that on the slide.

[00:48:00]>> So this guy. Exactly, that's one of the main benefits. Scheduling attention is one of the most complicated things when you're actually serving these AI workloads, specifically because you want to batch together a lot of different users that are using the system concurrently, so that you can get high arithmetic intensity during your big matrix multiplies. You want to fetch things from memory only once, so the more things you can batch together, the better utilization you can get of your FLOPs. But during attention this becomes very complicated, because you can't mix different users together. So this is our most complicated kernel; in reality it's more like hundreds of lines, not like this. Here we have conditional planning and scheduling, at runtime, of the different operations in this pipeline, which we build up based on the shape of each user's data. We estimate how long it will take on the hardware, and because the hardware is structured in a way that is very predictable, we can use that to plan the other work that needs to happen around it, so that we achieve as much overlap as possible. And here we do in fact rely heavily on the caching functionality. For example, within a single forward pass of the model, all the layers will structurally perform the same attention computation, so we don't want to redo this planning work. Instead, we do it once at the beginning and then cache it for the rest of the instructions that will be part of that forward pass.
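The plan-once-then-cache idea from this answer can be sketched like this. It is a simplified illustration under stated assumptions: the `BatchShape`, `AttentionPlan`, and `PlanCache` types, the fixed `TILE` size, and the tiling rule are all invented for the example, and the real planner (which the speaker says also estimates hardware timings) is not shown.

```rust
use std::collections::HashMap;

// Hypothetical cache key: the per-request sequence lengths of a batch.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct BatchShape {
    seq_lens: Vec<usize>,
}

// Hypothetical plan: a list of (request index, tile length) work items.
#[derive(Debug, Clone, PartialEq)]
struct AttentionPlan {
    tiles: Vec<(usize, usize)>,
}

// Assumed tile size, chosen only for illustration.
const TILE: usize = 128;

// Stand-in for the expensive planning step: split each request's sequence
// into tiles of at most TILE tokens, never mixing different requests.
fn plan_attention(shape: &BatchShape) -> AttentionPlan {
    let mut tiles = Vec::new();
    for (req, &len) in shape.seq_lens.iter().enumerate() {
        let mut remaining = len;
        while remaining > 0 {
            let t = remaining.min(TILE);
            tiles.push((req, t));
            remaining -= t;
        }
    }
    AttentionPlan { tiles }
}

// Cache keyed by batch shape; `misses` counts how often planning really ran.
#[derive(Default)]
struct PlanCache {
    plans: HashMap<BatchShape, AttentionPlan>,
    misses: usize,
}

impl PlanCache {
    // Plan on the first request for a shape, then reuse the stored plan.
    fn get_or_plan(&mut self, shape: &BatchShape) -> &AttentionPlan {
        if !self.plans.contains_key(shape) {
            self.misses += 1;
            self.plans.insert(shape.clone(), plan_attention(shape));
        }
        &self.plans[shape]
    }
}
```

If every layer of a forward pass calls `get_or_plan` with the same `BatchShape`, planning runs once and the remaining layers hit the cache.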

[00:49:48]>> Yeah, a real quick one. I was just curious about the proprietary nature of it. Is this something that is accelerated by Etched hardware, or is it something that could be ported to something else? >> So it's more that we built this model for Etched hardware.

[00:50:07]I think getting something like this to work on GPUs is going to be complicated.

[00:50:13]In terms of the proprietary nature of it, we're hoping to release this as an open source project, because practically, to program the chip you will need to write these kernels yourself, so you will need very deep access. One of the things that annoys me when I work with CUDA and Nvidia's stuff is that it's all kind of precompiled blobs that you need to use. So we're trying to be more open here: the hardware is off limits, but in all the software you get to control things down to the level that you want, tailor things, and fully squeeze out the hardware.

[00:50:51]All right, let's give another round of applause for Gorgius.

[00:50:56]>> Thank you.

[00:51:15]>> [clears throat and cough] >> Okay. Um, thank you very much Deorgus for this presentation. And our next speaker is Emma Smith who works at NVIDIA and is one of the CPython core contributors.

[00:51:32]Please welcome Emma.

[00:51:45]That's over here.

[00:52:24]Okay, excellent. So hi everyone, my name is Emma. I am here to talk about Rust for CPython: making Python safer and more robust for everyone. First, a little bit about me. I'm a Python core developer and I'm on the Python Security Response Team. I've been writing Rust for over eight years now.

[00:52:49]I've authored two Python Enhancement Proposals, which are roughly Python's equivalent of Rust RFCs. I work at NVIDIA on Python open source and packaging. And I'm the mother of an adorable cat, pictured here. So let's talk about motivation. Why do we care about Rust in Python? Maybe this is an easy argument to make in this room, but I think it's important to talk about. If you take a look at this code here, this is a Python example. I apologize, most of my code examples will be in Python. Here we have some Python code which is a generic example of a common pattern: you want to take in some type of message. It could be a WebSocket message. It could be an HTTP request. And you want to serve a bunch of requests from a server. So you have some generator that takes in all these messages and gives you the next message. Then you have some function like do work here, which in this case is just printing the message. At the bottom you can see that there's a ThreadPoolExecutor, which in Python is a way to create a thread pool and run functions in it. And at the very bottom you can see that we are taking the next message and calling our do work function on it. And if we run this code on Python 3.14t, which is the latest released version of Python, in its free-threaded build... How many people are familiar with the global interpreter lock in Python? Okay.

[00:54:27][clears throat] A good number of hands.

[00:54:29]Excellent. Okay. So the free-threaded interpreter does not have a global interpreter lock, so multiple threads can run in parallel. And if we run this, oh, we get a fatal Python error, an invalid frame, and the process was aborted. Well, what happened? If we look at this code, it turns out that right here we actually have a data race. As a show of hands, how many people here are familiar with data races? Okay, a good number of hands. I will go through this quickly then. To talk about data races, I think it's important to talk about the C in CPython. CPython is the default Python implementation. There are other Python implementations such as PyPy, but if you run the python executable, you're running CPython. And as the name might imply, there is a lot of C code in CPython. There's actually over a million lines of C code in CPython, which is a lot to manage. C has been very useful for CPython. CPython is over 30 years old, and C has been great for performance, for portability, and for compatibility. But I want to talk about undefined behavior today. If we take a look at data races, the C standard gives this definition: the execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and this results in undefined behavior. So how many people here are familiar with undefined behavior? I assume most people. Okay, great.

[00:56:15]Awesome. The C standard also gives a definition of undefined behavior. I think it's important to talk about this in C because undefined behavior is different in C and Rust, though the meanings are similar. The C standard gives this definition: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements. Which... what does that mean? [laughter] So I like to think of undefined behavior as an error in code which may or may not be caught, and it may be caught at compile time or at runtime.

[00:56:55]It can manifest in a bunch of different ways. If your program runs correctly, maybe you got lucky, but you might also have a hidden bug that will not work for some people. So is it lucky? Program crashes, corrupted memory. You're all Rust programmers, so you understand why undefined behavior is bad. There's this quote from Chris Lattner, who's the creator of LLVM and Clang, in which he says there's no reliable way to determine if a large C codebase contains undefined behavior. This is kind of the Achilles heel of C, maybe; that's one way of thinking about it. But it's a problem. And I want to take a brief detour, enough undefined behavior for right now, to talk about Python and some of the things happening in Python.

[00:57:45]So how many people here have used Python, maybe in the last year?

[00:57:51]Okay, vast majority of people. Awesome.

[00:57:53]So Python has a lot of major changes happening. There's removing the global interpreter lock, like we alluded to. There's subinterpreters, where you can run multiple Python interpreters in the same process, sometimes concurrently. And then there's also the JIT, which is currently on pause.

[00:58:13]But it is also changing a lot about how the interpreter is written. I don't want to talk about these changes in terms of we shouldn't do them, because I think we absolutely should. I think it's important to make Python faster and more scalable. However, there are some considerations when making a lot of big changes. Over the last year, over 500,000 lines of C code have been modified in CPython, which, if you remember the slide that I showed you earlier, is about half of all the C code, which is kind of scary. And it becomes a little bit more concerning: this is a USENIX study which found that most vulnerabilities in code come from new or recently changed code. You can see that there's actually an exponential fit here, where exponentially more vulnerabilities and bugs come from newer code. So let's review. It's more or less impossible to avoid undefined behavior in C. Undefined behavior can lead to bugs and security issues. Those tend to be in new code, and CPython has a lot of new or recently changed code due to these major undertakings.

[00:59:27]Um, so you might be a little concerned.

[00:59:30]I also want to talk about mitigations and why mitigations are not enough. When you're working on a large C project, there are a lot of things you can do to try to avoid undefined behavior. This can be looking at code harder, increasing code review. We can't really do that, because the vast majority of CPython maintainers are volunteers. We have a finite budget of review time, and unfortunately we're not going to have a hundred extra people show up overnight. Another thing you can do is fuzzing, which we are already doing. We're part of OSS-Fuzz, and presumably other people are running CPython through other fuzzers. You can also use sanitizers, which can check for things like out-of-bounds reads and data races, and we are already doing that on every PR to CPython. And finally, perhaps the more recent development is LLM review, where you can say: LLM, please stare at this code and tell me where the bugs are. We are also already doing that, both the core team and contributors, and so we also get a lot of LLM-assisted security reports. I think the takeaway from this is that fuzzers, LLMs, and sanitizers partially mitigate undefined behavior, but they aren't going to be able to find every instance of it.

[01:01:02]Um, and you can try to fight against undefined behavior in your C codebase.

[01:01:07]But that has a cost. This is a chart of issues in the CPython repo with the type-crash label, which means that the interpreter has crashed. Not all of these are necessarily due to undefined behavior, but the vast majority are. If you look at the graph, unfortunately there is a significant increase year over year for the past several years. This means that the maintainers have to spend more time triaging all of these crash issues and less time working on cool new features or fixing other bugs. Another data point that I want to point out is from Seth Larson, who is the Python Security Developer-in-Residence and is basically in charge of security for the Python programming language. We saw a 4x increase year over year in the volume of security reports to the Python Security Response Team, which is a lot to triage. Some of this may be logic bugs, but a lot of it is also undefined behavior. So one of the goals of introducing Rust is to reduce the burden of maintenance. That's kind of where the motivation for Rust for CPython originates. We're a group of Python core developers, maintainers of PyO3, which is a way to write Python extensions in Rust. It's really awesome.

[01:02:43]I highly recommend you look that up. Um, and also other Python community members.

[01:02:49]Um, I also want to thank the Rust team.

[01:02:52]We've been talking with them for several months now and they've been super helpful. They invited me to go to the Netherlands for the Rust All Hands, and we had a lot of really great discussions. This photo is from that, and I just wanted to thank them for all of the help that they've been providing. I'm going to skip over the benefits of adopting Rust; I think most people here are probably familiar with them. But one of the most important ones that I want to highlight is that data races are impossible in safe Rust. With the advent of free-threaded Python, there's going to be more parallel Python code, and avoiding data races will become more and more important. So the fact that Rust is able to model this in a safe way is really powerful for us.
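To make the claim about safe Rust concrete, here is a minimal sketch loosely mirroring the earlier Python pattern of several worker threads pulling the next message from one shared source. It is not a port of the speaker's slide code: the function name `count_messages` and the use of a shared index as the "generator" are assumptions for illustration. The relevant point is that safe Rust will not compile unsynchronized mutation of the shared cursor from several threads, so the sharing has to go through a type such as `Mutex`.

```rust
use std::sync::Mutex;
use std::thread;

// Several workers pull the next message index from one shared cursor.
// Returns how many messages were handled in total.
fn count_messages(messages: &[&str], workers: usize) -> usize {
    let next = Mutex::new(0usize); // shared cursor, standing in for the generator
    let handled = Mutex::new(0usize); // shared counter of finished messages
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                // Take the next index while holding the lock, then release it.
                let idx = {
                    let mut n = next.lock().unwrap();
                    let i = *n;
                    *n += 1;
                    i
                };
                if idx >= messages.len() {
                    break;
                }
                let _msg = messages[idx]; // the per-message work would go here
                *handled.lock().unwrap() += 1;
            });
        }
    });
    // All scoped threads have finished, so the mutex can be consumed.
    handled.into_inner().unwrap()
}
```

Replacing the `Mutex<usize>` cursor with a plain `usize` that each closure mutates is rejected at compile time, which is the property being highlighted; the lock itself is just one of several safe ways to share the state.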

[01:03:48]But unfortunately Rust is not perfect, I'm sorry to say. Hopefully this is not, you know, >> [laughter] >> but nothing is perfect. Python builds and runs pretty much everywhere. Apparently it's part of the glibc build process. I've learned a lot about where people end up using Python.

[01:04:10]It's also part of the Rust bootstrap process. If anyone here has built the Rust compiler, you download your first-stage compiler via Python.

[01:04:21]Um, and so if Rust doesn't work for some of those people, that might be an issue.

[01:04:27]If somebody has a custom machine that they hand-etched hardware for, then maybe we don't want to block progress for just that one person. But if there are a lot of users, that's a concern. One big concern is that not all of the core team knows Rust. So a lot of what we've been doing is talking to core team members about how we can introduce Rust without preventing them from contributing. The other thing is that we can't rewrite a million lines of C code in Rust overnight. Unfortunately, I'm not a million-x developer. Maybe the next frontier LLMs will be. And then the other thing is that Rust can still have unsafety, using unsafe. This is a screenshot from the Rustonomicon. Undefined behavior is bad and you don't want to invoke it, so avoid unsafe if possible, but you can't always do that, especially when you're introducing Rust into a largely C codebase. So that's another area where we need to be very careful about how we introduce Rust. But something that's been very encouraging is the experience of the Android project. They wrote a blog post called Rust in Android: move fast and fix things. The Android project adopted Rust many years ago, and they found a 1000x reduction in memory safety vulnerabilities per million lines of code, which is amazing.

[01:06:06]And if we can reduce the type-crash bugs by 1000x, that would make me very happy. We'll see.

[01:06:15]But I think the other thing that really makes Rust shine is that not only does it reduce vulnerabilities, it also increases confidence. With those big changes like the JIT, free threading, and subinterpreters, it's important that we have confidence in the changes we're making to CPython. This chart shows that Rust code, especially for large changes, had fewer revisions compared to C++. This, I think, is one of the most important selling points of Rust in my mind, beyond memory safety. So let's talk concretely about the proposal. Rust for CPython is currently a proposal. We're going to introduce Rust as an experiment, hopefully. The reason for this is that if there's a million Python users out there who will never be able to run Rust, or can't install Rust for whatever reason, then we can't really break them, unfortunately. So we may need to remove Rust, but hopefully not. Any Rust code will have a C fallback. We're also considering a Python fallback. That's mostly so that if you don't have Rust on your system, at least for the first little while, there is a fallback and you can still use whatever module we implement in Rust. This experiment approach worked really well for the Linux kernel. As I'm sure many of you know, they ended their experiment at the end of last year, which was really exciting to see.

[01:07:50]Uh, the other thing is that we want to target a significant improvement. While I'm sure most people here would probably agree that introducing Rust is a significant improvement in and of itself, I think a lot of people in the Python community want to see other improvements beyond that. Um, and so Python 3.16 is the current version in development that accepts new features. It's going to be released in October 2027. And the idea is that we will reimplement the zlib module, which is a compression module, using the zlib-rs crate. There's actually a recent blog post from the Trifecta Tech Foundation about zlib-rs and its use in Firefox. It was deployed so widely that they actually ended up triggering a CPU bug, and the post covers how they worked around that. Really interesting blog post. Highly recommend.

[01:08:44]But it also gives confidence that, you know, it's a very robust codebase. Um, it's also very fast. Python packages use zlib compression under the hood, and so in 3.16, hopefully, everyone's package installs will be faster, which would be pretty cool.

[01:09:06]Um, so we can talk briefly about the future. The main thing is, you know, we want to gather information from distributors about any platform issues and try to fix any issues related to that. Um, hopefully introduce Rust in more places. There are a lot of places across Python that I've talked to people about: the IO stack, the JSON module, XML, basically anywhere where we're parsing untrusted input. Um, and then it would be really cool to have a browser-grade HTML parser using Servo's html5ever. Um, you know, Python currently has a parser that is not HTML5 compliant.

[01:09:44]Um, and so it'd be really cool to use that. Uh, and then there's the memoryview type, which is a type in Python that you can use to express a view of a buffer. Um, and it's actually generic, but C doesn't really have a great way of modeling that, which means that there isn't a way for us to specialize some of the code paths in its implementation. And so some of them are kind of slower than they could be if we had generics and an easy way to implement generics.
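[Editor's note] The point about generics can be sketched in Rust. This is only an illustration of how a typed buffer view gets one specialized code path per element type through monomorphization; it is not CPython's memoryview implementation, and the names View, sum_bytes and sum_floats are made up for this example.

```rust
/// A borrowed, typed view over a buffer, generic over the element type.
/// (Hypothetical type for illustration, not CPython's memoryview.)
struct View<'a, T> {
    data: &'a [T],
}

impl<'a, T: Copy + Into<f64>> View<'a, T> {
    /// The compiler emits a separate, specialized copy of this loop
    /// for every concrete `T` it is used with.
    fn sum(&self) -> f64 {
        self.data.iter().map(|&x| x.into()).sum()
    }
}

/// Specialization for a buffer of unsigned bytes.
fn sum_bytes(data: &[u8]) -> f64 {
    View { data }.sum()
}

/// Specialization for a buffer of 32-bit floats.
fn sum_floats(data: &[f32]) -> f64 {
    View { data }.sum()
}

fn main() {
    println!("bytes:  {}", sum_bytes(&[1, 2, 3]));
    println!("floats: {}", sum_floats(&[1.5, 2.5]));
}
```

In C the same effect usually needs macros or hand-duplicated functions per element type, which is the maintenance cost the speaker is pointing at.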

[01:10:11]Um, so that's another place where Rust would make a lot of sense. Um, one idea that is kind of an open question is stabilizing an official Rust API. You know, for C we have a C API and a C ABI. And ideally we would like people to be able to write Rust extensions for multiple Python versions.

[01:10:33]Um, but stable APIs in Rust is a a can of worms which uh, you know, there was a lot of discussion about that at the all hands. Um, and you know, we'll see how that turns out. Um, and hopefully maybe eventually make rust required to build Python. That would be really cool. Um, that would allow us to kind of remove some of the fallbacks and ease maintenance and use Rust in more places.

[01:10:56]Um, so if you want to learn more, uh, this QR code will take you to our website. Um, or you can just type it in rust-4-cpython.com.

[01:11:04]Um, we have a discord you can join. And, uh, thank you very much.

[01:11:09][applause] All right, we have some time for questions.

[01:11:20]Hello. Um, so you mentioned earlier that the JIT effort in Python is currently paused. Does it have anything to do with the closure of the Faster CPython team at Microsoft, or is it some other stuff that's happening? Like, what's the context behind that? >> Um, so the Python Steering Council, which is the replacement for our BDFL, Guido, who retired, and is kind of the leadership of the project, made a post last week, I believe, saying that they wanted a Python Enhancement Proposal going over the JIT and its goals and things like that, and that until that proposal has been made and approved, the development of the JIT is paused.

[01:12:16]Hi, thanks for your talk. I had a question about you. You mentioned that there are some platforms that CPython supports but Rust does not. What are these kind of platforms and what would it take to either make Rust support them or drop support um for them in CPython?

[01:12:33]>> Yeah. Um, so I think one important thing to keep in mind is that there are platforms that Python officially supports, and then there are a lot of platforms that Python does not officially support but a lot of people use it on. Um, so I'm sure that people run Python on illumos or things like that, but that's not an officially supported platform. Um, and so there are platforms like, I think, HPPA, or, I'm trying to think of some of the others. There are some that I think only have GCC backends but not LLVM backends. And so one of the things that we're hoping to do is see if we can help with the GCC backend for Rust. That would be really exciting. I had a conversation with the maintainers of that at the all hands, and I think that will cover a lot of the platforms that Rust doesn't currently support. Um, but I'm sure that there are others out there that I have never even heard of. So the only way to find out about all of them is to introduce Rust, have somebody's build break, and then say, "Okay, come tell us about it, please." Unfortunately.

[01:13:55]Yeah.

[01:13:58]Uh, I have a question. So out of the core developers for CPython, are there a lot of people who are very against this proposed change, or are most people more or less okay with it?

[01:14:15]>> Um, so I think there's a wide variety, which is probably not surprising. Um, I think a lot of people have concerns. But as I've talked to them, I think they're no longer concerned about a lot of those things. Um, I haven't really held a poll, so I don't really want to speculate on, you know, the share, but I do know there are some people who think that it's a bad idea, there's a group of people who think it's a really great idea, and there are people in between.

[01:14:50]>> Great. Thank you.

[01:14:57]>> Hey, how's it going? So, um, for teams migrating to Rust, what parts of the borrow checker or memory management do you find people struggle with the most, and how do you overcome it?

[01:15:06]>> Um, sorry, the question's about memory management?

[01:15:10]>> Yeah.

[01:15:11]>> Um, yeah, memory management is tricky.

[01:15:13]So, Python has its own allocator, um, pymalloc, and meshing that with Rust is definitely one of the things that we were talking about at the all hands. Um, and a minimal allocator API is getting stabilized, I heard, which is exciting, and that will help. Um, so I think there's still a little bit more work to do to figure out exactly what that should look like.

[01:15:39]>> Great. I think those are all the questions we have. So, thank you very much to Emma.

[01:15:45][applause] Okay. Uh we have our last speaker for tonight, Marco Venovich, who's going to tell us about uh his experience of uh how he thinks we should go from looms to Rust.

[01:16:28]Can I grab the uh hand microphone because I'm a energetic kind of guy.

[01:16:33]>> All right, cool.

[01:16:43]>> Actually, I'm going to need this thing. Oh, wait. Hold on. I'm back.

[01:17:00]There's a little bit of prep involved.

[01:17:03]>> Got it.

[01:17:10]>> Yeah. Hold it down.

[01:17:13]Bottom one.

[01:18:42]All right. Hello. Awesome. Uh, I'm going to try not to be too loud, so maybe I'll reduce my volume a little bit. But, um, my name is I should also run a timer.

[01:18:50]Give me a second. This has been timed to 15 minutes sharp. So, uh, all right, cool. So my name is Marco and I'll talk about my background and all that in a second. This used to be a talk on compilers, and I used to give this talk at Tesla, and the whole point of this talk was to introduce new audiences to compiled languages and how they work. The thing is, that talk was an hour and a half. I needed to trim it to 15 minutes, and if anybody figures out the reference with the dragons and the eval apply, that's where that's from. Um, this talk became focused on the most important thing, which in my opinion is talking about what computers are. Um, so that's me. Those are some of the places where I worked or studied or whatever. These are some of the things I like and these are some of the languages that I like. Oh, one's missing. Hold on. Oh, there we go. Um, so what is a computer? In the 20s, a computer was a human who would compute.

[01:19:38]You would give them a mathematical formula, an algorithm if you will, and they would do computation for you. This is my dad. This is me with my dad.

[01:19:45]Pretend these are shot in the same decade. They are not. Uh, to answer what a computer is, we're going to have to take a trip to my home on this street. I asked my dad, "How do computers know how to think?" "They can't think," he said. I said, "Well, how do they know what to show on the screen then?" I saw these flashy lights and I thought there was a homunculus in there. "How do they know that?" And he said, "I don't understand your question." Thing is, I didn't either. This is music. These are signals. I actually have a signals EE degree. You can take a bunch of sine waves, you can sum them up, and when you sum up those sine waves you end up with a rich signal. You can actually do this ad infinitum and you get a square wave, potentially, or any arbitrary signal. Going from that to this is called a Fourier transform. And what a Fourier transform does for you is you take the time domain and you convert it to the frequency domain. You take any arbitrary signal and you get the individual amplitudes of each component.

[01:20:35]That's the formula. I did that a lot in college. It's not easy. This is a Michelson harmonic analyzer. This device is a small mechanical computer that allows you to take a pen and trace an arbitrary signal. And on the back of the paper, it shows you the individual amplitudes of each of those harmonics.
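[Editor's note] What the harmonic analyzer does mechanically can be sketched numerically. The following is a minimal sketch of a naive discrete Fourier transform, assuming the samples cover exactly one period of the signal; the test signal (a fundamental of amplitude 1.0 plus a third harmonic of amplitude 0.5) and the function names are made up for this example and are not from the talk's slides.

```rust
use std::f64::consts::PI;

/// Amplitude of harmonic `k` in `samples`, using a naive DFT.
/// Assumes the samples span exactly one period; the scaling is
/// valid for 0 < k < samples.len() / 2.
fn harmonic_amplitude(samples: &[f64], k: usize) -> f64 {
    let n = samples.len() as f64;
    let (mut re, mut im) = (0.0, 0.0);
    for (i, &x) in samples.iter().enumerate() {
        let angle = 2.0 * PI * (k as f64) * (i as f64) / n;
        re += x * angle.cos();
        im -= x * angle.sin();
    }
    // Scale so that a pure sine of amplitude A reports A.
    2.0 * (re * re + im * im).sqrt() / n
}

/// Time domain: sum two sine waves into one richer signal.
fn synthesize(n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| {
            let t = i as f64 / n as f64;
            1.0 * (2.0 * PI * 1.0 * t).sin() + 0.5 * (2.0 * PI * 3.0 * t).sin()
        })
        .collect()
}

fn main() {
    let samples = synthesize(64);
    // Frequency domain: recover the amplitude of each harmonic.
    for k in 1..=4 {
        println!("harmonic {}: {:.3}", k, harmonic_amplitude(&samples, k));
    }
}
```

Harmonics 1 and 3 come back with the amplitudes that went in, and the others come back as approximately zero, which is the "individual amplitudes" readout described above.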

[01:20:52]Pretty awesome thing. What is life?

[01:20:55]I love this meme. What I think life is is doing things we enjoy. Um I think uh that's what I the way I see life is thinking about Fibonacci numbers, thinking about ideas, thinking about mathematics. That's what we like.

[01:21:09]Formalizing those things is a thing we like. Computing we don't like. Do they look happy to you? Maybe the guy in the back, he's excited for the photo, but the rest of them aren't very happy for computing. Cool. So how do we go from this a very specific computer to this?

[01:21:25]Well, generalization. We generalize. This is a switch, a theoretical switch. These are implementations of a switch. If this is your abstract data type, those are the implementations of the abstract data type. The relay is really interesting, and this is actually my house. The relay is really interesting because it allows you to take current on the inputs and it allows you to connect two different switches. So what's really awesome with that is it is electrical, but it doesn't have to be. All right, these are logic gates. I hope most of you have seen logic gates, but we will reiterate. You can take theoretical switches, which you can implement with valves, to implement something called an OR gate. If you close this switch, this goes high. If you close this switch, this goes high. Same with the valves. If you open either one of those valves, water comes out.

[01:22:08]It's an OR gate. That is a theoretical OR gate. Uh, these are all the possible logic gates you can make. There's a billion more. These are the ones that are probably the most important ones.

[01:22:18]Cool. And I do want to prove this to you. You don't actually need, um. Are we dead? Are we not? Am I loud enough? All right. Um, you don't actually need to do this electrically. There's a lot of projects, and I'll share this presentation with you so you'll get the links, but there's a lot of projects which have achieved mechanical OR gates, mechanical logic gates. You can take those logic gates, take the individual cells of the switches that you've built logic gates into, and combine them as cells themselves to build an adder. And this is what an adder is. Um, you can build that yourself in a simulator. If you haven't worked in embedded, it's fun. The thing that we need to build a computer is memory. And this is called an SR latch.
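[Editor's note] The step from gates to an adder can be sketched in a few lines. This is a minimal sketch, assuming the standard full-adder construction (two XORs, two ANDs and an OR per bit) chained into a 4-bit ripple-carry adder; the slide's exact circuit is not visible in the transcript, and the function names are made up for this example.

```rust
// Gates modeled as functions on booleans; `!` stands in for a NOT gate.
fn and_gate(a: bool, b: bool) -> bool {
    a && b
}

fn or_gate(a: bool, b: bool) -> bool {
    a || b
}

/// XOR built only from AND, OR and NOT.
fn xor_gate(a: bool, b: bool) -> bool {
    or_gate(and_gate(a, !b), and_gate(!a, b))
}

/// One-bit full adder: returns (sum, carry_out).
fn full_adder(a: bool, b: bool, carry_in: bool) -> (bool, bool) {
    let partial = xor_gate(a, b);
    let sum = xor_gate(partial, carry_in);
    let carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in));
    (sum, carry_out)
}

/// Four full adders chained together: returns (4-bit result, final carry).
fn add4(a: u8, b: u8) -> (u8, bool) {
    let mut carry = false;
    let mut result = 0u8;
    for bit in 0..4 {
        let (s, c) = full_adder((a >> bit) & 1 == 1, (b >> bit) & 1 == 1, carry);
        if s {
            result |= 1 << bit;
        }
        carry = c;
    }
    (result, carry)
}

fn main() {
    println!("5 + 6  = {:?}", add4(5, 6));
    println!("15 + 1 = {:?}", add4(15, 1)); // overflows four bits
}
```

Nothing here depends on the gates being electrical; any mechanism that implements the three gate functions gives the same adder.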

[01:22:53]If you haven't had a computer engineering class, this is worth learning about, because this is the magic. You can take combinatorial logic, these things that have no brains, and they can now start to memorize, because if you set S, that sets that output, and when you set the reset, that resets it. Even when you release S, it is still sticky. There's memory. There's a brain. This is a Jacquard loom. And that's the joke for the title. Um, a Jacquard loom is a loom that has all these little cards that you can program by punching holes into them. And the way it works is, as those holes slot into these rods, the rods pull individual thread needles up and down, which allows you to do pre-programmed patterns as the loom works.
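[Editor's note] The "sticky" behavior of the SR latch can be simulated directly. This is a minimal sketch, assuming the common construction from two cross-coupled NOR gates (the slide's exact circuit is not visible in the transcript); the type and function names are made up for this example.

```rust
/// SR latch modeled as two cross-coupled NOR gates.
#[derive(Clone, Copy)]
struct SrLatch {
    q: bool,
    q_bar: bool,
}

impl SrLatch {
    fn new() -> Self {
        SrLatch { q: false, q_bar: true }
    }

    /// Apply the set/reset inputs and let the feedback loop settle.
    /// Returns the stored bit Q. Driving S and R high together is
    /// the forbidden input and is not handled specially here.
    fn step(&mut self, s: bool, r: bool) -> bool {
        for _ in 0..4 {
            let q = !(r || self.q_bar); // NOR gate 1
            let q_bar = !(s || q); // NOR gate 2
            self.q = q;
            self.q_bar = q_bar;
        }
        self.q
    }
}

/// Feed a sequence of (S, R) inputs and record Q after each one.
fn run(inputs: &[(bool, bool)]) -> Vec<bool> {
    let mut latch = SrLatch::new();
    inputs.iter().map(|&(s, r)| latch.step(s, r)).collect()
}

fn main() {
    // Press S, release it, press R, release it.
    let trace = run(&[(true, false), (false, false), (false, true), (false, false)]);
    println!("{:?}", trace);
}
```

The second entry of the trace is the interesting one: S has been released, yet Q stays set. That retained bit is the memory.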

[01:23:36]This was built in the 19th century. Um, reminds you of something? I don't know, maybe we're too young for this, but this is the thing. Um, this four punch card is the exact same thing as that Jacquard loom punch card. Okay, this is a computer. I drew this on my whiteboard.

[01:23:51]I'm sorry it is crappy. I hope you can see it. Um the whole point of this this computer that the thing I want to illustrate to you is if you can control these switches and if you can control that switch which wires whether maybe we're out of charge. All right.

[01:24:05]Sorry. So if you can control the red switch to pick whether you want the adder or the subtractor, and if you can control the blue switch to pick whether you want the output to go to register zero, those SR latches, or register one, you can build yourself a computer, and that Jacquard loom right here. Um, you put some lights on the output so we can see what we're doing. Um, the relay, it's an implementation of a switch, but remember, the switch was abstract. It may have been mechanical, wooden, cardboard, whatever. And with a relay we can actually build a really nice electrical computer. And it is the 50s. Oops, sorry. It is the 50s and we built ourselves a computer. It works. In modern computers we love to represent addresses this way. The reality is addresses are just huge arrays of cells. Oops, sorry. Huge arrays of cells and huge arrays of memory. And the memory controller kind of gives us this view that we like to reason about.
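[Editor's note] The whiteboard computer can be sketched as a tiny simulator. This is only an illustration under assumptions: each punched card is taken to carry two holes (the red switch picking adder or subtractor, the blue switch picking the destination register), and both arithmetic units are assumed to read register 0 and register 1 as inputs, which the transcript does not state. All names are made up for this example.

```rust
/// Red switch: which arithmetic unit drives the output.
#[derive(Clone, Copy)]
enum Op {
    Add,
    Sub,
}

/// Blue switch: which register latches the output.
#[derive(Clone, Copy)]
enum Dest {
    R0,
    R1,
}

/// One punched card: the two switch positions for one step.
#[derive(Clone, Copy)]
struct Card {
    op: Op,
    dest: Dest,
}

/// Two registers (banks of SR latches in the talk's picture).
struct Machine {
    r0: u8,
    r1: u8,
}

impl Machine {
    fn step(&mut self, card: Card) {
        // Assumption: both units take r0 and r1 as their inputs.
        let result = match card.op {
            Op::Add => self.r0.wrapping_add(self.r1),
            Op::Sub => self.r0.wrapping_sub(self.r1),
        };
        match card.dest {
            Dest::R0 => self.r0 = result,
            Dest::R1 => self.r1 = result,
        }
    }
}

/// Run a deck of cards and return the final register contents.
fn run(r0: u8, r1: u8, cards: &[Card]) -> (u8, u8) {
    let mut machine = Machine { r0, r1 };
    for &card in cards {
        machine.step(card);
    }
    (machine.r0, machine.r1)
}

/// A small hypothetical deck of four cards.
fn demo_program() -> [Card; 4] {
    [
        Card { op: Op::Add, dest: Dest::R0 },
        Card { op: Op::Add, dest: Dest::R1 },
        Card { op: Op::Add, dest: Dest::R0 },
        Card { op: Op::Sub, dest: Dest::R1 },
    ]
}

fn main() {
    println!("{:?}", run(1, 1, &demo_program()));
}
```

The deck is the program: changing the holes changes which switches close at each step, and nothing else about the machine changes.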

[01:24:52]But at the end of the day, it's just a bunch of SR latches. Okay. So the answer to this question: computers can't think. People think. Computers are machines. When the transistor came out, it just allowed us to put a bunch of them in one spot. Um, and well, what's a talk without a demo? So, I have a computer for you. This right here is an STM32F7. And I'm going to switch microphones.

[01:25:21]Sorry, I probably should have warned you of this.

[01:25:25]Cool. All right, we can uh we can we can hear we can hear each other. Awesome.

[01:25:29]Okay, so the reason I need to switch my computer is because I'm going to give you a demo. So the first thing, we're going to have to chat real fast. I have a background in embedded and I want to inspire you guys to do embedded. It's really fun and it's the best way to learn how computers work, in my opinion. Um, you know, with all these high-level programming languages, unless you get a feel for the memory, I think there's so much more to learn if you just go down to embedded. Cool. Okay. So modern CPUs have something called memory-mapped I/O. There's two ways to interface with input/output, and the way more common one is called memory-mapped I/O. Effectively there's a memory unit. So when the CPU says "let me read 0xF0," the memory unit will have an internal lookup table that says, "Oh, I know 0xF0. Everything between 0xF0 and 0xF10 means VGA controller. I'm going to push that to the VGA controller." The VGA controller hands the result back and the CPU has the result. This is a memory map. I am really sorry, that is probably way too tiny for you. Um, but the core point of the memory map is you can see this chunk, and this is, by the way, the memory map of this chip, and you can get this in the technical reference manual. I'm just going to yell, I guess. It has this block which has all the actual memory, the RAM that you would normally expect, but in this block it says peripherals. That's the memory-mapped I/O part. That's where your GPIO is. That's where your lights are, and so on and so forth. So, let's punch something in.
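[Editor's note] The lookup table the memory unit applies can be sketched as a range match on the address. The region boundaries below are the top-level Cortex-M (ARMv7-M) address map, which is an assumption about what the slide shows; a real chip's reference manual subdivides each region further. The enum and function names are made up for this example.

```rust
/// Coarse regions of the Cortex-M address space.
#[derive(Debug, PartialEq)]
enum Region {
    Code,
    Sram,
    Peripheral,
    Other,
}

/// Decide which block of the chip an address is routed to.
fn decode(addr: u32) -> Region {
    match addr {
        0x0000_0000..=0x1FFF_FFFF => Region::Code,
        0x2000_0000..=0x3FFF_FFFF => Region::Sram,
        0x4000_0000..=0x5FFF_FFFF => Region::Peripheral,
        _ => Region::Other,
    }
}

fn main() {
    // An address in RAM, and a hypothetical GPIO register address.
    for addr in [0x2000_0100u32, 0x4002_0400] {
        println!("{:#010x} -> {:?}", addr, decode(addr));
    }
}
```

From the CPU's point of view a read or write looks the same in every region; only the decoding decides whether it lands in a RAM cell or in a peripheral register that drives a pin.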

[01:26:46]Let's, you know, let's let's make our loom. Let's let's make a punch card and let's stick that in memory and let's set those registers to be that. So, to achieve that, um let's go to GDB. Um and I'm intentionally obscuring what that means. Uh don't worry, I'll show you. Uh so what we are what we've done there is we've effectively punched in those uh oops. So I'm so so so sorry. I realize this is not working. Okay, we're going to have to do this a little bit uglier.

[01:27:10]We're going to have to lose out on animations. But what I've done is I've set those same values that we've mentioned here, 0x and 0x7, in GDB on this chip at address 0x2000100 and so on and so forth. What I've also done is I've set the program counter to be at that address. Ignore the stack pointer.

[01:27:30]It's unrelated. You need it because this chip is weird, but don't worry about it.

[01:27:33]So, what do we expect to see here? Well, if I print the register, the program counter, I will see that it has not moved. If I do more of continue, I will see it has not moved. Very, very annoying. Very boring. Well, why? Let's try to understand what happened here. I have a really sweet animation. I don't want to lose out on this. So this is what we punched in, in binary, right? If you actually go to the manual for the Cortex-M7 chip, the one that's on this, you'll find that these bits map to this address. So these bits are here. In fact, it's telling us which instruction we're running. These bits are the opcode.

[01:28:16]>> Okay, so there's more animations. Um, it says we should extend by one, which we do. It says we should sign extend, which we do. This is what that means in hex.

[01:28:24]That's that. Does anybody know what number this is? Fast math? Nope. It is minus four. Um, we're effectively jumping four bytes back. That branch instruction is a jump on this particular chip. The reason we're jumping four bytes back and still staying in place is because of the way the fetch, decode, execute cycle works on this chip: the instruction pointer is actually running four bytes in front of you. There are two instructions ahead, the fetch and the decode, while we are on execute, and each instruction is two bytes wide. Two times two is four, so we actually need to jump four bytes back. Cool. Let's make something more awesome. Let's go a layer higher. Remember, we started at switches. That's the point. Um, let's make this program. What does this program do? It's an arithmetic sum from one through five. Very boring. You can actually compile this, and I did do this by hand. I did not get Claude to do this. I got Claude to take the PDF and spit this diagram out, but that was compiled by hand, and it is not hard to do if you know what to look for in the manual. You can do that yourself. And let's run it. So these are all real examples, and let's see if they break. So you can see what I've done here. Oh, there's no more animations. So we'll have to live with this, I guess.
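[Editor's note] The decoding walked through above can be reproduced in code. This sketch assumes the halfword punched in was 0xE7FE, the usual Thumb encoding of a branch-to-self (the exact value is garbled in the transcript), and follows the 16-bit unconditional branch format: top five bits 11100, then an 11-bit immediate that is shifted left by one and sign-extended. The function names are made up for this example.

```rust
/// Decode a 16-bit Thumb unconditional branch and return its byte
/// offset, or None if the halfword is not that instruction.
fn decode_branch_offset(halfword: u16) -> Option<i32> {
    // The top five bits identify the instruction.
    if halfword >> 11 != 0b11100 {
        return None;
    }
    let imm11 = (halfword & 0x07FF) as i32;
    // Append a zero bit (offsets are in halfwords), giving 12 bits.
    let shifted = imm11 << 1;
    // Sign-extend from 12 bits to 32 bits.
    Some((shifted << 20) >> 20)
}

/// Where the branch lands. The PC reads as the address of the
/// current instruction plus four, as described in the talk.
fn branch_target(addr: u32, halfword: u16) -> Option<u32> {
    let offset = decode_branch_offset(halfword)?;
    Some(addr.wrapping_add(4).wrapping_add(offset as u32))
}

fn main() {
    let addr = 0x2000_0100; // hypothetical location in RAM
    println!("offset = {:?}", decode_branch_offset(0xE7FE));
    println!("target = {:#010x?}", branch_target(addr, 0xE7FE));
}
```

The offset of minus four cancels the plus four that the pipeline adds, so the branch lands on itself and the program counter never appears to move.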

[01:29:36]Uh, can I do this without entering present mode? Yeah. All right. Awesome.

[01:29:39]So um all right. So, we have programmed I have programmed that that those addresses if you look at the the the the hex's and all that. Uh, will this work?

[01:29:48]Oh, there's there's a funny cat. I should show the cat, right? So, I converted that to hex and that's what I flashed on the chip in those addresses.

[01:29:54]Uh, as you see here and let's see what our registers are saying. R1, oh 15, R0.

[01:30:02]Um, and what was it? Program counter.

[01:30:03]Uh, I mix up IP and PC because on some chips it's called the instruction pointer.

[01:30:07]Um, so let's run a little bit. Um, and if we print PC again, we'll see nothing has changed. Why? Well, because the last instruction was a B dot. And what's really funny is if you read this instruction pointer, you'll see it says 0x10 whatever A. The actual absolute address is that offset of 2001. The A is where the B dot was, right? Awesome. So to repeat this, computers can't think. There is no homunculus in there. There is no person in there reading your Rust code or reading your binary. Well, with modern chips that's available, but there is no person in there doing that.

[01:30:44]No, they're just switches. And your program, your bits zeros and ones have configured switches in particular ways.

[01:30:50]Let's get into some Rust. So, like, isn't this a Rust talk? What am I talking about here? That was that code. This is that code, the one you looked at, in Rust.
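[Editor's note] The slide's Rust source is not captured in the transcript. As a stand-in, here is a minimal host-side sketch of the same arithmetic sum from one through five, with variables named after the registers the speaker inspects in GDB. Which register holds the counter is an assumption; only the final value of 15 in R1 comes from the talk.

```rust
/// Sum the integers from 1 through `n`, mirroring the hand-compiled
/// loop: one register counts down, the other accumulates.
fn sum_one_through(n: u32) -> u32 {
    let mut r1 = 0; // accumulator (the talk reads 15 back from R1)
    let mut r0 = n; // loop counter (assumed)
    while r0 != 0 {
        r1 += r0;
        r0 -= 1;
    }
    r1
}

fn main() {
    println!("R1 = {}", sum_one_through(5));
}
```

On the board this logic runs without an operating system, so the firmware version additionally needs a linker script and an entry point, as the speaker goes on to explain.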

[01:30:57]Let's compile it. Um, it should be relatively easy and I am going I'm mindful of my time. So, um, I'm gonna just like run this. Oh my god.

[01:31:05]Unfortunately, my colors are broken. I don't like a bat very much. You're going to have to trust me that this code is that code. Um, these are the compiler flags. What's really quirky about compilation for firmware, and I'll go to this in a second, is that you need to actually give it the linker script.

[01:31:20]Normally, when you compile Rust, like Rust doesn't know where to place your your your code. Like you compiled it to some hex, but then like then what? Where do you want me to place that? We have the whole memory map and it could place it really anywhere. And normally the kernel is responsible for handling this.

[01:31:33]If you're interested in that, talk to me about that. The presentation used to have that and I cut it. Um, but we have to tell the linker, here's where I want you to place stuff. So if you see, there's a reset function somewhere in there. That's why that is not mangled, because the linker needs to know what to place there. The elf icon is because this spits out an ELF file, the Executable and Linkable Format. Anyways. Okay. So we ran that and this is the output. Well, right. Uh, R1.

[01:32:04]Nothing special, right? Kind of boring.

[01:32:05]Um, let's do something more fun. Um, the hello world program of embedded is called Blinky. And Blinky will get us into memory-mapped I/O real fast. So, memory-mapped I/O. Remember, we do I/O through addresses. That's what we're going to do here. And if we look here, you'll find this address. That's the particular peripheral that my LED is hooked up to.

[01:32:25]We're going to need to set this register, called MODER. That picks whether we want it to write or read. And we're going to have to set the state of that pin. All right, so that's what we're going to do. Uh, what that code looks like is this. Lots of unsafes. I know, right?

[01:32:39]>> Um here's the here's the the code of of configuring the direction output. Here's the code that says hey. Here's the code that says reset. This line doesn't matter. This is clock configuration.
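[Editor's note] The speaker's source is not in the transcript, so this is a hedged sketch of what such register-level code tends to look like. The register offsets (MODER at 0x00, BSRR at 0x18, two mode bits per pin with 01 meaning output) follow the STM32F7 GPIO layout; pin 7 is a hypothetical choice. To keep the example runnable on a desktop, it writes to an ordinary array standing in for the GPIO port instead of a real peripheral address, and it skips the clock configuration the speaker mentions.

```rust
use std::ptr::{read_volatile, write_volatile};

// Register positions within a GPIO port, as u32 word indices.
const MODER_INDEX: usize = 0x00 / 4; // mode register
const BSRR_INDEX: usize = 0x18 / 4; // bit set/reset register

/// Configure `pin` as a general-purpose output.
/// Safety: `port` must point to at least 7 valid, writable u32 words.
unsafe fn configure_output(port: *mut u32, pin: u32) {
    unsafe {
        let moder = port.add(MODER_INDEX);
        let mut value = read_volatile(moder);
        value &= !(0b11 << (pin * 2)); // clear the two mode bits
        value |= 0b01 << (pin * 2); // 01 = output
        write_volatile(moder, value);
    }
}

/// Drive the pin high: the low 16 bits of BSRR set pins.
unsafe fn set_pin(port: *mut u32, pin: u32) {
    unsafe { write_volatile(port.add(BSRR_INDEX), 1 << pin) }
}

/// Drive the pin low: the high 16 bits of BSRR reset pins.
unsafe fn reset_pin(port: *mut u32, pin: u32) {
    unsafe { write_volatile(port.add(BSRR_INDEX), 1 << (pin + 16)) }
}

/// Run the sequence against a fake port in ordinary memory and
/// report (MODER, BSRR after set, BSRR after reset). On real
/// hardware BSRR is write-only; reading it back only works here.
fn blink_on_fake_port(pin: u32) -> (u32, u32, u32) {
    let mut fake = [0u32; 10];
    let port = fake.as_mut_ptr();
    unsafe {
        configure_output(port, pin);
        set_pin(port, pin);
        let after_set = read_volatile(port.add(BSRR_INDEX));
        reset_pin(port, pin);
        let after_reset = read_volatile(port.add(BSRR_INDEX));
        (read_volatile(port.add(MODER_INDEX)), after_set, after_reset)
    }
}

fn main() {
    let (moder, set, reset) = blink_on_fake_port(7);
    println!("MODER={:#x} set={:#x} reset={:#x}", moder, set, reset);
}
```

On the chip, `port` would instead be the fixed peripheral address from the reference manual, which is why every access is unsafe and volatile: the compiler cannot know that those writes turn a light on.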

[01:32:52]Big machines have lots of clocks. Don't worry about that. I'm trying to inspire you, not scare you. Uh, and like 90% of the day job is actually configuring clocks. So anyways, I've compiled that code and I've run that code, and let's stop GDB. Uh, sorry, I changed my keyboard to Chinese. All right. Oh, sorry. I should not stop the code.

[01:33:16]See this yellow LED up top? I don't know if you can see it. Uh there's a yellow LED up there. It's blinking periodically. not as well as it should because the clock is misconfigured. I did that intentionally. All right. So, this is still C in sheep's clothing in Rust clothing. So, what we're going to do is we're going to use some libraries.

[01:33:31]There's a cortex-m library. There's that STM32F7 library. Uh, we're going to use them. That's what the code is going to look like. It's doing exactly what I told you.

[01:33:42]Set it high, problem solved. Uh, I'm running short on time, but there's one more minute left, so I'll be fast.

[01:33:50]So, let's do that. And um let's run that. And the nice thing about this is now we're in the cargo ecosystem and I'm no longer making my own linker scripts because the community has built so much.

[01:33:59]It is [clears throat] like it is insane what the Rust community has built for embedded. Um so um here we go.

[01:34:06]You will see that other people have configured the clock better than I have.

[01:34:10]The yellow LED now actually blinks periodically, with half a period on, half a period off. Um, cool. All right. Um, can we add a scheduler? We have a minute for this. Normally in embedded you use something called FreeRTOS. It's like the default. Uh, there's a reason there's this footer joke in here as well. I'll get to that in a second. Just bear with me. But it's a priority-based scheduler. The scheduler is surprisingly simple. It's mostly written in software. FreeRTOS is 100% written in software. There's an interrupt. The interrupt fires periodically.

[01:34:40]It stops your program, evaluates what needs to get rejiggled, and rejiggles your threads. Um, in Rust there's RTIC, and it's a billion times better. It is written in hardware. It is really awesome. And the guys who did this are like geniuses. It's something else. Um, my slides used to look like this. And I don't have an example for a reason. My slides used to look like this. Now they look like this.
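[Editor's note] The decision a software priority-based scheduler makes on each periodic interrupt can be sketched very simply: among the tasks that are ready, run the one with the highest priority. This is a toy model of that idea only, not FreeRTOS's or RTIC's actual implementation, and the task names and priorities are made up for this example.

```rust
/// A task as the scheduler sees it.
struct Task {
    name: &'static str,
    priority: u8,
    ready: bool,
}

/// Called on every tick: pick the highest-priority ready task.
fn pick_next(tasks: &[Task]) -> Option<&'static str> {
    tasks
        .iter()
        .filter(|t| t.ready)
        .max_by_key(|t| t.priority)
        .map(|t| t.name)
}

/// A hypothetical task set for the demo.
fn demo_tasks() -> Vec<Task> {
    vec![
        Task { name: "idle", priority: 0, ready: true },
        Task { name: "blink", priority: 1, ready: true },
        Task { name: "motor_control", priority: 3, ready: true },
        // Highest priority, but blocked, so it is skipped.
        Task { name: "logging", priority: 5, ready: false },
    ]
}

fn main() {
    println!("next task: {:?}", pick_next(&demo_tasks()));
}
```

A real RTOS adds context switching, blocking primitives and time slicing on top of this selection rule; approaches that lean on the interrupt controller's own priority levels let the hardware perform the same selection.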

[01:35:01]And that's because of you, all of you, and the whole Rust community. This is like who I am without the people around me. I kind of like motorcycles, like innately. This is what people in my life have brought, and this is what you guys and everybody in the Rust community has brought. Rust on that thing. I converted from C++, which I actually still prefer as a language, to Rust because of Tokio and tracing. Tracing was the thing where I said, this is the thing, the community is so amazing. Good luck setting up OpenTelemetry in C++. It took me like a month. Uh, again, my slides used to look like that, now they look like this. Thank you. And what better way to celebrate than by doing a Club Penguin party? So what I'm going to do, and this is the meme.

[01:35:42]Uh, I'm going to I'm going to show you I'm going to make you join that IP address in a second. And hold on, hold on. I'll bring it back. I'm bring Just give me a sec. Uh, this is the this is the part that is like the least reliable one. So, there'll be a little bit of work here. All right. Make flash.

[01:35:57]That'll help us all. Oh, I think it's stuck because of Oh, nope. It's it's fine. All righty. So, you should be able to very well Oh, somebody connected. Um, okay. Okay. Give me a second. This thing is flashing. Okay. There should be a penguin that you can you can connect. If you were to connect to this and if you try to join this, you should see your penguin join this embedded device if all the relays and proxies end up working.

[01:36:20]Uh, you're likely going to break it. The code is not good. It supports like five penguins. So, good luck. Also, I need the relay to work as well. So, hopefully this all works. But this is the first penguin, and I'm not going to dwell on this for too long, but do feel free to come up. Um, yeah, we went from switches to Rust and that's the whole point. Remember, all of these are just machines, and LLMs and all these things are machines, and they start at the switch, and we built up to Rust. Um, that's me. I have a startup right now. I'm hoping it'll take off. I built CI/CD. If you care about CI/CD, talk to me. Um, I love Elixir and the whole thing is written in Elixir, and also Rust of course. Um, but yeah, I don't see any penguins, so maybe my IP is messed up, but, um, thank you. [applause] >> Okay. Um, so I have a question that's kind of a dumb question, but I'm curious what your take on it is.

[01:37:19]Um, so you're showing an example with the blinking lights, right? And you were like, okay, I don't want to write this myself. So you imported a library that does that, right?

[01:37:25]>> Yeah.

[01:37:26]>> Um, >> well, no, it doesn't. What it is, is a hardware abstraction layer. So the library acts as a safe Rust wrapper around all of those low-level things that I've shown you, where we're literally writing to an address.

[01:37:36]>> Yeah. My question is, that would necessarily blow up the binary size, right?

[01:37:40]Is that something? No, >> no, it's one instruction like writing.

[01:37:43]It's it's literally a move to an address and another address.

[01:37:46]>> Okay. So, so calling the library doesn't blow up the >> No, no. If you compile with -Os, it compiles for size, and it's incredibly common to do this in embedded. In embedded we really prioritize that. So there's no heap and there's none of that. There's a bunch of reasons why you might not want to have a heap or something like that. But that's why you saw no_std everywhere. Technically I could implement half of the standard library and get all that jazz, but size is a really big concern.

[01:38:11]>> Okay. I that was the other question. My second I have I have a dumber question which is Did you get a haircut?

[01:38:16]>> I did get a haircut.

[01:38:17]>> Yeah. [clears throat] >> I don't see any penguins. So I'm going to try to like finagle the network with my right hand while I'm answering questions with my left hand.

[01:38:25]Uh the this one uh uh the password is where is the server? Where space is space the space server.

[01:38:42]Okay. I also see that uh I'm just going to restart my proxy. I have this like proxy thingamajig. All right. Hopefully that works.

[01:38:51]>> Yeah, I do have another question. Is it working? Yeah. Uh, I'm also doing some no_main and freestanding Rust, and I figured Rust core is kind of polluted with Unicode tables and all of that, and that's probably also a big concern for the embedded community. Do you know if there is anything for that, like a mini core or something that the community has that I could use? >> For mini core?

[01:39:18]>> Yeah, like just uh core with lang items.

[01:39:23]No static tables.

[01:39:26]>> Oh. Um I think um I I don't actually know. Um the I know in C++ there's ETL.

[01:39:31]I can talk to you about C++ for eons.

[01:39:33]But, um, are you looking for something like that, where it's a re-implementation of the standard library but optimized for embedded? >> Pretty much, like just what you need from core, but get rid of everything else that's just adding static data.

[01:39:49]>> I don't know anything off the top of my head. I know it's going to be surprisingly difficult, because in C++ we have ETL because we don't have cargo. Um, but you could just kind of pick what you need from cargo packages that don't depend on std. Um, and I think ultimately the big problem with core is that most of it is system specific, like networking, IO, files, um, heap stuff that's generally unpopular in embedded. Yeah. >> Those things aren't in core.

[01:40:19]>> You can also shout and I'll repeat your question.

[01:40:24]Never mind. There'll be no shouting.

[01:40:26]>> Anyways, there's one penguin here for now.

[01:40:28]>> So, I also did embedded. Uh, however, I come from C++ and now recently Zig land.

[01:40:37]So, I see you are a former Bun employee. Uh, what's the trade-offs between Zig and embedded Rust given that you're not really doing memory? >> Do you want me to be, like, honest, or do you want me to tell you?

[01:40:51]>> I'm not trying to start a flame war.

[01:40:52]>> The question Tesla would say. Um, no.

[01:40:54]The reality is, uh, Zig is much easier to sell to C developers. That's the only advantage I see. Realistically, I actually think that my preferred language is C++ because I love templates. But, um, no, jokes aside, I mean, not jokes aside, but Rust has so many safety guarantees that make it much easier to work with Rust.

[01:41:13]Uh, it is much easier to build primitives in Rust that you can then give to, you know, people on the team who are maybe not so super comfortable with dealing with race conditions, because you do get race conditions, believe it or not, in embedded, especially when you have multi-core systems. Um, and generally Rust gives you the facilities to build this layer of danger that is hidden and abstracted away, and then on top of that you can build pretty much anything, and you know it's not going to crash your application. That's the deal. Um, I think Zig has this advantage of being very simple and a relatively small language compared to Rust and especially C++. Um, and you spend a lot less time learning the language and a lot more time being productive. In my opinion, there's a crossover. I'm sorry for all the popping noises. There's a crossover point, uh, where you get value from one to the other. Also, sorry, um, I'm gonna... Does that answer your question? Shan, do you have an ESP32?
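The "layer of danger that is hidden and abstracted away" can be sketched as a small lock whose only `unsafe` lines sit behind a safe method. This is an illustrative assumption, not the speaker's code: `SpinLock`, `with`, and `count_in_parallel` are hypothetical names, and the demo uses host threads in place of multiple cores. A real firmware project would more likely reach for an existing, audited primitive.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical spin lock: callers only ever see the safe `with` method.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: access to `value` is serialized by `locked`, so sharing the lock
// between threads is sound whenever `T` itself may move between threads.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        Self { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Runs `f` with exclusive access to the protected value.
    /// Sketch limitations: not reentrant, and a panic inside `f` leaves the
    /// lock held (other callers would spin forever, but no data race occurs).
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // SAFETY: the flag is held, so no other caller can be inside `with`.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments one shared counter from several threads using only safe code.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(SpinLock::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.with(|n| *n += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.with(|n| *n)
}

fn main() {
    assert_eq!(count_in_parallel(4, 1000), 4000);
}
```

Everything in `count_in_parallel` is safe Rust; a teammate using the lock cannot reach the protected value without going through `with`, which is the kind of division of labor the answer describes.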

[01:42:09]>> Do you want one? I'm giving all the people who ask questions ESP32s.

[01:42:14]All right, I'll save one for you. Sir, what was your name? Arim.

[01:42:19]>> Param. Do you have an ESP32?

[01:42:22]>> Huh?

[01:42:25]>> Sorry.

[01:42:26]>> A malware? No, it's brand new. It's from Amazon, so like I don't know. Uh, you have an ESP32? You're embedded.

[01:42:34]>> We're trying to get more people embedded.

[01:42:38]>> I also have a very serious question. Can I also have one?

[01:42:41]>> Yes. Uh, are you is your background embedded? If the answer is yes, then the answer is no.

[01:42:46]>> No, my background is not embedded at all. No, but I play around with it every now and then.

[01:42:50]>> Okay, there you go. You're getting an ESP32.

[01:42:51]They're all reserved. That's all I have.

[01:42:52]I'm sorry. I only had three.

[01:42:58]>> Hey, I have a, I don't know, tangential question, but I'm curious about, uh, you worked at Tesla, but I don't think they use Rust much, right?

[01:43:04]They do, like >> No, you'll be surprised. So Tesla, it's complicated. Tesla, believe it or not, my team used C++ for embedded and we had a bunch of vtables, uh, which was very bad for performance. But that's the problem with C++: C++ is really hard to teach, and C++ requires a huge amount of time investment to be able to use the language effectively. Uh, Rust gives you sane defaults. I think one of the teams at Tesla, um, that we were adjacent to, because I spent some time in validation land as well, they built their whole, uh, validation cluster, which is effectively this very large cluster of machines that runs tests, and these tests are really involved because they're simulation-in-the-loop tests. So they literally simulate the real vehicle software, firmware, running in like a game loop, if you will. Um, the whole thing was written in Rust. Um, it's called, uh, it's called happy box. I don't know if they're going to start chasing me for saying the name, whatever. Um, but the whole thing was written in Rust, and they exposed Python bindings for a lot of the validation engineers through PyO3, and that turned out to be a great success.

[01:44:04]Our SIL, on the other hand, was a little bit of a mess, because our team specifically had a C++ SIL, because we were C++ engineers and not Rust engineers, even though the rest of the company was almost 100% powered by this Rust SIL. Um, our SIL struggled from lack of development, because there's not that many people working on it, and from poor C++, a lack of writing correct C++, which is much more difficult than Rust because Rust will stop you from doing idiotic stuff. Um, C++ will not, which is why I love it. Um, so we had a lot of bugs, and the SIL was consequently slow, bad, so many vtable jumps. Again, the default for people for polymorphism is not, um, you know, the curiously recurring template pattern. It is vtable polymorphism, dynamic polymorphism. So you ended up with a lot of performance hits because the default of what people knew was that, versus the default in Rust is significantly better, where it's compile-time polymorphism. I'm out of ESP32s, by the way. If you're raising your hand trying to get one, I am all out.
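The contrast drawn here between vtable polymorphism and compile-time polymorphism maps onto Rust generics versus `dyn Trait`. The sketch below is an illustration under assumed names (`Sensor`, `Thermometer`, `Tachometer`, `sum_static`, and `sum_dynamic` are all hypothetical), not code from the systems discussed.

```rust
/// Hypothetical interface used for both dispatch styles.
trait Sensor {
    fn read(&self) -> u32;
}

struct Thermometer;
struct Tachometer;

impl Sensor for Thermometer {
    fn read(&self) -> u32 {
        21
    }
}

impl Sensor for Tachometer {
    fn read(&self) -> u32 {
        3000
    }
}

// Static dispatch: the compiler generates one copy of this function per
// concrete `S`, so each `read` call is direct and can be inlined.
fn sum_static<S: Sensor>(sensors: &[S]) -> u32 {
    sensors.iter().map(|s| s.read()).sum()
}

// Dynamic dispatch: each `read` call goes through the vtable stored in the
// trait object, which allows mixed types at the cost of an indirect call.
fn sum_dynamic(sensors: &[Box<dyn Sensor>]) -> u32 {
    sensors.iter().map(|s| s.read()).sum()
}

fn main() {
    let same_kind = [Thermometer, Thermometer];
    assert_eq!(sum_static(&same_kind), 42);

    let mixed: Vec<Box<dyn Sensor>> = vec![Box::new(Thermometer), Box::new(Tachometer)];
    assert_eq!(sum_dynamic(&mixed), 3021);
}
```

In Rust the generic form is what most people write first and `dyn` has to be spelled out, which is roughly the "better default" the answer points to; C++ can get the same static dispatch through templates or CRTP, but a `virtual` method is often the more familiar tool.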

[01:45:08][laughter] >> You've mentioned a few times now that you prefer C++ over Rust. I'm intrigued about >> what are the things that Rust is missing in your eyes?

[01:45:21]>> Uh, controversial opinions: templates and exceptions.

[01:45:24]No, jokes aside, what I love about Rust is the community, and there is nothing in C++ land that is going to be a sufficiently large swing for me to move because of that. The community is so good in Rust and, um, everybody's so nice. Um, obviously there's like real people everywhere, but on average, um, the Rust community loves the language and has built so much with the language, versus in C++ where sometimes we get stuck on things that are maybe irrelevant. And I think overall, the things I really miss the most in Rust, I'm still going to stay on Rust because of the reasons that I've mentioned. Plus it's just easier to write and maintain projects in Rust. But, um, the core things I really miss are templates and exceptions. I'm actually not joking. I think exceptions are pretty awesome. I really wish exceptions in C++ merged Herb Sutter's 2014 proposal for static exceptions.

[01:46:17]That would have made them even better.

[01:46:18]But, um, yeah, I really miss those features, templates especially.

[01:46:23]>> Uh, Marco, I would recommend you maybe consider getting a fourth ESP32 so you could give it to Esteban.

[01:46:31][laughter] >> I unfortunately do not have one. This is what I found in my home, and it was very much a last-minute thing, but, you know, I'm grateful for the community. I was like, "What can I give back?"

[01:46:41]>> All right. Thank you very much, Marco.

[01:46:43]>> Thank you.

[01:46:51]>> Sadly, no club penguin.

[01:46:52]>> OH, WE HAD A PENGUIN. YO, WE GOT A PENGUIN. WAIT, who is the red penguin? It's gone because my computer died. Sorry. My computer was a proxy.

[01:47:02]>> Okay, I'll just pick this up later.

[01:47:07]>> Okay.

[01:47:36]All right.

[01:47:37][clears throat] Thank you to all of our amazing speakers. That was really really great.

[01:47:42]Uh, I love the energy, and, uh, yeah, Marco did a great job of pumping everyone up at the end of this long evening.

[01:47:52]Uh so let's give a round of applause for all of our organizers, including Michaela, Ray, Jose, Esther, Mary, Luna, who helped make this event possible. And uh once again, thank you to our generous sponsors Zed and Convex.

[01:48:10][applause] So, it's uh now time to mingle and uh thank you very much to everyone for joining us tonight in
