This talk explores the importance of accessibility tools in technology communities, using the speaker's personal experience of temporary blindness from eye surgery as a case study. The speaker demonstrates that accessibility tools are not universal solutions and can sometimes help one group while hindering another. Through practical experiences with screen readers like Orca, Fenrir, and yasr on OpenBSD, the talk highlights the challenges of making software accessible to users with visual impairments. The speaker emphasizes that accessibility tools should be considered first-class consumers in software design, and that mandating specific tools is as problematic as mandating specific desktop environments. The talk also covers practical accessibility solutions like edbrowse for web browsing and discusses the need for better tooling to skim text, handle Unicode properly, and manage output buffering for screen readers.
Deep Dive
BSDCan Friday 2026-06-19: 1120
I don't know how I'm going to get my slides to DJ3 Venda.
>> We said email the slides if this is not working.
>> Yeah.
>> Here, let me
>> Okay, so you're back to straight plugged in.
>> Okay.
>> I believe you start.
>> Okay, but it's right now writing to this screen, so we'll see.
>> All right, I'll let you be.
>> You're just going straight from here to here.
>> Okay, that is a tragedy. How's my audio?
>> Good. Unless you have your backup computer you want to try.
>> I just sent my friend to go try the backup computer.
>> I do not own the backup computer.
>> No, no worries.
>> Okay.
>> I could find a way to go back and No, it's not just detecting it. Oh, no.
Now I can ah I can be here.
>> Okay. Uh, sudo ifconfig w0 nwid BSDCan wpakey. I can hear me throughout the entire room.
Do I see network? I see.
>> Well, Well, this >> perfect >> perfect.
>> Do I have a terminal? Command space terminal.
Uh, I don't have a key on here. Oh, how am I going to get to my mail? I mean, I know how to get to my mail, but I don't want to get to my mail.
>> Uh, >> oh, I know what I can do.
>> Yeah.
>> Does anyone have a USBC?
>> I do.
>> Oh, you do?
>> I almost brought one, but I decided to bring the dongle for this computer.
>> Let's go.
>> Okay.
>> HDMI doesn't work.
>> Not quite.
>> I'm not sure what he's running on here, but we got a Mac as a backup computer, but now we tech
>> Sir, how do I present this?
>> Here, I'll help you.
>> Thank you. I've never actually made it into a presentation.
>> Enter full screen.
>> And there we go.
>> Okay. Do we have output?
>> The projector is indeed able to do that.
What? Oh. Oh, shoot. That was the wrong computer. Whoops.
>> Do not shut down your web server remotely while at BSDCan. Just a piece of advice for everybody, especially when you're presenting your slides off your web server.
>> Oh, shoot.
>> That looks good. Yeah.
>> Will it work?
>> We're not getting the
>> Yeah, it should come back now.
Yeah. So far it has not.
>> This is a new laptop. Yeah.
>> USBC.
>> Yeah.
>> We have one of those. We should have one of those cables right here.
>> That's HDMI.
>> So it works going directly from there to here.
>> I see.
Okay.
>> Oh, here. I'll give that back to
>> Well, let's not give it back quite yet.
>> I have a feeling >> something.
>> I have a feeling this might be having a fun day.
>> Yeah, I have another one. Let's swap it up.
>> Okay, >> two seconds.
>> Okay, get a drink. I'll be right back.
Just a shot of vodka.
Where is the water fountain?
Shoot.
Whoa.
>> Whoa.
>> Thank you.
>> I got you for the win.
>> Thanks.
>> Love you.
>> Okay.
>> Am I late? I think I'm late.
>> Have a good time.
>> Thanks.
So, ah, don't start advancing by Oh, no.
>> mode, man.
>> That's what I get for putting things in slideshow mode.
Okay. Hello. Am I good to go? Okay. Hello, I'm Sean Howard. I am talking about OpenBSD and blindness. If you do not want to be listening to OpenBSD and blindness, you're in the wrong room, and unfortunately you're now late for a talk. Again, I'm Sean Howard.
I am not a software developer. I mostly do security work in my daily life. I started as an ISP grunt. I did a lot of sysadmin work. I don't know anything about the topic I am talking about. I don't write a lot of software. I don't do any accessibility work. But I do care a lot about being able to see. So when I was going to go blind, I decided, as something to do while I was blind, to write a BSDCan talk. I don't know if that's masochism or wisdom. It's one of the two.
However, at some point in your life, you will become disabled. That might be because you smacked your head into something. It might be because you broke your leg. It might be because you are forced to carry something very heavy. Accessibility tools will always help you in some way. At some point in your life, you'll go, "Wow, I'm so glad this elevator was here. I'm super glad the screen reader is here." While we were setting up the presentation, we had a whole bunch of issues, as you all saw, but I actually used my screen reader a whole bunch to confirm the computer was actually running. I could not see the screen. I wasn't blind, but I was in some way disabled.
It's going to happen to somebody in your community. It's going to happen to somebody in this community. At some point, people that you respect or people you really want to do things are going to need accessibility tools, and so I do think it is worth talking about. It doesn't really matter why we need accessibility tools, but we need to talk about them. Accessibility tools are also not universal, and an accessibility tool that helps one group might hurt another group. A good example is that a lot of people have a lot of opinions about captioning images on social media right now.
And that can be complicated, because people with a lot of anxiety feel a lot of anxiety captioning images. I feel a lot of anxiety captioning an image. I actually feel a little more confident after going blind temporarily. But those kinds of shared accessibility conversations can be very complicated to have. I am hoping that I can try to convince other people to talk about these things. I also think that OpenBSD itself needs to have these conversations. We are an aging community, unfortunately. I think there is lots of new blood coming in, but I think there's also a lot of people who are getting older. A lot of people that I deeply respect are getting older. And I want to make sure that we can help people who are starting to find more limitations, and help have those conversations. I don't know what that's going to look like. I don't think that the stuff I'm doing is necessarily good. But that's kind of my accessibility preamble.
So I wanted to define the terms OpenBSD, temporary, and blindness. First off, I'm defining OpenBSD. I hope you know what OpenBSD is. If you don't, please raise your hand. But I'm trying to run OpenBSD as an OpenBSD user. I have used OpenBSD as my primary desktop for 16 years now. Most of the laptops or desktops I run have run OpenBSD as their operating system on the front end. But I'm not talking about your mother or your friend: "Oh, I have a friend who's blind. Let's get them on OpenBSD. Oh, I have a parent with failing vision who's not technically savvy. Let's get them on OpenBSD." This is not that talk. I'm going to talk a lot about, "Hey, and then I tried to compile this tool and it failed on OpenBSD and I don't know why. Somebody figure that out, please." It's not a talk about the generic form.
Temporary. I was only blind for two days. I had eye surgery on March 2nd and I was able to see by March 5th or 6th.
So, the surgery happened. I wasn't mostly blind on that day. I was blind for two days, and then by the sixth I was fully able to see. My vision was returning.
My eye surgery caused me to not be able to open my eyes. To explain my eye surgery: I had a disease called keratoconus. If your prescription changes every time you see your eye doctor, get yourself checked for keratoconus.
Your cornea gets thin and your eye starts to bulge a little. Your lens can crack if you have keratoconus. It's really bad. But back in 2016, they came up with a surgery where they take the front of your cornea off, and then they put B2 drops in your eyeballs and they harden them with UV light. So they give you an Ativan, just a mild anti-anxiety medication, throw you down on a table, put some numbing drops in your eyes, and scrape the front of your eyes off. Sorry. It's terrifying. I apologize. I should have warned people I was going to say this. And then you lie on a table, and for 40 minutes a doctor drips drops in your eyes every two minutes. And then for the last 10 minutes, they shine a high-powered UV light in your eyes. They put the Clockwork Orange things in so you can't close your eyes. It is not pleasant, and obviously your eyes are burned and cut. You cannot open them without excruciating pain for two days.
I had to put eye drops in and I would cry every time I did. So it was not fun. But it wasn't like I had low vision. A lot of people you'll know who are blind might be able to see blurs, or might be able to see shapes or colors. But I could not see. I could not open my eyes. I was stuck like this, and I was in excruciating pain if I went any other way. So it was just very different from a lot of blindnesses you'll see. So obviously I had time to prepare for being blind,
which was very different from a lot of people who are going blind. When other people go blind they might have zero warning. They might have minutes of warning. Some people might have years of warning that they are going blind. And I hope if you ever go blind that is how you get it.
But I found out when the surgery was in November, and I had the surgery in March. So I had about 5 months to prepare, which is just enough time to waste. It is not enough time to actually prep. It is enough time to go, "Oh, I have plenty of time," for way too long. That puts me in a very different position than a lot of people. So I started immediately to think about these things. I figured if I couldn't get a screen reader working, I probably couldn't use a system at all. If I didn't have a web browser, it probably wouldn't work. I really wanted something to work with the Fediverse, which would probably be the browser. I'm on the Fediverse a lot. I would really want a mail client. I'd like to be able to send and receive mail. I wanted something to do, which was either going to be an audiobook or a text adventure, and I don't like audiobooks. I like podcasts but not audiobooks. So I really wanted to do some sort of interactive fiction. I wanted an XMPP client. If you ever want to bug Steven right there, he'll turn your phone into XMPP if you want, for money. But my SMSs, my calls, my life is over XMPP. So I wanted to make sure it worked. I wanted Mumble. I have a role-playing game group that meets up and, hey, why not try? So I wanted to see if I could get Mumble working. And then I wanted something to play media.
For me, media means podcasts and music.
Audiobooks and TV shows are obviously a thing. I did not even try to figure out descriptive video or whatever to watch a TV show, or just watch a TV show that I like without knowing what it looks like. But we will talk about what happened. So basically every choice I made was based on my screen reader choice. You've probably heard of Orca, right?
You might have heard of Fenrir, which is pretty popular in the Linux space. It's a Python-based screen reader. And I've heard some rumblings about a thing called Yet Another Screen Reader: you knew there was a screen reader called yasr, but you probably didn't know it was "Yasser". So I first tried Fenrir. I don't use GNOME, and so I wanted to try something that was not GNOME-specific. Hey, it's in pip, and it says it supports BSD at the top of its README file, and I was like, "Oh, great."
I ran it for two hours and then the pip process crashed. I'm pretty sure there was a wheel somewhere. So I tried pipx, and I let pipx run all night, and it installed, and then the screen reader crashed.
I started to try to pull apart the wheels and figure out everything. I managed to break a lot of my Python. I paved the entire OS and I started again.
I could not get Fenrir working on OpenBSD. I bet if someone goes out, figures out what wheel it is, and compiles that wheel for OpenBSD, it will work fine. I did not do that. I do not even know where to start. I am not a Python guy by any stretch of the imagination. So I tried Orca. Hey, Orca's in packages.
So I just threw Orca on the computer, and it started reading every keystroke I made, which was nice. Reading my keystrokes was useful, but it did not help me.
I tried to get it to read windows. It did not work at all. I could never get it to read text drawn in a window. It just read my keystrokes. I tried to get it to run in a TTY. It would refuse to even run in a TTY. It was like, "Oh, X isn't here. I can't read keystrokes without X." It's fine. I tried to install GNOME. I was like, "Hey, I will give up. I will install GNOME and it will just work."
Once I installed GNOME, audio never worked on that computer again.
I think it was PipeWire's fault. I have no idea. That computer never made a sound. It would not play music. It would not speak with Orca. It would not do anything.
I don't know. I'm sure somebody here has run GNOME on OpenBSD successfully. I hope so. So it's probably my computer's fault, probably my hardware's fault. I had to pave the OS and start again. That's two for two, which is not a good track record.
So I'm going to talk about yasr.
And I want to draw your attention to Tim Chase. Tim Chase wrote a blog post on this topic, and this literally saved me.
This saved this entire project and this entire talk. In 2019 he wrote a blog post on yasr saying, "I set up a terminal screen reader on OpenBSD, and I installed yasr. I just went to their GitHub and did git clone and make, and it all worked." So I was like, hey, great. So I tried it. Since that blog post was written, they switched to Meson and made a bunch of changes in the 2020 to 2021 time period. I don't know what happened. I don't know anything about Meson. I don't even know what Meson is beyond a build system. I am not a software developer.
I massaged the Meson. I read error messages. I paid attention, and I got a build to work. Every time I started that build, it would segfault.
I have no idea what I'm doing.
Okay, but it's fine. I know when it worked. I know that Tim Chase got it working seven years ago. So I checked out the git commit made before the blog post was written.
And then I typed make and it worked. If I hit Ctrl-A to go back to the beginning of my line, my screen reader would crash. Not great. If I hit Ctrl-R to search my terminal, my screen reader would crash.
Not great. But you know what? It worked.
I had a quick exit button for my screen reader, I think is what I meant to say.
So yasr worked. It would read me characters on my terminals. It would not read me my windows. It would read me my terminals. So everything's got to be TTY. Everything's got to be terminals.
I'm not even going to start X.
It was great.
Problem: when I tried to log in to a session, what's the feedback that you successfully logged in to your session in a TTY?
>> Nothing. A prompt.
>> Nothing. A prompt. So I just put in my .profile: echo login success | flite.
And so I'd start my computer, type my username, type my password, and it would go, "login success." And it worked. And then I just typed screenreader, which is my yasr wrapper that set all my stuff up. And it worked perfectly.
It was probably not the world's best solution, but it worked real well. I actually had to turn it off on the train here. I've been using it ever since. It also meant that every time I would open a terminal emulator, it would go "login success" for me, because it was in my .profile. This was not a great solution, but it worked real well, shockingly.
Okay. Now I need to take a step back. I need to take a breather, and I need to try to send an email and look at a website.
edbrowse. Have any of you heard of edbrowse? Seriously? Stephen's heard of edbrowse from me. edbrowse is so cool.
There is a person who is blind, and they are constantly using this tool. It is perfect for a very specific kind of user. If you are the kind of person who really wants to use the copy of Ed Mastery that they bought from Michael Lucas a couple of years back, and would also love to try a browser, it is that. It is a browser using ed. You type edbrowse and a web page, and it just shows how many lines are on the web page, and you type 1,p and Enter and it'll print the web page, or you can type 1,22p and Enter and it'll print the first 22 lines of your web page, j to get the next page, etc. edbrowse is amazing. It does what you need. Now, you might think that it might not be fully featured enough.
It has a JavaScript engine.
It has a mail client. I think it can do PDF rendering. I don't know how to trigger it. I've never tried, but I've read about it in its documentation. But it only uses text, and it just takes the stuff that you've done, shoves it into a text buffer, and treats it as if it's ed. I swear to you, edbrowse is the best piece of software I have ever used.
It is obviously a passion project by one person who is using it every single day to run their life, and it is amazing, and you 100% should try it. Okay, that's mail and browser out of the way. edbrowse just did both for basically free.
So like I said, I wanted to play role-playing games on Mumble. I love Mumble. So I started looking around for a console Mumble client. I found something called mum-rs, which is awesome. And it says in its header that it supports BSD.
And so I tried to run it, and it said: target Unix, not target OpenBSD, not found. Okay, cool. Cool. Open up the makefile and there's a bunch of ifdefs and everything. Oh, ifdef FreeBSD, ifdef FreeBSD, ifdef FreeBSD.
Okay, I understand. That's fine. I get it.
So, s/freebsd/openbsd/, Enter. I tried to get it to run. It connected to the Mumble server. It didn't build super well. There were a lot of faults. It connected, but the sound differences between OpenBSD and FreeBSD were too much for it. It could never play sound.
I decided to try to think about sound libraries and whether I could fix anything. This was like April 25th or something. I did not have time. The screen reader journey took me to the beginning of April, to be clear.
So I did not have time. I really wish I had, but it never worked out. I just gave up on Mumble.
So I already said my entire life is through XMPP. I love XMPP. It's how I talk to my wife. It's how I talk to my kids. I really wanted this part to work because I really wanted to use XMPP.
I know about Finch and Profanity, but they're both curses clients, and at this point I believe curses is cursed, unfortunately. I'll talk about that later. jj and sj: has anyone heard of jj and sj? Anyone heard of ii? ii is an IRC client. jj is the Jabber version of that IRC client. I got them to work, but I couldn't get them to send stanzas.
I got go-sendxmpp, a Go client, to connect to a group chat, and I could see the user in the group chat, but I could see no stanzas coming from it when I tried to send a message, and I couldn't receive any messages. I was really hoping. I completely ran out of time to get anything to work, but nothing worked. I even bugged this guy a little bit and I couldn't get it to work. I would love for those things to work. I don't know why they're not working on BSD. I will probably put more time into these, but that's okay.
This is, I think, the most important. I was going to be lying in bed in pain a lot of this time, so I just really wanted to listen to music, and I wanted to be able to do so without seeing things. My home media server is Jellyfin. There's a project called jftui. If you've never used jftui, you should use jftui. It is a console interface to Jellyfin. I love it. It lets me just view my libraries. It does have a flaw, which is it pushes everything through MPlayer.
MPlayer loves to display album art, and it makes a window that spawns as the window with focus, containing album art, with no feedback or warning or easy toggling, especially when wrapped in jftui.
So I just got album art written over my screen. So I was like, "Okay, again, I'm never going to use X." So I go to my TTY and I try jftui, and it displays the album art in the framebuffer, which is clever. It's hard to do, but it was not a good idea. Now I can't even navigate. It's eating my keystrokes. It's not letting me use my screen reader. My entire screen reader is broken. I figured out that if I ran tmux inside yasr and ran jftui inside tmux, I could have it eat the entire tmux session, and then I'd just open a new tmux pane and hope. And that worked, actually. The album art just kind of stayed in its tmux pane. It was not good, though. Please don't draw things on my screen that I didn't ask for. But that's just kind of MPlayer. Podcasts.
My sighted flow for podcasts is: I have a VM running the console tool castget, which every night at midnight downloads all of my podcasts. Or sorry, no: every night at midnight it checks for any podcasts in the MPD playlist. If there are podcasts that are not in the MPD playlist, it deletes them. It then downloads all of my podcasts, enumerates the podcasts that are not in the MPD playlist, and adds them to the end of the MPD playlist.
I then play and pause the MPD stream, and my MPD output is an internet radio station. I then listen to the internet radio station, playing and pausing MPD.
So I use Icecast, cvlc, mpc, and castget. That is my sighted flow for podcasts.
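One way to read the nightly job described above is as two set differences against the playlist. The Python sketch below is only an illustration of that reading, not the speaker's actual script: the function name `nightly_sync`, its parameters, and the file names are all hypothetical, and the real setup drives castget and MPD rather than plain Python collections.

```python
def nightly_sync(on_disk: set[str], playlist: list[str], fetched: set[str]):
    """Hypothetical model of the nightly job (an assumed reading of the talk).

    on_disk:  episode files present before the run
    playlist: current playlist, in play order
    fetched:  episode files present after the downloader has run
    """
    queued = set(playlist)
    # Episodes on disk that are no longer in the playlist get deleted.
    to_delete = sorted(on_disk - queued)
    # Freshly fetched episodes not already queued go on the end, in a
    # stable (sorted) order so the result is predictable.
    to_append = sorted(fetched - queued)
    return to_delete, playlist + to_append


to_delete, new_playlist = nightly_sync(
    on_disk={"a.mp3", "b.mp3"},
    playlist=["b.mp3"],
    fetched={"b.mp3", "c.mp3"},
)
print(to_delete)
print(new_playlist)
```

The useful property for a screen reader user is that nothing in this loop needs a person: the only interactive step left is play and pause.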
This actually worked perfectly when I was blind, because there was no human intervention in most things. I had no problems. This is why this worked for me.
Okay, I wanted to play an interactive fiction game. These days the new hotness in interactive fiction is Twine.
It's a nice JavaScript library. You write your script and it makes it into a nice JavaScript object. You run it in your browser. Did you know that when you use Twine and you render that JavaScript, it dumps the entire Twine script to the console and then uses JavaScript display hacks to only show the part you want to see? That's not good. So edbrowse would just dump the entire Twine script onto the screen. It did not help me. I could never figure out how to play Twine in edbrowse. That's okay. Frotz exists.
We all have tried Frotz, right? Frotz will just play old zblorb files from back in the day. They were great.
I got through two games, both of them a little bit more modern than the ancient ones. I'm also an Inform 7 partisan. If you have never written code in Inform 7, I think you owe yourself the time to do it. I last wrote Inform 7 with an 8-year-old. We worked together. We wrote a game together. It was very simple. It took me about 2 hours to learn Inform 7. I really think you should do it. And then you can play it in Frotz. Frotz works great. Do not draw ASCII art in your text adventures, please.
I don't know if you've ever started a text adventure and it goes, "equal sign, equal sign, equal sign, equal sign, star, star, equal sign, equal sign, equal sign, star, star, star, star, equal sign," and you're just like, "I don't know what's happening. What is even the screen? I don't know where I am." There were a lot of early text adventures that would do that, because obviously they were trying to get graphics and stuff, and graphical adventures were kind of a thing. Sierra started with drawing single screens, then went to King's Quest. But yeah, for the most part Frotz was amazing.
Cool. So I've gone through my list of things. I've done everything, and I just want to talk about some more generic things about screen readers and other stuff.
Output buffers are incredibly important when you're managing screen readers.
The way that you output text and the way that you buffer text really matter.
I don't know if you've ever tried to SFTP a file and it goes, "temporary blindness PDF, equal sign, equal sign, 2%. Temporary blindness PDF, equal sign, equal sign, equal sign, 3%." And then it does that every second.
Every time I would want a prompt to work: I called my laptop RAR. My four-year-old at the time named it. So I'd hit Enter and it would go, "RAR dollar," and that was how I knew I was at a prompt that worked, which worked out but also was very weird. pkg_add: I could never get it to stop. I probably could have gotten it to stop reporting things if I was sighted, but I couldn't easily if I was blind. I don't know if you've ever tried to use apropos with a screen reader, but apropos will often give you dozens of man pages completely unbuffered. So I'm sending apropos output to edbrowse and then trying to page through it line by line and listen to the beginning, but the first word is the only word you need, and then it starts reading a description to you that you don't have the ability to process very easily. It is very overwhelming. If I was good with a screen reader, this might not be a problem. Obviously I'm not good with a screen reader. I used it for two days. Also, tmux's bottom bar was really funny, because it would tell me the time, but only the seconds.
Every few seconds it would update the seconds in tmux, and my screen reader would read me the seconds change.
Took me forever to figure out why it would say random numbers.
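To make the buffering point concrete, here is a minimal Python sketch of one possible mitigation: capture a command's output first, then keep only the final state of any line that was redrawn in place. It assumes the progress meter redraws its line with a carriage return, as many terminal progress meters do; the function name `final_lines` and the sample text are invented for illustration, and nothing here is a feature of yasr or any other screen reader.

```python
def final_lines(raw: str) -> list[str]:
    """Return one entry per output line, keeping only its last redraw."""
    spoken = []
    for line in raw.split("\n"):
        # A progress meter rewrites the same line using "\r"; only the
        # last non-empty segment is what remains on screen at the end.
        segments = [s for s in line.split("\r") if s.strip()]
        if segments:
            spoken.append(segments[-1].strip())
    return spoken


# Invented sample: three redraws of one progress line, then a prompt.
sample = "talk.pdf  2%\rtalk.pdf  3%\rtalk.pdf 100%\nsftp> "
print(final_lines(sample))
```

The trade-off is that a filter like this only works on output that has already finished, so it removes the chatter at the cost of live feedback.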
Screen readers and Unicode. Have you ever seen a post on social media that's like, "Don't post in mathematical text.
It breaks screen readers"? And you're like, "Well, why doesn't the screen reader just read the mathematical text as text?" It can, right? The answer is you really need to know what kind of letter you wrote. And I didn't think about that very much before I used a screen reader.
But so much of the time I would have Caps Lock on, and I'd type something and it would be like, "error," and I'd type it again and hit backspace and it would be like, "capital C." I'd be like, "Oh, that's what happened," or something like that. There are so many times where there's semantic value to the area of Unicode you're using. So for the few times where there isn't, a screen reader would be broken. A screen reader should not make that decision for me, because the screen reader cannot make that decision, if that makes sense. This was really enlightening, because I've always wondered why screen readers don't just go, "Well, these texts are all the same. I'll just treat them as the same thing." But if you're using weird parts of Unicode, you usually want to be. And that is a problem. I have no idea how to even solve this problem.
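The tension described above can be seen with Python's standard `unicodedata` module. This is only an illustration of the trade-off, not how any particular screen reader works: the helper name `describe` is hypothetical, and the sample string is "Hi" written with Mathematical Bold code points.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """Return the Unicode character name for each character in text."""
    return [unicodedata.name(ch, "UNKNOWN") for ch in text]


# "Hi" in Mathematical Bold (U+1D407, U+1D422), as seen in "fancy" posts.
fancy = "\U0001D407\U0001D422"

# Announcing the exact characters keeps the semantics but is very verbose.
print(describe(fancy))

# NFKC compatibility normalization folds them to plain letters, which is
# readable but throws away the fact that they were mathematical symbols.
print(unicodedata.normalize("NFKC", fancy))
```

Neither output is right in every case, which is the point made in the talk: software cannot know whether the author chose that block of Unicode on purpose.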
Okay. This USB dongle. I have a 2012 MacBook Air and it has a crappy USB dongle, and it would not work with my parents' Wi-Fi. I lived with my parents during that time. I did not want a 5-year-old jumping in my face while I had scarred eyeballs. So I lived with my parents, and my kids visited me. So I had to use ifconfig a whole lot to try to configure Wi-Fi, but I could not figure out where WN0 was in edbrowse very well, because I'm piping ifconfig to edbrowse in order to read this. And it was really hard to figure out where the separations between the different things are. I don't know.
I don't know how to fix that.
Indentation is currently our method, and I'd love to have a better understanding. I don't know if that's an accessibility problem in that I don't know how to use screen readers or I don't know how to use edbrowse. I don't know if it's an accessibility problem in that we should have hints somehow. I don't know. I'm just raising it as a thing I noticed. ifconfig, apropos obviously, but just so many tools would dump a lot of text to the screen. ps was awful, even on a very simple machine. And as part of that, I don't know if you've ever heard screen readers work, but they're incredibly quick. They read so quick.
And I've always been like, "I can't understand this. What is going on? Why is it reading so quickly?" And it turns out that the answer is because they have to. If you've ever tried to sit down and you're like, "Okay, I just need to figure out what's in this ifconfig, whatever," and the screen reader starts reading the entire ifconfig output to you line by line, every flag, everything, it takes forever. So I was like, "Why is my screen reader so slow?" within like two hours of using one.
Seriously. So yeah, they've got to be that fast. But you also just can't skim text in any meaningful way. A screen reader can read a little bit slower than I can read, but infinitely slower than I can skim. And so when you're using a screen reader, skimming text is not an option. So having better tooling to skim text matters: obviously less, edbrowse, and other things like that will let you page through text very quickly and engage with things. I could never do it with a screen reader.
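As one possible shape for such skimming tooling, the Python sketch below groups ifconfig-style output into blocks using the indentation convention mentioned above, so a listener could hear just the interface names first and then ask for one block. It is a rough sketch under assumptions: the function name `interface_blocks` is hypothetical, and the interface names, flags, and addresses in the sample are invented for illustration rather than taken from the speaker's machine.

```python
def interface_blocks(output: str) -> dict[str, list[str]]:
    """Group lines into blocks keyed by the name on each unindented line."""
    blocks: dict[str, list[str]] = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new interface, e.g. "lo0: flags=..."
            current = line.split(":", 1)[0]
            blocks[current] = [line]
        elif current is not None:
            # Indented lines belong to the most recent interface.
            blocks[current].append(line.strip())
    return blocks


# Invented sample in the general style of ifconfig output.
sample = (
    "lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 32768\n"
    "\tinet 127.0.0.1 netmask 0xff000000\n"
    "urtwn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500\n"
    "\tstatus: active\n"
    "\tinet 192.0.2.10 netmask 0xffffff00 broadcast 192.0.2.255\n"
)
blocks = interface_blocks(sample)
print(list(blocks))         # the short list to listen to first
print(blocks["urtwn0"][1])  # then one detail line from the chosen block
```

Hearing two short names and then choosing a block is much closer to skimming than listening to every flag on every line.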
I'm not very good with a screen reader.
I was blind for two days. This is a talk about being blind for two days, not a talk about being blind for life. But at the same time, I think it's important to think about those kinds of things. I don't know what the solution looks like, but I think it's something that we should be thinking about when we are designing software. And then in general, the screen reader is my window manager, in my opinion, when I'm running things. Like I said at the top, every UI choice I made was based on my screen reader.
As such, I want to make sure that we understand that if we are mandating screen readers, we're making decisions about screen readers for people. It's the same as saying, "Oh no, you can't use our system without KDE." Why would that work? Or, "You can't use it without dwm," or "You can only use Sway." That would be really annoying. So mandating screen readers is really annoying. Choice in the ecosystem is really important, and thinking of them as first-class consumers is, I think, really important.
And so I'm going to request that everybody try to use edbrowse a couple of times in your life. Just try viewing the web in edbrowse. See how Mastodon completely breaks. Friendica completely breaks in a different way. Twine completely garbages on a site. Your website might break things, but your website might be awesome. Just test, try stuff. And try going blind for a few days. I think it was actually really enlightening. You can use a blindfold. I don't recommend getting your eyes scraped off. But yeah. So, I think I'm at time, right?
>> It's about five, right?
about 10 minutes.
>> I got about 10. Oh, that's more than I wanted. I'm went too fast. But I'm going to answer questions because that's all the slides I have.
I spoke too quickly. I should have I should have talked more about my eyes my eyes getting scraped off.
>> Sorry. Gallows humor. Yeah.
>> In the intervening time, have you tried using screen readers on platforms besides OpenBSD, like VoiceOver on macOS or JAWS or NVDA on Windows? I've used VoiceOver on macOS.
>> You repeat the question.
>> Oh, sorry. Uh, the question is, have I used screen readers on systems that aren't OpenBSD? Um, in the intervening time I have not used basically any screen readers. I've had an incredibly stressful year. Um, but previously I have tried using Orca on Linux. I have tried using VoiceOver on macOS. I think VoiceOver on macOS was my escape hatch. Um, it was the one that I trusted the most. I've heard from a lot of people that it's the easiest one to get onto. I know a lot of screen reader users use macOS. I know there are some good Windows ones and there's a strong Windows screen reader community, but there's also a very strong macOS community, and I've used it and it worked really well. So yeah, and things like being able to click on text and then just have it get read was so nice.
Um >> yeah, go.
>> Um, so you talked a lot about your experiences with output, um, in a temporary form of the situation.
>> Yeah. Could you speak a bit to input things like did you consider things like speech to text or you're still >> typing blind?
>> Did I consider things like speech to text or other input methods? Uh I don't look at my keyboard. My keyboard doesn't even have letters on it. So I just like took my keyboard and typed on my keyboard. I didn't really think about it. Obviously I also spent the entire time in a TTY. So I didn't use a mouse at all. I just used a keyboard. I did not think about that as a problem. I've used speech to text in the past, but not in many years. I don't think I could speak to it meaningfully.
Sorry. Yeah. Go.
>> This is actually a serious question, but does it pronounce as?
>> Is it pronounced as git or jet? I have always pronounced it git. I believe it is a reference to the uh English insult git.
>> How does the screen reader pronounce g?
>> Oh, the screen reader. Sorry. How does the screen reader pronounce get or j.
Oh, GIF.
>> Sorry. How did the screen reader pronounce GIF? I do not know. Um, if you talk to me afterwards, I have a laptop with a functional screen reader and we can go try. Um, it currently does not say "login success" though. So, uh, I had to change that for the train, but we can figure that out in a second.
Um, sorry, I saw another hand. I thought >> Oh, yes.
>> Yeah.
>> How are your eyes now?
>> Um, so I because my corneas are like hardening and regrowing into like the right shape.
Uh, for like a couple of months, my prescription was changing every couple of hours. Um, so I'd be looking at a screen and have to refocus my eyes intentionally. Um, and now, uh, I'm still not allowed to wear glasses because I couldn't get a useful prescription. My prescription meeting to get my glasses is Monday. Uh, I actually can see better than I used to be able to, which is really nice.
That was not I didn't know that was going to be an outcome, but I'm really happy.
Uh, and second question?
>> Second question was: the power budget for most laptops primarily goes to the screen.
>> Yeah.
>> Is there enough utility in using a screen reader to actually do that by choice?
>> Doing it by choice.
I think if you got really good at it, you could do it really well.
>> Oh, sorry. Uh, the power budget in laptops mostly goes to the screen. Uh, would it be reasonable to use a screen reader just for no reason, um, other than power or whatever? Um, probably. I think you could do it. I think it would be very fun to try. I definitely have used edbrowse on purpose before this project.
Um, but uh I Yeah, you should definitely try it. Uh, I don't know if you'll do super well, but I think it would be fun.
Um, I think it would also be really good, because even if you do it for like a week or something, like I say, try going blind for a few days. Uh, I think it'd be awesome.
>> It says GIF.
Yes.
>> I just had an idea with being...
>> Could I make a descriptive-video-over-HDMI device, or an LLM-based describe-the-HDMI-stream device, basically? Um, you probably could. The HDMI people would probably go after you for money.
Uh, unless you run it on Windows or Mac, but the HDMI people would go after you for money is what I think.
>> I think that's a thing that exists. Uh, I think there is, like, descriptive video headgear. Yeah, just to elaborate on that.
>> Yeah.
>> Yeah.
>> Uh comment was that a lot of phones will have look at your like here is a picture. Can you describe it to me just via the camera immediately? And I Yeah, I think that's a real thing. I mean, I've definitely heard of that.
Um I did not try that.
Anything else? Oh, yes.
>> For doing lots of terminal output, did you end up finding a better situation than edbrowse? Like, were you able to use a fuzzy finder to kind of narrow the output so that you didn't have to get walls of text read to you?
>> Did I end up using a fuzzy finder to manage walls of text read out to me? Um, one of the problems was I didn't realize how bad it was going to be before I was blind. Um, so I didn't really prepare for it as a problem, and then I realized how objectively terrible it was. Uh, edbrowse is, like, ed, so I can do, like, greps and things. Um, so on that side of it I could mostly use it where necessary, just using it as a pager.
Um, and obviously I could have used less or I could have used whatever pager you love. Uh, I never ended up using a tool beyond that. I definitely think there is space for that. I just don't know enough of the space because typically, uh, my wife has walked in on me just like catting a log and letting it scroll past my screen for 40 minutes and go, "Oh, hey, there's a minor change in the scroll." And she's just like, "How did you do that?" Um, so that's kind of how I do things already. So I didn't I don't have those in my lexicon, but yeah.
Yes.
>> Was there a key to have the screen reader stop reading so that if you were getting a wall of text you didn't want, you could tell it to buzz off and let you do something else?
>> Was there a key to let the screen reader stop reading? Uh, actually, oh, that's a thing I forgot. Um, so with tmux, when it got new input, it would stop reading the old input and start reading the new input. So I'd be trying to read apropos and it would go "02" and I'd be like, oh no, what happened?
Um so there was that but I could also just like hit enter and just like get it to go ra dollar.
>> Okay.
>> Um so it would end my like session for me. So there is that like it did work that way.
>> Oh, yes.
>> Any challenges when you need to use a website that has a reCAPTCHA or something like that?
>> Oh, reCAPTCHA. Uh, were there any challenges to using reCAPTCHA or other similar tools?
I never got hit by one. This was March, right before the big wave of Cloudflare CAPTCHAs. Uh, so I never ended up getting hit by one. I also mostly stayed on websites I controlled, like my Fediverse account, things like that. Um, but yeah, one of the advantages of edbrowse is it does have a JavaScript engine, so it will usually pass things like Anubis.
Well, thank you all for listening to me talk about this topic.
Double check. Double check. Hello.
>> Double check. Hello.
So hello and welcome to our talk. This is the highly evolved version of our talk last year. So today we will present how we took our chaos education concept and evolved it into a distributed, FreeBSD-native platform or framework. My name is Andreas; that is Benedict.
I think that's the name here in the community. The system is based on Bastille, OpenZFS, distributed Unix education, and of course FreeBSD and jails, and it is based on FreeBSD 15. First, I would like to thank the community for inviting us again after our talk last year. That gives us a sign that it was not so bad.
>> We must have done something right there.
Yeah. Yeah.
>> And now I will hand over to Benedict.
>> Yeah. So for those people who weren't here last year, or those who were and don't remember everything anymore, I just want to recap a little bit what we did, not too much, just to get everyone on the same page. So I'm teaching a Unix course at a university of applied sciences, and for the longest time I was a bit frustrated. I hand out an assignment, the same for everyone, and students do it more or less all right, and then I'm kind of frustrated by the end result in the exam, because they seem not to have learned much or don't recall many things. In recent years the addition of AI made it even worse, because the assignment has already been solved multiple times and students just pass around the solutions. They don't care about the actual learning effect anymore. And so I thought, how could we improve this a little bit? So the classroom looks a bit like this, a bit of chaos, but I try to give a Unix course, not a FreeBSD- or Linux-specific course, but Unix in general, even though the labs are mainly based on FreeBSD. And, uh, let's look at the next slide.
>> Yes. So Benedict contacted me and said, okay, do you have an idea to make my work easier? And yes, I had to do my master's thesis, so I decided to help him.
>> Back then it was just a rough idea: yeah, we could do something like this, and why don't we try that?
>> And so the concept that you came up with in the thesis was basically this jail-based system that would inject errors into the students' systems, and then they have to find the error first and then come up with the solution.
>> So the talk is structured as follows: the problem; why FreeBSD jails; our original architecture from last year; the real problems we encountered during the proof-of-concept phase and the testing phase last year; how we evolved it into a distributed framework; our new training node storage life cycle management; the runtime tracking redesign (I talked last year about this big problem with the time scheduling); followed by a deep dive; a completely newly designed monitoring stack; bootstrap automation so you can use it on every system running FreeBSD 15; a live demo; what we have learned from this phase; and a conclusion and our question phase.
>> So yeah, this is a little recap of the original background problem from last time. So again, the background, as I mentioned already: I'm a lecturer and I want to give the students a more realistic working environment, because they will certainly encounter certain problems later in their work life as well. So I want to prepare them as soon as possible with these problem-solving skills, while having a low-effort setup so that I don't have to spend too many hours just preparing a single lab. And the difficulty should range from easy, to not frustrate students, to very difficult for the more advanced students.
And so I thought, why don't we create a concept where each student gets their own little working environment. They can get root access in that environment, aka a jail. And in those jails we can inject certain errors from the outside: files are missing, or permissions are wrong, or something. And then the students need to identify the problem and solve it, and the system would then automatically check whether each student has solved it, and a high score list would then be presented; the fastest students get higher points. And the chaos monkey approach is basically the concept that Netflix invented: a system gets changed randomly and people need to fix it, and that way they learn more about the system and make it more resilient. And it doesn't mean that each student gets the same error injected, so there's also not much copying of solutions possible. As for the original evaluation, he created a prototype during his master's thesis.
And what we found while playing this with students was that students were engaged, and we were also seeing much better problem-solving skills. And we also found that they were more engaged in the labs and liked these kinds of scenarios.
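The idea that each student does not get the same injected error can be sketched roughly as follows. This is a minimal illustration, not the speakers' implementation: the fault names in `FAULTS` and the function `assign_faults` are hypothetical, loosely based on the examples mentioned in the talk.

```python
import random

# Hypothetical fault catalogue. The talk only names examples such as missing
# files, wrong permissions, DNS manipulation and an oversized log file.
FAULTS = ["missing_file", "wrong_permissions", "dns_manipulation", "oversized_log"]


def assign_faults(students, faults=FAULTS, seed=None):
    """Give every student one fault, handing faults out in shuffled rounds
    so that students within the same round never share the same error."""
    rng = random.Random(seed)
    pool = []
    assignment = {}
    for student in students:
        if not pool:  # start a new shuffled round once all faults were used
            pool = list(faults)
            rng.shuffle(pool)
        assignment[student] = pool.pop()
    return assignment


if __name__ == "__main__":
    for student, fault in assign_faults(["alice", "bob", "carol"], seed=1).items():
        print(student, "->", fault)
```

Passing a fixed `seed` makes a run reproducible, which may help when a lecturer wants to replay the same distribution of errors.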
>> You can rewatch the information about this in the recording from last year.
>> So yeah. And we even thought, okay, with the advent of AI, would this concept be completely, you know, not useful anymore? So why don't we create two groups: one group of students who were not allowed to use AI, and the others were. And then we looked at how quickly the students solved it. Are the ones with AI a bit more advantaged, or is it more of a hindrance because they have to type their prompts? And we figured out that our system is quite resilient against AI use, and the students are also more motivated finding these solutions and getting a bit more practical experience rather than, you know, an academic example which is not so realistic in the real world.
>> So now, why we used FreeBSD jails. There are many tools on the market; we could use Docker, virtual machines, or jails. Why not Docker? Because Docker abstracts away so many levels, networking, system things, real file system behaviors, that we would like to teach in our concept. So it was not a perfect tool for us. Then why not use virtual machines? That would be possible, but the overhead was too much and we needed so much performance in our system. It was not cost-efficient.
That is why we use FreeBSD jails: we could teach everything we would like to teach, and of course they're lightweight and super fast and don't need so much CPU capacity.
>> Resources.
>> Resources, yes. Then, to orchestrate the FreeBSD jails, we used BastilleBSD in a really early version. I found many bugs during that phase, I reported them, and they were fixed, so now we can use the whole capacity of Bastille. We use bastille create first and then we only use bastille clone. It's much faster than last year, when we created every jail from scratch. I think it needs one or two seconds to clone a whole jail. It's so fast. Then we used the OpenZFS integration for snapshots, because if students make mistakes, we need a quick snapshot feature. And then we also implemented a self-service snapshot, but we'll come to that later.
>> Yeah. So this is just a repetition of the original architecture from last year, so that you can compare it with our new architecture that Andreas will present in a minute.
So the original idea, or the prototype, was using SSH login to figure out: oh, does this jail have the solution yet? No, it doesn't. Okay, go to the next jail.
Does it have the solution? Oh yes it does. Okay then report back the result.
Um, but this took a while to set up properly, and it has an overhead, because students might have solved the scenario already, but it would take a couple of minutes before the check came around to that jail again and the result was collected. That would skew the high score list, and students weren't as accurately monitored or measured as they should have been. So we have a web server jail that's basically just presenting the high score list for everyone. So during the class students would see the high score list and whether someone else has already solved the scenario. And the controller system was basically the central component, which was also quite loaded and had a lot of operations to do. And then there were n student jails; for a classroom like this, 16 students or so, each one would get their individual jail, and that would also be monitored by the monitor jail. And we, as the conductors of this scenario, would look at: oh, is this jail overloaded, or is it creating a high CPU load or something? So we can still intervene as the lecturers. And then the students would basically get an SSH key, log into their individual node and, you know, start working on their scenarios.
>> And keep in mind, last year I said the whole logic is based in the controller system, and for gathering the information from the student jail, the controller connects via SSH, gathers the information via a script, and writes it back into the controller system. So it was written monolithically and was quite difficult to improve. But keep this graphic in mind. In a couple of slides we will show the evolved version, so you will see what I mean.
>> And it was a prototype. We were actually quite happy that it worked.
>> But we have a newer version now.
So, the real problems we encountered during that phase. As I said, we had snapshot chaos. Students asked: if I change my username, if I, as root, install my own programs, could I then make a snapshot? There was only the initial snapshot for the whole course or nothing, and we were doing snapshot creations all the time. Then we had a runtime tracking problem. Maybe you remember, last year it did not work so well in the presentation, only in Asia. I fixed that now. It was based on last and the username.
Usernames were mixed up, and the last command was executed by a script. So you must connect to the system, execute last, and then write the results back.
That is why we got inaccurate runtime values: you have a delay between the connection and the write-back, and that was not so good. Then we had infrastructure problems. We had PF edge cases with Bastille networks, and CRLF problems because, as mentioned, I worked on Windows. Now I changed to Mac.
So that was a big problem, because I wrote the scripts on Windows and had CRLF deployment problems. We had bootstrap race conditions and service startup order issues, because if you start the first service and the next service depends on the first one, it will not work if the order is wrong. Maybe you know this from distributed systems. Then we encountered PF problems on Apple silicon ARM: antispoof for external interfaces with inet doesn't work well, and the scrub feature doesn't work well either. So for this version I disabled them for now. Maybe Peter has a better idea how to fix this, but yes, for now we disabled this feature. Another big problem was the monolithic orchestration, because everything goes out from the center, from the controller, and it was not possible to make it really distributed; there were also workload issues. And of course you would like to run red versus blue team scenarios, so you run two different scenarios at the same time. That was not possible with the old architecture. And the proof-of-concept evaluation showed that students had transparency problems. They would like to see what the sum of the points consists of, and they would like to see where their weaknesses are and where they should improve their skills.
That is what I fixed first. So now I come to the evolution into a distributed framework, and that is why I first fixed the transparency problem.
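As an aside on the line-ending (CRLF) deployment problem with scripts written on Windows: a pre-deployment check could look roughly like the sketch below. The function names are hypothetical and this is not taken from the speakers' tooling; it only shows the general technique of detecting and normalizing Windows line endings before a script reaches a jail.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the content contains Windows-style line endings."""
    return b"\r\n" in data


def normalize_line_endings(data: bytes) -> bytes:
    """Convert CRLF to LF so that /bin/sh does not trip over stray '\\r'."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    script = b"#!/bin/sh\r\necho hello\r\n"
    if has_crlf(script):
        script = normalize_line_endings(script)
    print(script)
```

A shell script with CRLF endings typically fails because the interpreter path on the first line ends in a carriage return, so checking before deployment avoids confusing errors on the node.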
You can see there now the new high score list. You can see clearly written naming of the training nodes. We have also changed the name to training node instead of student node, so if you would like to use it in your companies, it's easier to present it as training nodes. We now have two clusters for testing: CE for chaos education, C02 for cluster two, then training node 03 or 04 in the cluster. Now you can click on each cluster or each jail and then you see the summary of the scenarios: you see DNS manipulation, medium, successful or not, how much time you needed, whether you got the bonus or not. For those who are interested in the bonus system, I refer you to our recording from last year. Then the new concept, and this is why I say it's really complex now. You have a base CE host, the chaos education host, as a FreeBSD host machine. Then you have jails on it: training node, training node, up to training node x, because it's limited, I think, to under 100 nodes per cluster. So if you have enough power you can do this; I would not recommend it. Of course you have our CE web, the old web server jail; the naming is now also consistent. And of course the clusters, cluster one and cluster two, and the monitoring jail, which health-checks every jail in the system and even the host system. Then the database problem; we come to this later. It's Benedict's fault, uh, yes, but he will say something about this. The students also read the high score list as before, and jump via an SSH jump host onto the host system, and from the host system into the jails.
Currently the cluster controller is only software-based. It's not a real jail. That was for performance reasons, but it's okay. It works. Then the training node has a CE client. The client is written in C++ and C, really close to the hardware, so it is quick and fast. It's a hidden process on the system. The CE client does many things; I will come to this later. The CE client publishes via an API on an HTTP port, and the CE cluster controller polls it with HTTP requests; I'm now working on HTTPS requests. Then the cluster writes this to the host, and the host also publishes it via an export client, via a JSON API, into the CE web with an HTTPS push. The lecturer now operates the CE host, controls the CE host, and selects the scenario. The selected scenario is published to the cluster, to the cluster controller, and the cluster controller publishes it to every training node in that cluster.
So it's really complex now, but it works well. For the orchestration and all the code design we picked Go. Last year we used Bash because it was a proof of concept, but Go is much faster, it's multi-threaded, and it's easy to learn. Our design priorities: as I mentioned, executing the script command is now gone. We are now able to manage Bastille really fast, set up jails multi-threaded, and parse and write JSON files. And of course the Go code is easy to understand and better for the contributors who will maybe join later after our talk. Then, why Go fits well: small static binaries, simple cross-compilation, it's, as I mentioned, fast to build, and of course it works natively on FreeBSD. Why not Rust? Because it's harder to learn. The benefit is not so big that I want to learn Rust.
Yeah. I won't learn Rust, and it has more complex ownership models. That is simply why we don't use Rust. If you want to use Rust, you can contribute and we can rewrite our whole code in Rust. So it's up to you.
Our takeaway from this, as I said: a framework needs to be operationally simple, portable, readable, and easy to extend. And now it's Benedict's time.
>> So yeah, um, originally in the talk description we were promising something about a database that would keep the scenarios, that you can load the scenarios from, and also the high score list data would then be stored in the database so you can review it later. If you have a multi-day event like this, or over the semester, you can start from the first week, and then at the end, in the last week, we would have all these results, the scores the students achieved, in the database. Uh, we took a little bit more time than we anticipated for setting up the whole controller scenario and the scenario synchronization. So we haven't had time before coming here to finish the scenario database. So we kind of kept that out and will try to implement this by EuroBSDCon.
So it's not everything that we promised in the talk description, but I think the result matters: to have something working, and then later, once that foundation is there, we can attach the database to it fairly easily.
>> Yes. And when we do this, the database should store the scenarios completely and also the logs from the host, so that you can come back to the host every time, and it is also auditable if you use it in your companies.
>> So yeah sorry >> so um but let's talk about about the training node storage life cycle management aka snapshots.
So what students typically want: they have a scenario, they don't know whether the solution they try will work, and then they would typically ask us, what if I do something wrong? What if I delete an important file or something breaks? Then they ask me, hey, can you make a snapshot? And I could do that, sure; I always do that when starting such a scenario. But the students also wanted their own snapshots during a scenario, to have a certain state to fall back to. And now we can do that by letting the students create their own snapshots with their own little naming scheme. And that keeps the instructor, aka me or us, out of the game of running around: oh, you want a snapshot, okay, let's create one; another one wants a snapshot, okay, I create another one. So students can now do that on their own whenever they need it. And I as the instructor can individually roll back the scenarios. If a student can't connect anymore, or something breaks, or they lock themselves out, I can restore that particular jail without affecting the other students, so they can continue working.
We also have a snapshot life cycle improvement when we switch scenarios. In the past, in certain scenarios for the prototype, we still had edge cases where old scenario data was lying around, or some old information was available that the students shouldn't see. So now we have this life cycle management that completely wipes the old state away, and we don't have any old scoring data lying around, and runtime caches are not there anymore. So everything's properly cleaned up.
>> Yes. And as Benedict mentioned, before the scenario starts you get a pre-scenario snapshot with an ID. And if a student wants to request one, they only have to type the chaos snapshot request command on their shell command line and then, in brackets...
>> In quotes.
>> ...in quotes, what they want to have as a message for the instructor, like "before BSDK T76". And then the result is named: student request, from which cluster, from the CE chaos education cluster 01, training node 05, and then the time stamp on it.
>> And the naming scheme really helps us to understand: ah, when was this scenario, and what scenario did we play, and from which student was this snapshot created.
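The snapshot naming scheme described here (request type, cluster, training node, time stamp) could be built along the following lines. The exact format string is an assumption for illustration; the talk names the components but not their spelling, and `snapshot_name` is a hypothetical helper.

```python
from datetime import datetime, timezone


def snapshot_name(cluster: int, node: int, when: datetime) -> str:
    """Build a snapshot label from cluster number, training node number and
    a UTC time stamp. `when` is expected to be timezone-aware."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Hypothetical layout: <origin>_ce-c<cluster>-tn<node>_<timestamp>
    return f"student-request_ce-c{cluster:02d}-tn{node:02d}_{stamp}"


if __name__ == "__main__":
    print(snapshot_name(1, 5, datetime.now(timezone.utc)))
```

Only letters, digits, hyphens and underscores are used, which keeps the label within the characters that ZFS accepts in snapshot names.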
>> Yes. And we will show this later in our dashboard in the live demo.
So now the runtime tracking redesign. As I said, last year it was only based on an old approach, on last and who for the username you want to track, via a script over SSH. It ran into really many problems. So our new approach was runtime tracking directly in the C++ client. The next step was to compute the scenario time directly in the C++ client, not in the cluster controller, and then have the C++ client publish the time in a big JSON file, where something is written like scenario running time, user, cluster node ID, cluster node names, and a time. Here I only print a really small example excerpt of this JSON. You see login seconds; I was really fast in that try, 68 seconds. Then the scenario, which scenario we executed; the time stamp when the JSON was written; and you can access this via HTTP. Then the node IP, and then the pull, that is the HTTP pull request.
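To make the shape of such a runtime report concrete, here is a small sketch of how a controller might parse it. The field names (`node`, `scenario`, `login_seconds`, `timestamp`) are assumptions; the slide itself is not reproduced in the transcript, and only the 68-second value comes from the talk.

```python
import json
from dataclasses import dataclass


@dataclass
class RuntimeReport:
    node: str
    scenario: str
    login_seconds: int
    timestamp: str


def parse_report(raw: str) -> RuntimeReport:
    """Parse one JSON runtime report as a training node might publish it."""
    doc = json.loads(raw)
    return RuntimeReport(
        node=doc["node"],
        scenario=doc["scenario"],
        login_seconds=int(doc["login_seconds"]),
        timestamp=doc["timestamp"],
    )


# Hypothetical example payload; all keys and values except 68 are made up.
EXAMPLE = (
    '{"node": "tn05", "scenario": "dns_manipulation", '
    '"login_seconds": 68, "timestamp": "2026-06-19T03:00:00Z"}'
)

if __name__ == "__main__":
    print(parse_report(EXAMPLE))
```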
Now we come into a deep dive.
>> Yeah. So the deep dive is a little bit more geared towards how this all works and in which order these systems work. So first we need to roll out the whole thing. If I start a new semester, I can't recall how I did it last time. Now I have a little setup routine that does it for me. And the provisioning is really quick because we're just cloning jails, one for each student, no matter how many there will be. Sometimes students come in late and register for the class and then they go, ah, can I also get a VM? Yeah, sure, you can have another one. But if you have a class of 40 students, you don't want to manually create jails this way. So we have a provisioning system now which is very fast, and it's also possible to use it in another setting, like in a workshop or for a capture-the-flag event, for example. And I can also provide my own configuration to it. I can say, ah, this time I want to give the students a couple of files in their jails already and not a plain FreeBSD jail. We also have the integration of the monitoring agent, so I can right away see what the students are doing or if jails are using too much CPU or RAM, and we also have the automatic initial snapshot creation. So I don't have to take care of all of that when I start a scenario, because I want to focus on the teaching part and not on a lot of technical setup stuff to get the scenario running.
And during the scenario, I give students a certain amount of time to solve it, and the longer they take, the fewer points they get. And I can already track how many students have finished it. And for the students who are still struggling, I can still send them a couple of messages and give them some hints: hey, have you tried this before, or look into the, you know, book and figure out what we talked about last week in the lecture, because you didn't attend. And this is all done by this workflow.
>> So the next, the big core feature of this new architecture is the chaos client. I try to explain it on one slide. It's not so easy because the chaos client is so powerful.
First, the chaos client runs on every training node and runs there idle.
After a scenario is started, the chaos client tracks the time the user was already logged in before and excludes this time from the scenario runtime. Then the scenario code is loaded into the client: I write the scenario check script into a temp file, and it's loaded into the memory of the client. The client will execute this in C++, in binary code, and remove the evidence of the solution from the jail completely. So the only place where the solution of the scenario is stored is in the binary of the client. Then the runtime state: the client also observes whether the jail is healthy, gets the network information of the jail, checks whether the cluster is available, loads the client configs from the cluster controller, and logs all messages. You could publish these messages to the cluster controller; it's not implemented, but you are able to. It's a notification-based system. Last year I checked every minute or every 30 seconds whether the student was done with the solution. This time you write into the solution script which files have to be affected to reach a solution, and whenever these files are edited, the client is notified, because there is an observer job watching these files, and the client starts its work. So you don't lose so much power in the idle phase.
Um, yes, and all the information the client calculates and stores that is necessary for the controller is published via a JSON API. So the client turns each training jail into an observable, self-reporting training node, and self-healing of course. That is the API: you have a host name, you have the cluster ID, you have the scenario, DNS manipulation, whether a scenario is running on your cluster, and you have your scenario run ID; every scenario run has a new, really unique ID. And a status of the scenario: whether it's successful or not, so successful is true. The status could also be that no scenario is running, so yeah, it's really doubling the feature, but yes. Then the login seconds from the scenario. So it was, I think, the second try and I was really tired, so I needed 842 seconds for this. And the time stamp when I got this solution; the time is not the real time. So I think it was 3 a.m. in the morning. So yeah.
>> Late night coding.
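The rule that time spent logged in before the scenario started does not count can be expressed in a few lines. This is a sketch of the arithmetic only, with a hypothetical function name and plain numeric timestamps in seconds; the actual client is written in C++ and its internals are not shown in the talk.

```python
def scenario_seconds(login_at: float, solved_at: float, scenario_start: float) -> float:
    """Seconds that count towards the scenario: anything before the scenario
    start is excluded, even if the user was already logged in."""
    effective_start = max(login_at, scenario_start)
    return max(0.0, solved_at - effective_start)


if __name__ == "__main__":
    # Logged in at t=100, scenario started at t=200, solved at t=400.
    print(scenario_seconds(login_at=100, solved_at=400, scenario_start=200))
```

Clamping at zero covers the edge case where a session ended before the scenario began.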
>> Yes, late night coding. It's the best time to code. So yeah.
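The status report described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual API: the field names, the example host and cluster names, and the helper functions are all assumptions made up for this sketch. Only the kinds of fields (host name, cluster ID, scenario, unique run ID, success flag, elapsed time) come from the talk.

```python
import json
import uuid
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ScenarioStatus:
    """Hypothetical shape of one self-reported status message."""
    hostname: str
    cluster_id: str
    scenario: Optional[str]         # None when no scenario is running
    scenario_run_id: Optional[str]  # fresh unique ID for every run
    success: bool
    elapsed_seconds: int


def new_run_id() -> str:
    # A random UUID is one simple way to get a unique ID per scenario run.
    return uuid.uuid4().hex


def to_json(status: ScenarioStatus) -> str:
    # Serialize the dataclass to the JSON text a controller could consume.
    return json.dumps(asdict(status), sort_keys=True)


report = ScenarioStatus(
    hostname="node01",              # made-up example values
    cluster_id="cluster1",
    scenario="dns-manipulation",
    scenario_run_id=new_run_id(),
    success=True,
    elapsed_seconds=842,
)
payload = to_json(report)
```

Representing "no scenario is running" as a missing scenario rather than as a second status value is one way to avoid the doubled state mentioned in the talk.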
>> So how does this work when I start the class? First I pick a scenario that I want to torture the students with, and then I deploy it to all the nodes so that everyone has the same base setup. Then the students start working on it, and I have a baseline: I would say that for me this scenario would take maybe five minutes to solve, and that's the baseline I set. Sometimes students are faster, sometimes they take a little longer. Then the scenario monitoring starts and brings up all these monitors so I can see in the dashboard what kind of load each individual jail has. Ideally the student solves the scenario, sometimes a bit faster, sometimes slower, and that triggers the automatic scoring, which updates the high score list and notifies the system that the scenario is completed. And then there are two optional parts. We as lecturers can communicate with each jail and the student and say, have you tried this before, or congratulations, or something. They already get an automated congratulations message to notify them that they really have solved the scenario. And you can also, as I said, roll back individual scenario nodes in case students really destroyed the jail and couldn't work with it anymore.
So that's the whole run-through. And how is each scenario defined?
What components do you need? We need a task description. This is displayed to the students, like: for some reason the network is down, can you try to identify where that happened? Then there's a template which gets deployed. Sometimes I roll out certain files that are maybe incomplete, or I manipulate existing system files. So these are the template parts. Then I also need to figure out whether the student has solved the scenario, so we have a validation script that continually runs on the jail and figures out whether it has been solved yet. And the last part is the difficulty. We currently have three difficulties: easy, medium, and hard. If I see a class is quite good and they solve the easy scenarios very well, then I can go one level higher in difficulty, and that is also part of the scoring system. Harder scenarios give you more points, easier scenarios a little less, but it also shouldn't be frustrating for the students, right? So this is also part of the building blocks of a scenario, and we can write these scenario definitions fairly easily. You can probably think of a couple of nasty things you can do to systems, and as long as your validation script can figure out whether it has been solved, pretty much everyone could write their own scenarios without knowing too much about the base system in the background. And you can put supporting files in there, as I said, maybe a log file that's too large so the disk space runs out, or you want to deploy certain applications to it. That's also possible through the scenario description.
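The four building blocks named above (task description, template, validation script, difficulty) plus the instructor's baseline can be pictured as a small record with a scoring rule. This is only a sketch: the point values, the field names, and the scoring formula are assumptions for illustration. The talk only states that harder scenarios give more points and that fast solutions score well.

```python
from dataclasses import dataclass

# Hypothetical point values; the talk only says harder scenarios score more.
BASE_POINTS = {"easy": 100, "medium": 200, "hard": 300}


@dataclass(frozen=True)
class Scenario:
    name: str
    task_description: str    # text shown to the student
    template_files: tuple    # files deployed or manipulated in the jail
    validation_script: str   # path of the script that decides "solved?"
    difficulty: str          # "easy", "medium" or "hard"
    baseline_seconds: int    # the instructor's own estimated solve time


def score(scenario: Scenario, solve_seconds: int) -> int:
    """Full points at or under the baseline, scaled down beyond it.

    The floor of 10% keeps slow solutions from scoring nothing.
    """
    base = BASE_POINTS[scenario.difficulty]
    if solve_seconds <= scenario.baseline_seconds:
        return base
    ratio = scenario.baseline_seconds / solve_seconds
    return max(base // 10, round(base * ratio))


dns = Scenario(
    name="dns-manipulation",                  # made-up example scenario
    task_description="The network seems to be down. Find out why.",
    template_files=("/etc/resolv.conf",),
    validation_script="/tmp/check_dns.sh",    # hypothetical path
    difficulty="medium",
    baseline_seconds=300,
)
```

With this rule, a 77 second solve against a five minute baseline earns the full 200 points, while a ten minute solve earns half of that.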
>> Yes. And for those who are interested: the task description is sent via the message of the day if you're not logged in, or via wall if you're logged in. It's an easy tool for sending messages, but there's a little issue: you get the message twice. If any of you have ideas on how to fix that, please let me know afterwards.
So now the monitoring stack, a new monitoring stack. The old one, as you may remember, had a manual Grafana data source setup, because I didn't find the automated one. Then exporters were often unavailable after restarts, because the node exporter tool sometimes doesn't work if you start the processes in the Bastillefile in the wrong order. Now this is fixed. And there was a manual Prometheus target configuration. So the first time I had to log into the system and configure Grafana completely manually.
Now, in the fixed version, it's automatic. You only install, you execute your scripts, and then everything is there. As the default for Grafana I use the node exporter default dashboard; it's 4206.
It's the recommended one for node exporter and for Prometheus to use together with Grafana, and now it's completely automatic. So you have one-command deployment and automatic monitoring setup, and cluster visibility is also given. In Grafana you can select cluster nodes, student nodes, system monitoring, or something like that. And it's self-healing: I report this to a log, scripts are executed based on the log, and if some process goes down it is restarted completely automatically. And in most cases, as you know, a reboot fixes all problems.
Yes.
So, the self-healing monitoring infrastructure. Our problems were testing monitoring components across, as I said, jail restarts, system reboots, bootstrap operations, and service failures. Now the framework automatically validates node exporter, Prometheus, Grafana, and the cows web exporter, which is one we put together in the hacker lounge on Tuesday.
>> Yeah, last changes.
>> Then automatic recovery: health checks verify service availability. So the node exporter service, the Prometheus service, or the Grafana service will be restarted in case of errors. Then the benefits: we get reduced instructor workload, because if something goes wrong Benedict doesn't need to log into the system and fix it manually. We now have increased platform reliability, faster recovery because it is automatic, and improved monitoring accuracy during the training sessions.
So that is the new build. You don't see many changes, but in the top row you can see the cows training nodes and then the IP of one training node, and under the cows training nodes I could also select the web monitor or something like that.
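The check-then-restart behavior described here can be sketched as a small loop. This is an illustrative outline under stated assumptions, not the framework's code: the function name, the retry limit, and the result labels are invented, and the health check and restart actions are passed in as callables so the sketch runs without touching any real service.

```python
from typing import Callable, Dict, List


def heal(services: List[str],
         is_healthy: Callable[[str], bool],
         restart: Callable[[str], None],
         max_attempts: int = 3) -> Dict[str, str]:
    """Check each service and restart unhealthy ones, up to max_attempts times."""
    report = {}
    for name in services:
        if is_healthy(name):
            report[name] = "ok"
            continue
        for _ in range(max_attempts):
            restart(name)
            if is_healthy(name):
                report[name] = "restarted"
                break
        else:
            # Every restart attempt failed; a human has to look at this one.
            report[name] = "failed"
    return report


# Simulated state standing in for real service checks.
state = {"node_exporter": True, "prometheus": False, "grafana": True}
result = heal(list(state),
              is_healthy=lambda name: state[name],
              restart=lambda name: state.__setitem__(name, True))
```

On a real host the two callables would wrap whatever the platform uses to query and restart a service; keeping them as parameters makes the recovery logic testable on its own.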
Yeah. So the bootstrap automation is again my little install script at the beginning of the semester to roll out the whole stack, so that I don't have to worry about individual node setup and special things. I needed to automate all of this because the more components you have, the more difficult it is to start them in the right order, and maybe I forgot to install a package or something. So I want to have an install system that runs all of this from beginning to end, and then I know that a certain state has been reached in which the scenario can be started. And if I need to tear down the system during the semester, I can restart it with the same bootstrap. I basically just run the bootstrap shell script, which runs all these install scripts, then the CBCTL configure, which makes some changes depending on my scenarios, and then installs it to all the jails that are part of my class this semester. Here are some challenges that we learned about during the automation phase. I thought, okay, automation, how hard could it be? Once it's automated it will run forever and have no problems. But as we said, we found out that the certain line we showed earlier behaved differently on Apple Silicon ARM and x86 systems. We originally had a central server system, but the testing was done on our Macs here, and for some reason this line in pf.conf caused the system to not work anymore. Once we commented it out it suddenly worked, and we still have to figure out why that is. The loopback networking sometimes required careful handling, because if you don't do it properly then the old connection or the old interface is still there, and it would try to connect to the old one instead of the new one.
So that caused a couple of headaches. The interface discovery was a bit unreliable in the original client, and we tried to make it more reliable so that we always work with one particular interface and the students are not confused about which interface to connect to. Startup ordering also became critical, because if certain services were started before others then the scenario wouldn't run, and we had really crazy debugging sessions, like: why did this not happen? It used to work before. Sometimes the Prometheus exporters also failed after a jail restart, so we couldn't see what the students were doing in the jails or what the load would be. That also needed a bit of debugging. The improvements we implemented are these: interfaces are now created automatically, detected, and properly assigned to the jails. Then we have the dynamic PF configuration. I could say, in this scenario we're doing a network scenario that needs a bit more port access instead of blocking most ports, and that is based on the scenario, so the PF configuration is generated dynamically. And the monitoring is automatically provisioned. So during a scenario I just sit there and watch the nodes, the CPU going up, while the students try to solve it and don't bother me with questions. Then we have the service health evaluation.
If some systems are restarted or the jails kill the monitoring system, it automatically respawns, and the recovery and restart logic is also there.
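One common way to make startup ordering explicit is to declare which component needs which and let a topological sort produce the start order. The sketch below uses Python's standard `graphlib` module. The dependency map is hypothetical: the component names and their relationships are assumptions for illustration, not the project's real bootstrap order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key starts only after the entries it lists.
DEPENDS_ON = {
    "loopback_interface": [],
    "pf": ["loopback_interface"],
    "jails": ["pf"],
    "node_exporter": ["jails"],
    "prometheus": ["node_exporter"],
    "grafana": ["prometheus"],
    "scenario_client": ["jails"],
}


def start_order(depends_on):
    """Return a start order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


order = start_order(DEPENDS_ON)
```

A side benefit of declaring dependencies this way is that a circular dependency raises `graphlib.CycleError` at bootstrap time instead of surfacing later as a service that silently never comes up.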
>> So now, live demo.
>> Yeah, a quick one, because...
>> Yeah, we need to hurry up.
>> Okay, so I'll explain a little bit of what we're doing and we'll run the demo. We have two windows: one is for the instructor and one is for a student.
Hopefully you can read that big enough.
And uh so this is the the student logging instructor >> profession >> instructor.
>> Oh, the instructor, sorry. The instructor is at the top and the student is at the bottom. And so the instructor figures, okay, what kind of scenario do we want to run today? Again, this could also be penetration testing or a training scenario in a company, so you don't have to limit yourself to a university environment.
And so the system is already installed and rolled out, and we must now check what kind of...
>> Can you slide this over just a teeny bit? We're cutting off.
>> Oh, we have some cut-off at the corner. Okay. Yeah, thanks for letting us know, because otherwise people don't see too much.
So students get their login, and I would be like, I think the class is ready for this kind of scenario, and then collect the information that I want and figure out... oh, here's me again.
>> Often happens.
>> So I will run a scenario. Okay, let's run a scenario. Yeah, you've seen it before, right? Then we figure out which cluster this should run on. Let's run the first one. And so the initialization starts, and now the students are getting their scenario as a wall message. That's the indication for the students to start working on solving the scenario. Maybe you know the solution already. The students figure, ah, it must be something about the networking, so maybe they try pinging something or checking their network configuration.
And uh up there in the professor's view or the trainer view, you don't see much at the moment. But in that time, you could look at the uh graphana dashboards and see if something is out of the ordinary. Yeah. Oh dear, the name server is wrong. Huh? What did I do there?
Okay. So the student figures this out and then they get this broadcast message. Remember, last year it took a minute before this message appeared, because the check script only ran once a minute, and now we have it immediately.
>> First scenario complete five scenarios.
>> Yeah. So that we get a bit more extra.
>> Now you can click on it.
>> Yeah. And then I can see oh in which scenario did I get points or lose points and then students could go back and say ah maybe I need to study a bit more for this particular scenario.
>> And due to the time I was really quick so you saw 77 seconds.
>> Good one. Yeah, that results in a lot of points. Maybe we should now show other features.
>> Let me briefly show our live dashboard. You can select the scenario you are running, and then you see that the scenario has now been running on cluster one for 17 seconds. And here a snapshot can be requested. The other ones have not solved the scenario yet, so that's why this shows success false.
>> So let's create a snapshot just in case I break something.
>> I don't know the name.
>> Yeah, test.
>> No, it's not the right name.
>> Which slide was it?
>> It's in the slides.
>> Yeah, it's in the slides. So, >> uh, you need to pass the the, uh, ID to it.
>> No, no. I >> switched around. Yeah, we can create an alias for that so that we could both >> Yes, you could create an alias for that.
>> I should make notes. Uh, >> there we go.
>> So, and now you could refresh there. And there is your snapshot.
Yeah, created. And now Benedict could um roll back the system by the central ID.
>> Yeah, >> student has created a different scenario all trying to solve another scenario and that would I like and then copy this as a new >> via the snapshot management. But I think we run out of time. So >> let's continue with the presentation.
>> Yes, let's continue with the presentation.
>> We have a bit of time for into the launched hour.
We stopped in the slides.
>> So what we learned. On our teaching concept: practical hands-on learning increased engagement, according to our feedback. Students preferred solving real problems over worksheets, and competitive scoring encouraged participation. The students were really eager to learn more about the system and motivated to do more of that stuff, and there was really positive feedback from the students. Then our new technical architecture: FreeBSD jails scale extremely well.
I run, on my one rack, I think eight jails, and the performance is quite good. So yeah, it's perfect.
>> Then ZFS snapshots simplify recovery operations, because as you saw I need only one command to get my snapshot and can roll back the whole system, and it's really fast. And our monitoring became essential as the platform grew, because you can't do it by yourself; you need a monitoring tool for this. And also, if the students say at night, "I can't do my scenario", well, it was not due to our system, it's because you don't want to work. The more complex it is and the more components it has, the greater the need for creating an automation for it and rolling it out consistently. The runtime tracking is now much better than in the original prototype, and even though it's a bit more complex, we can still roll it out in a manner that's doable, and the students don't get too much, yeah, excitement about the... we had a couple of things with PF behaving certain ways which

Six years, hundreds of millions of snapshots, zero package dependencies.
Right now, we're all feeling the pain on the hardware side. Who the heck is eating all of our drives? And what are we going to do about it? A huge chunk of the forward drive inventory is going to AI. Frontier open-weight models are already 600 to 800 GB.
So, here's a little math.
Three fine-tuned variants of the same model used to mean three full copies.
With conventional workflows, that's triple the storage cost. Or it's one model plus deltas.
That's the world ZFS and Zelta are built for. Let's get the most out of our deltas.
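The "little math" can be written out as a short calculation. The 600 GB model size and the three variants come from the talk; the delta fraction is purely an assumption for illustration, since how much of a model actually changes on disk depends heavily on how the variant was produced.

```python
def full_copies_gb(model_gb: float, variants: int) -> float:
    """Storage when every variant is kept as a complete copy."""
    return model_gb * variants


def base_plus_deltas_gb(model_gb: float, variants: int, delta_fraction: float) -> float:
    """Storage for one shared base plus only the changed part of each variant."""
    return model_gb + model_gb * delta_fraction * variants


MODEL_GB = 600          # lower end of the range quoted in the talk
VARIANTS = 3
DELTA_FRACTION = 0.05   # assumption for illustration only

conventional = full_copies_gb(MODEL_GB, VARIANTS)
with_deltas = base_plus_deltas_gb(MODEL_GB, VARIANTS, DELTA_FRACTION)
```

Under these assumed numbers the conventional workflow needs 1800 GB, while the shared base plus three deltas needs roughly 690 GB. The gap narrows as the delta fraction grows, so the approach pays off most when variants differ only slightly.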
I'm Daniel Bell. Here are a few things about me, but most importantly, I've been rocking BSDs since the 90s.
I've been doing infrastructure in regulated environments for a long time.
The kind of places where a bad backup isn't just data loss, it's a regulatory event. I wrote Zelta because I needed a replication tool for my FreeBSD private cloud that matched how I think about backups and iteration. What started as a backup tool turned into an infrastructure and safety philosophy.
As an added benefit, it helps me squeeze a lot of space out of every drive. And that's what this talk is about.
Zelta exists because we have a powerful file system with amazing primitives that are hard to stretch as far as they can go. But I don't need something that replaces the built-in ZFS utilities.
What's missing is composition: clear output that makes sense at 3:00 a.m., tools that understand relationships of replicas, workflows that handle edge cases without blowing up, and policies that make sense.
I built Zelta with some specific principles in mind. It has to be permissively licensed because I want to collaborate with people freely and I want everyone to have zero friction using it anywhere. It has to be ultra portable so I can save a decrepit TrueNAS server with nothing but a rescue drive in kernneg.
I need Unix principles all the way through so everything composes together fluidly with other tools and scales and I want point of use documentation because that's where documentation actually belongs.
ZFS comes with enough foot guns, so I want Zelta to be as safe as possible. I never want to be pressured to delete something that could be important in a recovery scenario.
I want flexibility, so no forced naming conventions and no required data set structure on any system. Zelta's safety and Unix philosophy extend well to many projects and environments. In particular, I'm very proud that we've been adopted by the Sylvia project, an amazing VM, jail, and storage control plane for FreeBSD. We don't have the most GitHub stars, but we have big fans in high places.
Zelta's job is to extend ZFS operations to be remote, recursive, safe, and auditable.
Here are Zelta's verbs. You can run Zelta usage or Zelta help to learn about them. Each simplifies one or more ZFS commands.
For instance, Zelta match takes five billion lines of ZFS list and tells you in just a few lines what you actually need to know. Can I replicate right now or am I cooked?
Zelta backup does ZFS send and receive, but with divergence handling, remote awareness, and mountpoint safety.
Zelta takes complex, error-prone ZFS workflows and handles the scary edges for you.
And zprune is a new and deliberately separate command because it destroys snapshots. I wanted Zelta itself to never be destructive, even by accident.
If someone runs a Zelta command and something goes wrong, I never want them to say Zelta ate my homework.
So I kept the explosive bits in their own script. It also turned out to be a useful separation of concerns.
Zelta Prune can safely pipe retention output to other systems with minimal privileges, while zprune handles summaries and nuking.
Zelta 1.2, BSDCan edition, is available for preview with its intuitive new features and wonderfully boring defaults.
When possible, Zelta passes through ZFS flags they influence. For example, ZFS send flags will override Zelta's defaults and it'll enforce those flags for ZFS commands it runs.
Three filters, depth, exclude, and include now work globally across Zelta and are much more flexible. For example, you can filter data set patterns, which tells Zelta backup to limit which snapshots it sends.
We added a bookmark option that tags the host name of the recipient host providing a little handy point of use telemetry.
And there are a few other new features that I'll mention during the talk. But most importantly, the majority of Zelta use cases need just one or two data set endpoints and no switches at all.
Six years of operator feedback also taught me what makes a tool both useful and pleasant to use in production. First, predictable hierarchical option behavior: basic Unix stuff, and the other options follow that lead.
We have expressive dry run and verbose modes for every option, which has proven to be a great ZFS training tool at my company. For endpoint resolution, we use a simple SCP-style syntax so it feels familiar.
And Zelta isn't designed to replace ZFS for all operations, but instead to keep the safety boundary visible. If a command starts with the word Zelta, nothing's destroyed. zprune can't delete data sets or clone origins.
These safety boundaries really matter in production. Last year, a third party update nuked a client's domain controller in the middle of the night.
My junior sysadmin took the call, and this was the first time Zelta Revert was used in production. Not only did she get the VM back online, but using Zelta instead of zfs rollback meant we retained the broken version of the VM and were able to get to the bottom of the crash.
I'd like to show some fancier Zelta patterns, but let's start with some basic backups.
Zelta backup followed by a pair of data set endpoints replicates the first to the second along with all of its descendants. Here we have a production data set on host alpha backing up to a backup server named vault. Zelta backup alpha beta: let's call this pair of hosts twins. They have similar hardware, so either one can be the active host for our data sets.
Just repeat the commands on a schedule to keep them up to date and you can repeat them safely no matter what state the systems are in. If something goes wrong, Zelta will tell you. If nothing's changed on the source, Zelta won't create a pointless new snapshot. It just confirms that your replicas are already up to date.
Here are the commands we just used. For regular syncing, Zelta backup replicates the full history from the source to the target.
We can also get more selective.
In the second example, we sync alpha to vault with the snapshot prefix daily and filter it to only send those snapshots. This can be handy if you want to run backups with different intervals and different naming schemes.
A replica data set can only receive new snapshots if it isn't modified. This goes for backups and failovers. If you want to move a data set, use Zelta failover to perform a final sync and swap the read-write properties of our twins.
Next, we can talk a little bit about data set clones and how they help when something goes awry.
Most people who hear the word clone just think about a regular copy, but a ZFS clone consumes no space. One of Zelta's goals is to make using these zero-cost references more accessible. For example, Zelta provides safe alternatives to several destructive ZFS actions using data set clones.
In addition to being free, clones are portable and can replicate to a different pool or host. You can sync one as long as its origin, the snapshot it's referenced from, is available on both sides.
Replicating clones, especially recursive ones, to a remote server can be tricky with ZFS commands. Zelta provides several different options to help, such as Zelta Backup origin.
The practical result is that you can spin up as many independent writable versions of a data set as you want on as many servers as you want, and they only cost you storage for the parts that actually change.
Let's say we need to return a data set to the state of a previous snapshot. A common solution is the destructive zfs rollback, which destroys everything back to that previous point. A lot of ZFS pros avoid this. They rename the data set to keep the changes safe and clone it to restore its old state. You can do this rename and clone tango in one step, recursively, with Zelta revert.
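The manual "rename and clone" pattern described above comes down to two standard ZFS commands. The sketch below only builds the command lines and does not execute anything. It covers a single data set, not the recursive case, and it is not Zelta's implementation; the function name, the suffix convention, and the example data set names are assumptions for illustration.

```python
def rename_and_clone(dataset: str, snapshot: str, suffix: str) -> list:
    """Build the zfs commands for a non-destructive rewind of one data set.

    Step 1 parks the current data set, with all its changes, under a new name.
    Step 2 clones the chosen snapshot back into the original name.
    """
    parked = f"{dataset}_{suffix}"
    return [
        ["zfs", "rename", dataset, parked],
        ["zfs", "clone", f"{parked}@{snapshot}", dataset],
    ]


# Made-up names: rewind tank/vm/dc1 to its "before-update" snapshot.
commands = rename_and_clone("tank/vm/dc1", "before-update", "broken")
```

Because the snapshot travels with the renamed data set, the clone's origin becomes the parked copy's snapshot, which is why the broken state stays available for later inspection.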
We've rewound our data set, but now the target has diverged from the source, which blocks the usual replication. Let's fix that.
We use Zelta rotate to rename our target and create a clone in its place based on the source delta.
The source and target now have a matching snapshot and replication can continue normally. Zelta rotate helps with other continuity problems as well.
It fixes replica divergence in three scenarios. If the source was cloned like our Zelta revert example, if the source was rolled back, or if the target has diverged. It's also handy for iterative QA workflows, which we'll discuss shortly.
Using everything we know so far, we can make a warm failover pair using Zelta backup commands.
When data changes on alpha's data set, Zelta backup from alpha to beta will sync.
But beta to alpha won't because beta is set read only and can't change. Zelta won't take a snapshot unless there's something new. God's own property, right?
That's important for keeping sync continuity in the right direction.
Now we're using Zelta failover to promote beta to the read-write copy.
And now our sync direction is reversed.
Zelta will no longer snapshot or sync alpha to beta.
ZFS really does make this beautifully simple. Let's make things a bit more complicated.
Distributed storage is okay.
The title's facetious, of course.
Distributed storage systems are amazing. They can scale massively and keep things running when nodes die. Ceph is the best choice for many workloads.
But a lot of the storage growth we're seeing right now, like VMs, databases, and inference, has different needs. These workloads need to be performant, correct, durable, well tuned, and recoverable. A distributed system means that when something goes wrong, it goes wrong at scale. If ransomware encrypts your files, the cluster does exactly what it was built to do: it dutifully replicates the encrypted blocks everywhere. Don't forget about the robot army of vibe-coded script kiddies getting smarter by the day.
And they never sleep.
ZFS doesn't make you immune, but I've recovered from catastrophes on ZFS that simply could not be resolved as efficiently or painlessly in most other environments. ZFS is actually built for performance and durability where most of our workloads live.
Let's talk about managing lots of twins with bidirectional failovers, disaster recovery sites, ephemeral containers, and read-only replicas distributed where you need them. It can be a lot to manage, especially when your different workloads need different ZFS options. This is where organizing backup jobs with Zelta policy can help.
Here's a snippet of an example Zelta policy file. My company Bell Tower runs our policies from several hardened OpenBSD Zelta bastions to orchestrate thousands of data sets across sites.
Just to note, Zelta does not need to be installed on any system that it's orchestrating, so OpenBSD is a perfect choice.
Having an orchestrator that isn't a storage node is critical for security and operational clarity at scale. In our configuration we can define any Zelta command-line option at any point in the hierarchy: in global definitions, by user-defined site name (dell one in our example), or by host. The site is the unit of concurrency: Zelta policy will orchestrate one site per number of jobs defined. If you have different patterns or multiple hosts, you can define them repeatedly and Zelta will do the right thing.
Most log messages emit directly from Zelta backup, which supports human-readable logs or JSON.
We keep our policies light and manageable by keeping source data sets, backup targets, and option lists in separate files. Here's some options describing how to name the snapshots on the receiving side and source data set definitions. It's very flexible.
This Zelta policy dry run output might not seem pretty from this far away, but I promise you it's absolutely stunning when I'm troubleshooting.
Dry runs show a tidy list of filtered source and target pairs, while a verbose dry run provides the gory details of exactly what Zelta policy will run.
Zelta policy runs everything defined in its policy file by default but can be filtered by just adding a site, host, data set, or data set leaf name.
When we're dealing with this much data, we better make the space count. Let's go back to two more cloning patterns I use.
Zelta clone with two operands recursively clones a data set. Cloning is a pool action, so we can only clone locally.
But Zelta clone with four operands also fires Zelta rotate, retaining sync continuity for itself and its clone.
Both pairs of data sets DS1 and DS2 in this example can now function as independent units, but consume no space on either pool until they're modified.
If you already made the clone and want to sync it with a zero cost clone reference, we can do that as well.
You can provide Zelta backup with an origin or target origin to replicate the clone reference on the target side. The first backup is free. The origin option is the same as target origin; they're the same. Zelta will just automatically detect which origin to use on the source side.
You can accomplish amazing things with ZFS workflows based on these.
Three years ago, I was working for a financial firm who ran a full web stack.
I migrated a database, web frontends, multiple backends, and, most disagreeably, four Windows boxes to ZFS, bhyve, and jails. All of it was running in parallel on a single application stack.
They needed to test daily versions against real production data. Multiple environments running at the same time for different branches every day. This is where ZFS really shines.
The daily deltas for the QA versions were around 1% of the production total and we got hot independent testable environments on a single base.
Amazing.
Recently, I've been experimenting with workflows using the two other types of iterative storage for ZFS.
An exciting, somewhat recent addition to ZFS is block cloning. When you copy files or parts of files, ZFS can reference the existing blocks on disk instead of writing new ones. As with data set clones, you only pay for the new data. Block cloning happens automatically and works between file system boundaries on the same pool. The fact that copying files no longer consumes space feels like magic.
On its own, block clones provide only source side efficiency. New block clone references must be replicated again when using ZFS send operations.
There's also the new fast dedup feature, replacing ZFS's old painful, slow, and memory-hungry deduplication feature. The new version is painful, slow, and memory hungry, but significantly better.
If you have a case where it works, it's excellent. Identical blocks that arrive from different sources only cost space once. That means you could potentially get space saving of block clones on both sides. Here's an experimental workflow our team is working on that ties together everything we've discussed.
Imagine we have a fleet of FreeBSD jail replicas. They're all regular thick jails. They're lived in and they have changes, their own databases, and unique data, but we want updates from an origin.
With data set clones and block cloning, we can do exactly that. First, we perform a Zelta rebase or sorry, we perform a Zelta rotate.
Now, we'll copy a subset of files back based on a mapping file included with an update from the base.
That's it. We have a jail patched from upstream that retains some of its old data, and nothing was lost in the process. Or as a friend and colleague put it: thick container convenience with thin container efficiency.
Of course, the most surefire way to save space is by deleting stuff you don't need. Zelta Prune helps you plan snapshot retention policies. It provides previews and pipeable output and is the engine for zprune, which actually destroys snapshots. The visuals you're about to see are real output. Blue diamonds are protected. Red X's are candidates for deletion.
Why do we murder our snapshots?
Usually, it's because we're low on space. The option prune size selects the oldest snapshots until we've freed the desired amount. 50 gigabytes of red X's of Doom.
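The oldest-first selection described here can be sketched in a few lines. This illustrates the general idea only and is not Zelta's implementation: the tuple layout and function name are assumptions, and the size figures are simply summed. On real ZFS pools the space reclaimed by destroying several snapshots is not always the sum of their individual sizes, because blocks can be shared between neighboring snapshots.

```python
from typing import List, Tuple

# (name, creation time as epoch seconds, reclaimable bytes): an assumed layout.
Snapshot = Tuple[str, int, int]


def select_by_size(snapshots: List[Snapshot], target_bytes: int) -> List[str]:
    """Pick the oldest snapshots first until at least target_bytes would be freed."""
    selected, freed = [], 0
    for name, _created, size in sorted(snapshots, key=lambda s: s[1]):
        if freed >= target_bytes:
            break
        selected.append(name)
        freed += size
    return selected


snaps = [("c", 3, 30), ("a", 1, 10), ("b", 2, 20)]
doomed = select_by_size(snaps, 25)
```

In this made-up example the two oldest snapshots are selected, since the oldest one alone would free only 10 of the 25 requested bytes.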
What if we need to make sure make sure we have copies of What if we need to make sure we have copies of everything? Those old snapshots might be the only copy. Adding prune guard unsynced.
Add prune guard unsynced and Zelta protects anything that the target the second operand doesn't hasn't seen. The diamonds at the top row tell us the vault is missing history. It looks like our backup target never received the original several data sets uh several snapshots.
You can also prune by time or count from the time right now.
Prune time and prune num. Keep 30 days with prune time 30 days, or keep the most recent 30 snapshots with prune num 30.
The default policy keeps whichever is more restrictive: 30 days and 30 snapshots. So in this example, the default options are identical to prune num 30.
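One plausible reading of that default, sketched below, is that a snapshot becomes a deletion candidate only when it is both older than the time window and outside the newest N. That interpretation is an assumption, as is the hypothetical `prune_candidates` function; it is not taken from Zelta's source.

```python
from datetime import datetime, timedelta


def prune_candidates(created_times, now, keep_days=30, keep_count=30):
    """Return creation times that neither rule protects (assumed semantics).

    A snapshot is protected if it is within keep_days of `now`
    or if it is among the keep_count most recent snapshots.
    """
    cutoff = now - timedelta(days=keep_days)
    newest_first = sorted(created_times, reverse=True)
    protected_by_count = set(newest_first[:keep_count])
    return sorted(
        t for t in created_times
        if t < cutoff and t not in protected_by_count
    )
```

Under this reading, when snapshots are taken less often than once a day, the newest 30 already span more than 30 days, so the count rule alone decides the outcome.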
For more advanced retention policies and to play nice with other retention tools, you can use Zelta's universal filtering options, depth, exclude, and include, to narrow the set. Here, we're considering only snapshots that start with the word Zelta for deletion while leaving the default 30 days and 30 snapshots. Some older snapshots tagged Zelta will get cleaned out while a few with different names stay. In our example, the first red X at the top must be the point where we started tagging the snapshots with the word Zelta.
For aging data life cycle policies, we have a prune grid that works similarly to zrepl's great retention system. We can define simple options like this one that keeps one snapshot per week forever,
or a complete list of data life cycle windows. And you can make pretty much any retention policy with Zelta Prune. All the options stack. You can filter first by included or excluded data set or snapshot names, data set depth, and replication relationship.
Then select snapshots to destroy by time, size, count, or aging retention grid.
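The "one snapshot per week" grid mentioned above can be illustrated with a small bucketing sketch. The `weekly_grid` function is hypothetical, and keeping the newest snapshot in each ISO week is an assumption made for illustration; the talk does not say which snapshot in a window survives.

```python
from datetime import datetime


def weekly_grid(created_times):
    """Split snapshots into (kept, destroy), keeping one per ISO week.

    Assumption: the newest snapshot in each week is the one kept.
    """
    newest_per_week = {}
    for t in sorted(created_times):
        bucket = tuple(t.isocalendar()[:2])  # (ISO year, ISO week number)
        newest_per_week[bucket] = t          # later snapshots overwrite earlier ones
    kept = set(newest_per_week.values())
    destroy = [t for t in sorted(created_times) if t not in kept]
    return sorted(kept), destroy
```

A fuller grid would chain several windows of different widths (hourly, daily, weekly); the same bucket-and-keep-one step applies to each window.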
After you investigate your preferred command with Zelta Prune, execute your strategy with the destructive zprune.
What it lacks in emojis, it makes up for in showing the exact ZFS destroy commands it's about to run and a summary of how many snapshots will be smoked and how much space you'll get back, along with percentages for each data set in your tree. Like all Zelta commands, they work recursively and remotely.
Years of working on my own with the ultra portable AWK and shell limitations got Zelta pretty far, but it's time for the project to grow. First, we're investing heavily in Zelta and we will always maintain a highly portable and BSD licensed version. I'm never going to give up on that flexibility for recovery scenarios. We've made some progress moving toward a professional data-driven refactor with test-driven development since the beginning of the year. Zelta no longer needs to be installed on ZFS endpoints. We're running low on reasons to port every feature to 1977.
So we've started to prototype more advanced workflows in Ruby. This gives us the agility to add additional security features, replication hints, ZFS program workflows, and other things that are too much of a bear in Bourne and Kernighan AWK.
We also keep enhancing and adding features. Now, I'd like to separate our concerns before we move on to talking about my approach to ZFS permissions and security policy. Any questions about the iterative infrastructure concepts of Zelta?
>> Another question from slide 38 36 I think it was back.
>> I think you have predictions there like estimates on savings.
>> Oh yeah.
>> How tricky was that math? because the best math can be all over the map.
>> Yeah, it's definitely not perfect. The prediction math in the prune size command is a little bit tricky because ZFS destroy doesn't even predict properly. So it's pretty close, but it bases it on snapshot used and does a little arithmetic to figure out what's likely to be saved. It gets pretty close and is basically a match with the ZFS destroy dry run. zprune just does the ZFS destroy dry run, which is close. It's not perfect, but great question. That was a challenge, to figure out in advance what ZFS destroy was going to produce.
>> Yes. What was the hardest part of this to write?
>> The hardest part of this talk to write?
>> No.
>> Oh, the code. >> The hardest part of the software to write was probably some of the rotation logic. I wanted Zelta rotate to be able to heal basically any type of divergence between any source and target, which is critical for so many operations.
The ZFS documentation is pretty clear about how the target origin works. But in a workflow where you basically want to heal a backup, which is usually the reason why I want to do it, getting that right, getting that recursive, and figuring it out all the way down the tree got to be a little tricky. My original version of Zelta rotate was a total mess of logic.
It was just ifs from here to the end of the earth. But we've been tightening that up a lot, and it's a lot more precise and does a great job.
Yes. >> I liked your example of the four Windows machines that you turned into ZFS backed, and could you elaborate on that a little, because I'm not sure I followed how you pulled that off.
>> Yeah. So it's actually similar to that. That picked up your question, right? Yeah. So the QA workflow would snapshot and replicate as an entire unit. All of the VMs and jails were obviously on bhyve and jail in the first place, so it was just one data set tree. That data set tree, after being replicated, just needed to be rotated, and that created a clone that could then be booted. So we just literally mounted that rotated clone, and that's yesterday's data. Then we did it again, and that's the day before, and did it again, and that's the day before that, and so on. This is really right out of ZFS's own help documentation. It gives you some hints on how people like to do it, and we implemented the same strategy.
Yes. >> You have mentioned on slide eight something with encryption.
How is encryption built into the Zelta program, and which tools do you use for this? >> So, I think the mention on slide eight, I'm pretty sure, was just an example of a destructive encrypted action. But I can speak to... Oh, no. I'm sorry. That's wrong. Yes. So, okay. Basically, ZFS send without flags, or with just the large blocks flag, will re-encrypt the data if the data set parent is encrypted on the remote side using ZFS encryption. By default Zelta will try to fight you on that. If the data set is encrypted, by default it'll automatically force a raw flag to send it without decrypting it. If it does not see it encrypted on the other side, or it sees it being re-encrypted on the other side, Zelta will fall back to just the compress option so it can re-encrypt on the other side. So there is some logic there to automatically handle all of those different cases: it's encrypted on the far side, it's encrypted on the near side, or it should be encrypted on both, raw. That's obviously the safest way to work, and what Zelta will coax you to do if you start with it. And then there's the re-encryption possibility, that it could do that. So basically we're essentially just dropping the raw flag.
Basically the embedded option is the one that counts. Dash, uh, dash embed is the option that makes the difference when it's choosing to send that data encrypted to encrypted, which is best. That's what we want. And we're actually going to talk about that on the next slide as well.
Does Does that answer your question?
Yeah.
All right. So, I'm going to I'm going to move on to my security model.
All right.
So I have the evidently rare belief that sudo is not a universal command prefix.
Segregation of duties isn't just a good idea. It's also often legally required. Here's my mental model for the five distinct privilege sets. Obviously this doesn't refer to five different people or five Zelta installations, though it could. But we should set up our ZFS workflows so that production systems can't touch backups and, of course, backups can't meddle with production.
Here are the associated ZFS allow permissions for the five security roles.
So just another note: Zelta is built on not using root. There is no, you know, dash dash I'm not root or anything like that. It assumes that you're not root and it assumes that you have the least permissions.
And while we're on the subject, my thanks to the prolific OpenZFS developer Rob Norris, who on my recent suggestion just opened an OpenZFS PR to extend ZFS allow permissions with a new send encrypted permission that would limit the backup operator to only encrypted data sets, making it impossible for a backup operator to ever see an unencrypted stream.
Coming soon. So I think there should also be... is there a destroy snapshots?
Sorry, singling you out, but is there a ZFS allow destroy snapshots permission? Because I think that could be handy.
>> I don't remember you mentioned just open.
>> Yeah, >> yes, it's it's it's in review. I know.
But okay. Yeah, snapshot destroy. I think Zelta's model of having a retention system that's limited only to destroying specific things could be an interesting addition as well.
All right, on to my OpenZFS wish list.
Well, my biggest wish is coming true.
OpenZFS community members and funding from TrueNAS are making user space ZFS a reality. This opens the doors to new portability and security options.
Uh new ZFS on NetBSD anyone >> or other >> maybe? I don't know. Exciting stuff.
I've been told my next wish list item is impossible, but I'll mention it anyway, since deduplication is still a heavy lift and block clone aware replication would be so nice to have. I realize block cloning just matured five minutes ago, but now I want it everywhere.
And finally, what if some Zelta features were part of the OpenZFS project itself?
some of the replicational relationship stuff, stuff like Zelta match making um and uh easier clone tracking um that might fit into the ZFS command stack, right? Maybe. I don't know. Let's talk about it.
All right. Thank you. Uh so, uh yeah, that's that's all. Any questions?
Oh yes.
>> What's the most fun part of Zelta to write? It was the pruning system, once I figured out that I could just spit out X's and O's to find out what was being saved and what wasn't. It really helped with complicated retention policies, and some of my clients have sliding windows and stuff like that.
So being able to visually go back and forth, deciding which of these features do I want to steal and how do I want to anchor certain features, and stuff like that. Making it visual and back and forth like that made it a joy to work with, and it actually suits AWK pretty well, unlike a lot of the code that doesn't suit AWK as well. I mean, I met Stephen Bourne at a talk the other day, and Stephen Bourne doesn't even use Bourne. So I'm still sticking with it, though, cuz >> Bash.
>> That's what I said.
>> Yes. >> You can repeat it. Does Zelta do anything to visualize the relative size of snapshots, to know which ones would really be good to delete?
>> So the question is, does Zelta identify snapshots to get an idea of which one to nuke? >> What size it is. >> Yeah, I was looking at how tightly packed together snapshots are. The prune grid does a little bit of that.
You could theoretically use seconds as a window, like, don't keep more than x seconds of a snapshot.
So that's one way. And then I was thinking about how to deal with a bunch of zero-sized snapshots in a row: could we get rid of them? So that's a feature that's kind of cooking right now. I do like it because on systems with 15-minute snapshots and like a billion jails, you can really get just a horrific number of snapshots. And knowing which of those actually matter, so you can clean out big ones, clean out little ones, and figure those things out in the policy, I think would be hugely helpful. So I'll say that's on our to-do list for sure.
Yes.
>> Um, so you mentioned that this doesn't have to run like a backup machine doesn't have to actually have ZFS on it >> or what did you mention that that was the portal piece like have you tried running that on different systems and what have you tried?
>> Yeah. So, I didn't get into too much detail on this. The way the Zelta bastion works by default is this: wherever Zelta runs, if one of your endpoints is local, it'll obviously run the commands and pipe them appropriately. But if the two endpoints are both somewhere else, the default action is to do a pull backup: the backup target pulls from the source. The reason for that is that the storage servers in a common environment are the ones with the least attack surface. So since your backup server has the least attack surface, that's the default, but you can switch the direction, or you could technically make Zelta pipe through itself for two remote systems if there was some interesting security need to do that. It works really well with SSH agents and the ControlMaster system that's built into SSH, so it can fire all the commands really conveniently and then orchestrate, but sort of be the third party, the arbiter of which systems are primary today and stuff like that. So that's the idea. Zelta does not have to run on either endpoint.
The commands are the commands are just remotely orchestrated.
Yeah.
All right.
Good afternoon everyone. I'm Sharda Kamal. I'm from Middlesex University. To be very clear, I'm not part of central IT. I'm a member of the faculty of science and tech. We look after the setup at the faculty. Today I'd like to share the journey that we had when we were trying to migrate from VMware to FreeBSD. We actually went to bhyve. A bit about myself: I was bitten by BSD a long time back, in the 90s. It was 4.4BSD-Lite. I got a bunch of discs out of nowhere, and it actually killed my hard drive, because back then I was more used to Linux, which had a partition scheme different from what BSD does. The BSD label was something new for me. But I fell in love, and I have been using BSD ever since. And I love art, literature, music, good company, preferably a combination of everything.
And I have been a radio amateur for quite some time. So there are lots of fields of interest that I have to go through.
One thing happened is when I suggest when I offered to my colleagues that okay, I'm going to do a talk, they got in touch with the legal team, legal team got in touch with me because they know me very well and they suggested that I put a disclaimer. So first disclaimer that I have for all of you. If you are here to learn something technical, my sincere apologies. We can talk technical outside. Here is the journey that I took and the journey has nothing to do with technical bits. So if you learn anything from here, that's on you. Don't blame me. Okay.
Right. Timeline, in broad strokes. At some point in time, like May 2022, Broadcom announced that they would like to take over VMware, and at that time we were extensively using VMware for our entire infrastructure. We will get to that one in a bit. Because it might go anywhere, both in terms of cost and in terms of usability and lots of other things, we decided that we would start working on a proof of concept system to see whether or not we could move to BSD, preferably bhyve. On the timeline, around September 2023, we received the go-ahead from the faculty: yep, go ahead with this thing. By November, Broadcom finally announced that yes, we have taken over. So we actually beat them to that bit. And technically that is the whole presentation: there is a problem, we fixed it. All right, claps.
Thank you so very much. But as we all know, you are not here to just look at this bit. What you want to know is what we had been through throughout that entire process as to whether or not we saw some [ __ ] we saw some live parts and things like that. This is what you are here for.
A little bit of background. I don't know whether I can reduce the volume a bit or maybe I should just speak a bit louder. Anyway, thank you. So we had a very small setup for the faculty. We had about 11 ESXi boxes. We had four storage servers, two dedicated N servers, and a handful of switches. And occasionally we would find devices here and there that would pop up because somebody's doing some research on something or somebody's working on something. We'd wonder what the heck is that thing? But those things are always there to help us.
Yep. I'll hold on for a moment. Yay.
Thank you so very much. Fantastic. Well done. See, without the IT people, we would be so dead. Anyway, getting back to the setup. We had various flavors of Windows. We had Windows 10, Windows 11, servers. We had variations of Linux: Debian, Ubuntu, CentOS, you name it, with or without GUI, lots of different things. And we had a lot of FreeBSD boxes as well because, like I said, I have been working with BSD for a very long time, and when it comes to stability, or when I want reliability, that's the only thing that I'd rely on.
We also had many different services, both from the core team, like the IT team offering services for the students and for the academics, and services that were requested by the academics because they're running different courses, different classes. So we had web ID, we had GitLab, we had web servers, proxies, NAS servers, and interesting things that nobody talks about. For example, we had freezer temperature monitors. These freezers are sub-zero, -80, -90 degrees, because we had biological samples and things like that. So all of these things had to be supported by the faculty, that's us, and we had to find a solution that would work for everybody.
From the faculty we had five departments, about 190 academics, around 6.5k students, and of course colleagues and everybody. They all used the systems as best as they could. We were literally inundated with requests and things like that. The core team had five members. We had lots of technical tutors. I'm not sure whether you're familiar with the term: we have academics who are actually technicians as well. So they would help us in different tasks, different activities, and they would often do it without even being asked. We don't have to go and beg them, please do that; they would come forward, they would help us. But eventually the main task had to be done by the two of us. It's not written here, but because we didn't trust anybody else, we decided: no, keep it contained, keep it simple, we will get through the whole thing. And those of you who have not been to Middlesex University, take this as a to some extent correct visual representation. We are literally in the middle of a public park. So the public has right of way through the park. They come, they sit around with us, they talk. Students, cohorts, they all get on very well with each other outside the classroom. In any institute, you get one or two people, you know who they are. You just tell them, "Oh, you know, I saw a crow the other day," and they come up with a whole bunch of eagles flying and destructive things. Unfortunately, we mentioned that VMware is being taken over by Broadcom, and that spread gradually but exponentially to some extent, and this lovely, beautiful situation turned into this.
You can very well imagine that what the heck are we going to do? So we decided no this thing cannot continue. We found some key players in the whole situation.
We got them around. We sat together. We explained to them that nothing to worry about. We are on top of it.
What normally we tend to do is in situations like this we have to create a business case because it has to be approved by the faculty. We tell them these are the possibilities, these are the paths that we can take whether you approve or not approve. These are the budgets. These are the costs. Lots of different things, the bureaucracy that we have to go through. And those of you who are familiar with the process, I know you love these kind of pictures where you see that oh this is the process. You investigate, you develop a proof of concept, you write a business case, you present, it gets rejected or it gets accepted whichever way it goes and you go back to the drawing board. Do you all agree that this is the process that you follow?
No. Why not? Oh, you silly. Anyway, never mind. I look at it like this.
This is the thing that you have to do when you go to the faculty. You create a business case. You give it to them. You roll your dice. You can either climb up a ladder, because you might have a, I don't know, PoC succeeded or extra funding approved, or you might just come down the snake for whatever reason. But normally this is what happens.
So we decided that we were going to have some proof of concept systems. We had to try various different things just to be on the safe side. We tested several things, but these are the things that we considered. We needed to select things that have, to some extent, a reasonably stable history and some kind of long-term commitment. The learning curve should be good because we have a very limited number of people and limited resources. So these are the things that we had to go through. But one of the primary concerns we had was: whatever we do, will that work on the hardware that we have? Because we had very age-old hardware and we did not want to put it through stress.
And as we suggested that yes, this is what we are going to do, of course our lovely colleagues came up with different ideas and different suggestions. Somebody told us that we should be using Linux with cheml. We had people who said VirtualBox is the best thing there. We had people who said VMware is the best way to go, or maybe do something else.
Windows Server, we had that suggestion as well. Now you might think that this is getting out of hand. It is still not.
The crux of the thing that came to us was this.
when somebody said well I know somebody who knows somebody else and that somebody else knows somebody else as well and then this is the solution that whoever that person is has suggested and we are going to go to the board with that and you start thinking oh >> what have I done but anyway we went through all of that and we decided that we are going to focus on a certain number of possibilities even though it's not really some people say it's senseless it's and we to some extent agreed on doing certain things in a finite manner. So we decided no appliances because we had challenges with appliances in the past.
Uh many of you know about the NAS boxes that changes names every now and then which used to be free now it's commercial and whatnot. So we decided no appliance at the moment.
The challenges that we had: we did not have any spare hardware, because it's a faculty, we are limited, and we do not have extra funding. So whatever box we had is whatever box we had. So we juggled things around. We moved all the virtual machines from one box to the rest of the boxes, distributed them, freed one up, and used that as a test box. But we needed to replicate the actual virtual machines, because we had virtual machines which relied on the UUID or the Ethernet controller, or different services where the licenses were stuck with certain parts of the operating system. So unless we could do one-to-one replication, we could not really confirm whether that was going to work on the new system or not, and that was one of the several challenges.
We also had challenges where the disk images we could not convert them. We tried various different ways but going from a VMDK into a beehive backended either an img or a ZFS which we wanted that proved to be quite a bit of challenge and we also had the problem that we had MBR machines but we wanted everything to be on EUI. We had to sort all of these through bit by bit and we wanted well some of us wanted reproducible automation whereby we could just click a button everything gets replicated from host to the target the host gets shut down the target gets running we went to some extent not entirely now things that we looked at we looked at Linux with chem it had it has long-term commitment which is good uh learning curve was moderate it and cost was free. So everything was okay. The problem came when it came to Linux. Now I know a lot of you love Linux. I love Linux as well. But for me I still cannot comprehend why something that is running so beautifully on I don't know uh maybe Debian does not really work that well on CentOS or Red Hat. They're all Linux. They are all supposed to be the same. it's supposed to be the same code libraries but obviously they're not. So we looked at it we put it into the side as a consideration.
We also tried smart OS and smart was fun. The challenge that we had is we could not really find any documentation or any solution through which we could pass through our devices. We had a few boxes with GPUs high-end GPUs that were used for various different research. we could not pass them through. So PCI pass through was a challenge. We had beehive it did most of the things well. We sadly did not try the GPU pass through but most of the other pass through that we tried it worked. So it was fun and of course because we are using freebl to some extent to various of our core systems. So I was happy with that. We looked at next bizen but I suspect the person that we requested to look at it might not have had enough time to go through the whole thing. So we did not really get a complete picture and we to some extent ran out of time to test it further. So we left it as it is. But the final push was that FreeBSD had a lot of documentation. In my personal experience, it has the best possible documentation that you can think of all over the world, all over the internet and everywhere else. And because I had been pursuing other people to use FreeBSD for many, many years. So, of course, I had a good support base that okay, if something goes wrong, at least I have people I can rely on to look after things. And our current knowledge and expertise was good enough at that point. So we went back to the system whereby we would propose a solution and then we talk to the faculty whether or not we can actually deal with it and this is what happens you go there you sit on the other side of the table now in all fairness these people at least in our faculty I don't know about where you wherever you are these are some of the loveliest people that you can meet outside this room.
You have any problem, you go to them, they will try their best to help you.
And this is a fact. This is not buttering people up. This is the reality. These people got to this position not because, well, I don't know how they got there. But many of them I have come across are very supportive. They will tell you exactly where to go, what to do, who to talk to to find a solution. But when you are in this room, this is what they are, because they have a responsibility to make sure that you did not come up with a half-baked solution to spend a whole bunch of money and then get away with it. These are the guardians of the faculty. So they have to be very protective as to what they do and how they do it, and make sure that you are not... who do you think sent me to the legal team? Anyway, they also heard about the things that other people suggested, like why don't we use Windows, why don't you do that? And going there is not just about here's the solution, let's make it work. It's about why the other solutions are not working, or why you did not consider the other bits. And we have to go through all of that not just once but a couple of times, because sometimes they go back, they think about things, they come back to you. And when they ask you all of these questions, you use the magic word: it's going to cost you a lot of money. That helps sometimes, not always, because then you have to prove why it's going to cost more money than the other bits. It's a never-ending game. And as you are playing that game, at some point in time the unexpected happens: you get approved.
Happy days. And the following day the unforeseen happens as well. The PoC system that we were using, which was very old hardware, died.
And when your PoC system dies, what do you do?
You have to go back to the game, because now you need funding for a PoC system, which is totally separate from the other thing, but you have to go through the whole thing. And somewhere in about 5 months' time we did get the approval that okay, now you can buy the PoC system. And the unforeseen continued. We got a brand new shiny box, and the storage controller did not work with FreeBSD. Well, it did work, but it did not work the way we wanted. The new controller that came in had an enforced RAID system.
I wanted ZFS. So what I had to do technically was create a RAID 0 for all the disks and put ZFS on it, and then with every boot the ZFS would break, because Dell's card is doing something in the back end and my FreeBSD is trying to do something else on the other end. Thankfully, the vendor was very supportive when we explained to them what the problem was.
They sent us a replacement very quickly and we were happy. But with the GPUs that we had, we could not make the GPU pass-through work.
It was not a happy day.
Thankfully, we did not have too many. So we decided we will follow the golden rule, and the golden rule is: when it comes to a quick fix, you have only three friends. Who can tell me who the three friends are? Guess, any wild guess. Yourself? Yourself? >> Oh no, no, no, it's not that much thingy, but yeah: you have the ingenuity, you have the duct tape, and you have velcro. And I kid you not, this is a subset of the different kinds of velcro that we have. This is from our own collection.
So we decided to do the right thing.
We got things working, and because we managed to do things in good time, we were praised by the faculty.
We did not get a raise, sadly.
We did not get any bonus either. That's another sad bit. We requested a few more pieces of equipment; that got denied.
And we did not get much fire water from the faculty, which was the saddest part. But at least we did get recognition that yes, we helped the faculty save a whole bunch of money. To give you some idea of the proportion of what was saved: I do not know who made the deal with VMware when the faculty originally got VMware, but for those 11 boxes we were paying silly money.
Literally almost nothing, because we had a few courses which had VMware as a component of the course.
When we later requested a quote from Broadcom, it took a long time: we pursued them for almost two years to get a quote, like, how much are you going to charge us? Eventually, for those 11 boxes, they wanted to charge us somewhere in the range of £80,000 (I don't know how much that is in Canadian dollars; I guess it's almost double) plus VAT, which is another 20%. So it's almost £100,000 that we are talking about, and the faculty was not going to pay £100,000 per year for this particular task. Yes, that was an annual subscription that they offered us. We tried to say that, well, we are an educational institute, we teach part of your things. Nah, doesn't matter.
Anyway, things ended quite happily for us: we did manage to save the faculty a whole bunch of money and got the recognition, and then came the question of what next.
So we are still in the "what next" phase. We do intend to look at this entire system slightly differently, primarily because managing infrastructure in-house is required, and we are going to do that for much of it, but there is a lot that we can actually push out to the cloud. That is what we think. We also think that, even though we have very limited resources, we are not really making 100% use of them. So we would very much like to look into how we can further improve that bit.
Thoughts, ideas, suggestions are always welcome.
And the last bit that we intend to do, from the faculty at least, is to increase our scope, because right now we are going through, to some extent, a bit of a change in paradigm. Lots of the other faculties are also looking into the possibility of using our resources, and as you all know, one of the more practical buzzwords nowadays is AI. So we are trying to figure out how we can incorporate that bit into our area, not necessarily for automation. No, we don't trust AI that much, at least not yet.
And that brings me to the end of my presentation. A few credits. The images, of course: I don't draw, trust me. They were edited with Photoshop.
Screenshots were taken and then edited.
Music is from Harip Prash Rashia Buddha Mukharji Lunagetika Verdi George James Bach, edited with Audacity. My colleagues and friends helped me to put everything together and keep it contained. And special thanks go to the audience. I honestly did not expect so many people. I was hoping that most of you would be somewhere else.
Anyway, the organizers, who decided to accept my talk: hopefully they're not regretting it, or if you are regretting it, keep it to yourself; don't tell me. The faculty, because they allowed me to do things that I love doing. And of course my friends, who constantly remind me that as long as they're there, I don't really need enemies. They're good enough.
That's the end of it. Thank you.
Yes.
>> Did you look at FreeBSD Xen?
>> I did not realize that FreeBSD has Xen. I thought that Xen was a NetBSD-based component and you could not get out of it. But thank you for suggesting it; I will definitely put it on my to-do list to have a look at for the next iteration. Thank you.
>> Please repeat the question for the stream.
>> Ah, the question was whether or not I have considered FreeBSD Xen, and my response is that no, unfortunately I did not realize that it was there. Yes?
>> It's not a question, it's a comment: your office is not that tidy.
>> Don't share the secrets with everybody.
See, this is what I say: with friends like them, who needs enemies? But thank you.
>> Any further questions? Yes.
>> So, with Xen there are two completely different APIs and management planes for it.
There's the xl one, and the one which was originally from XenServer, which was then open sourced, and then you've got XCP-ng, which is the fully open source implementation of that, with a full management suite, VMware migration tools, and commercial support if you need it.
>> I honestly have not. Whatever I have listed are the only things that I have tested or thought about trying. (a) Maybe I did not know about that. (b) Even if I did, remember there were just two of us dealing with the whole thing, and we had a very limited time frame, because we knew that whatever solution we came up with, we had to take it to the board to get it approved, and that approval had to come before the takeover of VMware by Broadcom. To some extent we were running against the clock, and we did not even know when the clock was going to stop.
As I mentioned, I had been pursuing VMware, Broadcom, and Dell, all three companies. I can show you the email threads. Every month I was sending them a reminder: can I please get a quote, an idea? Nobody was coming back to me. And I did not want to be in a situation where Broadcom says, "Oh, from tomorrow you have to pay for the license," while I do not have a solution and I have to keep things going for all the academics and researchers, whose work is of course very important. So I tried to limit my research to things that I knew, or things that I knew were likely to work. But it's a good suggestion, and if you don't mind sending me an email with your thoughts, I'm happy to look into it for the next iteration.
Yes?
>> When you were still on VMware, was live migration part of your workflow?
>> Please repeat.
>> When you were still using VMware, was live migration something that you were using in that space?
>> Right. So the question is: when I was on VMware, was live migration part of my workflow? We could not do live migration because of the disks changing shape and form. So instead of doing a live migration we had to go through an offline period. For a system, depending on the disk size, it could be anything between half an hour and 3 hours, and we had to negotiate that with the users. Thankfully most of our users, except for the academics, were okay, and for the academics we actually had to shift it to the summer period when we don't have students. So we negotiated and we did manage, but no, we could not do live migration.
>> Thank you.
>> Yeah, no problem. Yes?
>> So the PCIe passthrough was difficult. Did you have use cases other than GPU passthrough?
>> We did try Ethernet passthrough, which seemed to work quite happily. We did USB passthrough; that also seemed to work. That gave us the confidence: oh, it's going to work. We honestly did not anticipate that GPU passthrough was not going to work. Sorry. Yes?
>> Yes.
>> How are you managing bhyve?
The question is how am I managing bhyve? The answer is manually.
I'm still waiting for the project to come to a situation. I understand that bhyve now has an infrastructure model whereby I can have a GUI or a web-based component and do stuff. I have not gone into that bit to look further. I have automation, so my automation takes care of the bhyve platform for all of those things, and to some extent I'm okay with that for the time being.
Any further questions? Yes.
>> What's your networking setup like? Do you pass NICs through to the VMs? Do you tap them, or...
>> No. So, well, I don't do the tap thing.
I do bridge. All of my boxes have multiple NICs, because we have four different networks that we tap into.
>> So we have public IP, private IP, management, and internal. Each of those individual NICs goes into a bridge.
The physical NICs go into a bridge, and the virtual machines have virtual NICs, tap NICs, that go into the bridge.
>> Do you not have any performance issues with that?
>> Thankfully, no, not yet.
>> Because the performance of tap into bhyve is kind of sad.
>> It is very sad. But then again, we have very limited use.
>> Yeah.
>> And in most of the cases that we had, even with 100 meg or even gig transfers, we did not really face much of a challenge.
Not yet.
>> Yes. Well, it it has not come up as a challenge yet.
Any more questions?
Yes.
>> Which storage technology did your old VMware system use?
>> So, the question is: on the old VMware setup, what was my storage technology? We just had local storage for most of the systems. We did not use vSAN. Well, we used some of the NAS boxes as NFS shares, but beyond that there was not much challenge. It was the disk image transfer that was the challenge, because VMware does VMDK, and when you're trying to convert a VMDK into something that bhyve can understand, there is no clear path and there are always additional challenges.
>> Yes.
>> Can you tell us a little bit more about your approach to automation?
At the moment we are using Ansible, and Ansible is taking care of a lot of different things.
We are looking into moving to Salt, and I don't know whether you want any more details or whether there is anything else you are trying to understand. No?
>> Okay.
>> Yeah. Yeah.
>> No, well, we had been using Ansible for quite some time and we were happy with it. So all we had to do was make sure that whatever playbook or role we were creating was agnostic to whatever host we were going to, and that happened quite happily. That was not a challenge, so to speak.
>> Yeah. Thank you.
Any more questions? No. In that case I'd like to thank you all. Anybody?
>> Oh sorry. Yes.
>> You must have converted the disks into a bhyve-readable format. How did you do that?
>> I did not. No, that's what I said: that was the challenge. I could not do that.
>> Okay.
>> So we had to take a different approach.
You want to know the approach? I was trying to get rid of the technical bits, but anyway. What we did is we actually took a snapshot of the system, took it to a different machine, and then, in a private network, ran a virtual machine using VMware Player, but we booted that disk into Clonezilla, and our target machine was also booted with Clonezilla. So we could just clone the disk from one end to the other.
It sounds simple, but imagine that you have an MBR system that you're trying to move to UEFI.
That's not going to happen, because UEFI will not boot your MBR system. So we had to go through multiple iterations on the original machine to make sure that the disk was ready to be booted with UEFI before we could actually start sending them across. So that is what we had to do.
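As a rough illustration of the MBR-versus-UEFI problem described here, the following Python sketch checks whether a raw disk image carries a GPT header or only a classic MBR boot signature. It is not the speaker's tooling: the function names are hypothetical, and it assumes 512-byte logical sectors.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def partition_scheme(head: bytes) -> str:
    """Classify the first two sectors of a disk image as 'gpt', 'mbr', or 'unknown'."""
    if len(head) < 2 * SECTOR_SIZE:
        raise ValueError("need at least the first two sectors of the image")
    # A GPT header starts at LBA 1 with the ASCII signature "EFI PART".
    if head[SECTOR_SIZE:SECTOR_SIZE + 8] == b"EFI PART":
        return "gpt"
    # A classic MBR ends sector 0 with the boot signature bytes 0x55 0xAA.
    if head[510:512] == b"\x55\xaa":
        return "mbr"
    return "unknown"


def scheme_of_image(path: str) -> str:
    """Read the start of a raw disk image file (hypothetical path) and classify it."""
    with open(path, "rb") as image:
        return partition_scheme(image.read(2 * SECTOR_SIZE))
```

An image reported as `mbr` would be a candidate for conversion before cloning to a UEFI target. The check says nothing about whether an EFI system partition or a boot loader is present.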
>> So there was no easy way to convert it from VMDK into...
>> I could not find one.
>> Okay.
>> I'm not saying there is none, because honestly I tried everything that my imagination could take me through. I failed. So I took the approach that I could manage and get on with. If you have a better solution or a better option, I'm happy for you to share.
>> I'm also searching for this.
>> Yes. my company.
>> Uh we offer consultation service by the way.
>> No, it's okay. Yes.
>> You're asking how to convert for that.
>> That was part of the thing that we were discussing, because I could not find an easy path to go from VMDK to a ZFS-backed bhyve disk image.
QEMU has a conversion tool. I accept that. Does it work?
Unless you have tried it... I have tried it. It didn't work for me. It might work for you, and all the best if it does. I tried various different formats; I even went through intermediary steps from one format to another to another to bhyve. None of it worked. I spent quite a good amount of time on that. But this was about a year ago, so if something has improved within this year, I'm happy.
>> Yes.
>> There's a Unix guy that works for Windows
>> that built a thing he calls Sysinternals.
>> Ah, yes.
>> And he has one called Disk2vhd.
I've used that before to take a physical Windows machine, turn it into a disk image, and then I can run that.
>> I did try the VHD method. For some reason it did not work. I would suspect it was because I had MBR-based systems that I was trying to move to a UEFI system; that could have been one of the reasons. But the only practical solution that I could find was to convert everything before migrating, and then migrate using Clonezilla. That worked for me.
>> Okay. And if you are going to switch to Salt (I use Salt and Ansible): Salt has a salt master, and if you're upgrading, you're going to need to upgrade the salt master first.
>> We have not gone to Salt yet. We are looking into it. But thank you for the heads-up. I'm definitely going to look into it.
>> Yeah, because otherwise your minions will get all... yeah, something will go wrong and that'll be the reason.
>> Okay, thank you for the heads-up.
>> For questions like these, there is a bhyve production user call every Thursday. All are welcome.
If you gave me that link, I could have put it up there, but there's still time.
Thank you. Any more? Yes. Any more questions?
Nope. In that case, thank you everybody.
Thanks for being here.
Check, check, check. Test, test, test.
One, two, one, two. From here, farther away.
>> Need to bring it closer to my mouth.
>> Good. Okay, great. Thanks, All right. So, you got your microphone there.
>> Yes. Perfect. Thank you.
>> Yeah.
>> Thanks.
>> Oh, they did a retro.
Do you have any You're leaving?
>> Oh, you're not here tomorrow.
>> You're not here tomorrow?
>> Okay.
>> We'll see.
You're leaving Ottawa.
>> Okay.
>> I guess you're going to miss half of the talks then if you leave today because there's talks tomorrow as well.
>> Uh a whole bunch of talks. Uh I don't have a schedule but it's online but there there's a full day of talks tomorrow.
>> Yeah. Yeah. So, if you have the option to stay and >> watch, >> you can watch them online. Yeah.
>> I guess you can talk to the tech people about watching tomorrow's talks online.
I don't know if you've done it.
>> Mhm.
starting soon.
7 seconds.
>> 70. Okay, >> perfect.
tomorrow morning.
Okay, great.
>> Not exactly ad but within you know more or less.
All right. Good afternoon, everyone. So, we have a bit of story time. As my slides indicate: how hard could it be, modernizing XigmaNAS for OpenZFS 2.4.0.
And some of you might be wondering: what is XigmaNAS?
Well, it's an open source NAS operating system based on FreeBSD,
which means it descended from BSD, which began as UC Berkeley extensions to AT&T Unix. Actually, I probably don't need to go through most of this with the people in the room here. But eventually there was FreeBSD, and that was the base for m0n0wall. But what about FreeNAS? A lot of you have probably heard about FreeNAS. Can I get a show of hands?
>> You heard about FreeNAS? Yeah, okay, great, awesome. But then, wasn't that the base for what became TrueNAS? And then we'll start where I came into the picture.
So our story begins with a question that we have all asked ourselves at some point, a question that has launched more doomed projects and more late nights than any other. And the question is: how hard could it be? In my defense, I wasn't trying to do anything crazy. I wasn't doing any kind of kernel development or making a new distribution. I just wanted to upgrade my NAS. XigmaNAS generally tracks FreeBSD versions, so 13.x is 13.x for FreeBSD, and this was back in the summer of 2024.
I was running XigmaNAS 13.2, and there wasn't a 14.x release yet, even though FreeBSD 14.0 had already been out for 7 months and 14.1 had just dropped. And I thought: okay, it's open source, it's PHP, I know some PHP, why don't I just build the release myself? How hard can it be? So: download the source, run the build script, call it a day. And that took more than a day. What I didn't appreciate is that XigmaNAS is not just a project.
It's an archaeological site. It's a living museum of assumptions, build scripts, makefiles, port options, and historical decisions whose original authors have long since ascended to a higher plane of existence.
So, here's some history of FreeNAS, which is where XigmaNAS kind of carries on from. It began, as you can see, with m0n0wall back in 2003, which started for embedded firewalls, very low specs, based on FreeBSD 4.7, and later moved to FreeBSD 6. And around this time, which is kind of interesting, let's see where FreeNAS started: 2005. Do I have a date there? FreeNAS was based on FreeBSD 6, actually because the RCs for 6.0 were quite stable. Sorry. That's right. So embarrassing.
FreeNAS was based off FreeBSD 6, and around 0.7 there was a fork: one of the developers decided they wanted to do their own NAS based on Linux, and they ended up producing OpenMediaVault. And back with FreeNAS, in 8.x they decided they wanted to rewrite it in Django.
Further down the line, after iXsystems took charge, FreeNAS became TrueNAS. There are different versions. We had the FreeNAS 11 series. There was TrueNAS CORE, which is based on the original FreeBSD, but they also had TrueNAS SCALE, which is based on Linux.
And after TrueNAS CORE kind of (I don't know an appropriate word) maybe just fizzled, the community came up with a fork of TrueNAS CORE called zVault. I don't know too much about that. But this story takes place back in the original FreeNAS 0.7 code, which was originally written in PHP. It was maintained by some of the original developers back in the day, and they renamed it, because iXsystems owned the branding for FreeNAS, and they called it NAS4Free. That was back in 2012. Eventually they renamed that too and protected it with a trademark: XigmaNAS, which has been a trademark since 2018. And I came in in the summer of 2024 and started doing some work.
So I thought, okay, how do we get started? Luckily, there's a guide: a quick start guide for developers (not really me) but also interested users. Yeah, okay, that's me.
Um, and the process of this documentation was basically uh you set up uh they recommended a virtual machine to do it. Uh install FreeBSD and then you just build Ziggman, right? Just three steps and you're done. Okay, so open up the the documentation.
Um on the quick start guide, it says, "Oh, okay. Well, this will use the current Ziggs version based on FreeBSD 11.2.
Uh, and this is like based in like which was released in 2018. It's like okay well it's a little bit old but um you know maybe it's just a program that runs on top of FreeBSD. We'll see. We'll see how it goes. Uh and it guides you through creating a virtual machine. They recommended um 1 gig RAM, 35 gigabytes of storage, and since I was running um uh Ziggman NASA at home, uh that includes um Virtual Box in a front end on in PHP using PHP virtual box, I'll self-host in my basement. I'll do the Ziggman NAS development in my basement using what's already there.
All right, so create a VM and uh from there, the next step is to install FreeBSD. Amazing. great guides uh for this project. Uh since 14.1 out was out uh I decided okay I'll just install 14.1 um and then after you install it you update it. All right, going through the process. Um, recommend you do FreeBSD update fetch install. And, um, probably nobody here has a problem with like recognizing the prompt from your pager. But if you are new to Unix, right? You're just like, okay, update go. A lot of stuff happens and you're like, oh, it's it's it's still processing, right? Right. And it's like, yeah, because it's user user friendly. It tells you colon, right?
It's like, okay, I'm I'm done. Right.
make no you got to like quit that before uh it's telling you what's what's going to be upgraded. Uh after you update the system, you need to install some packages that it uses. Uh it actually um uh recommended ports and um so part of the build system it uses bash port upgrade uh CR tools for building the ISO um and for version control it uses sub version right. So then if you're new to this whole world you're like well like what is what is the port? Uh good thing is uh Freeb has an excellent uh handbook and it says a freebie port is a collection of files designed to automate the process of compiling an application from source code. The files that comprise the port contain all the necessary information to automatically download, extract, patch, compile, and install the application.
Okay, so this is great. How do we get it? Looking at the document: all right, portsnap. Perfect. Let's run portsnap. And we get this: command not found. We're like, oh, okay, no problem, I'll install it.
Install portsnap. Oh, it's there. It's deprecated. Okay, great. And we run the extract, and it's like: okay, the prerequisite directory is not there. So, just some little speed bumps along the way of trying to even get our host environment, our VM, started. And proceeding to alternate ways to get the ports tree: as of 2021, Git is the primary way to get the ports collection, as most of you probably already know. In addition to that, we also need the source code for FreeBSD. So how do we get that? Going back to the quick start guide: cvsup. Long story short, that kind of migrated to csup, there was a time for Subversion, but now the recommendation is Git.
Finally we get all our source code for FreeBSD. We can compile it. Now let's move on to XigmaNAS, the actual source. We install Subversion, we check out the repository, and we're going to build this thing. How do we build it?
Most programs, you think, it's like configure, make. And it's like, okay, let's run make.sh. It's a big bash script. So let's run it. What happens? It's going to compile everything, build everything, and give us our image, right? We run it and we get this: kind of a choose-your-own-adventure in the menu system. Here are a bunch of steps that you need to take, and you're going to produce your ISO file, your image file. And as you iterate through the numbers, a lot of it is spent in the compile menu, number two. There are different options for the different types of images. So, I'll talk more into the mic.
Types of images you want to produce: number 10, the embedded; you've got live USB, CD, etc. But you spend a lot of time in number two, under the compile menu. When you're compiling, there are a lot of different things you can compile. You can update the FreeBSD source tree and ports, which we already did, but if you wanted to do it this way, it's a menu-driven way, a little bit easier for things you might do regularly. You want to build the file system structure, and most of this is just hitting the numbers, going through the menus, and it does everything pretty much automatically. Come to number six, the ports: a lot of the time spent building any kind of distribution is building the software. There's a huge amount of software. So you pop into the ports menu.
And for the most part, you just go through all the items, right?
Some of you may have seen this if you're familiar with XKCD.
So while building the ports, all the ports required, I came across this.
And if you're a PHP developer, you're like, oh, that's clearly not PHP, right?
You're like, "What? What is going on?"
Right? And, like, I just want to compile this. There's a problem with a port:
fuppes. I don't know how to pronounce that.
And possibly there's an undefined macro. Maybe, maybe not. So, going down a bunch of rabbit holes, I figured out that the port assumed you have a bunch of things installed, and it was missing libtool, automake, and gmake.
Finally got that resolved, and after fixing that port we went through a lot more compilation.
Some things went really well; some things took a little bit of work. Side note (there are going to be a lot of side quests here): not all installations will include some version of dialog, which is kind of required. And then while compiling things, a lot of it was not speedy at all.
Right. So we encountered our first kind of dragon.
All right. Does anybody actually know... I guess it's two things; it's kind of a mashup of two images. Does anybody know either one of these images? It's from...
>> Final Fantasy 1.
>> Yeah. Awesome. And the kind of creature there, does anybody know that?
>> Is that from the Razer?
>> Razer. No, good guess.
It's actually...
>> The LLVM logo.
>> Correct. Yes. LLVM. Correct.
Yeah. So, as I'm compiling, this is on a VM, single core, with how much RAM did it recommend?
One gig of RAM, right? One gig of RAM on virtualized I/O. So this was not speedy. And then I go to my terminal to see what's going on. It's building LLVM. I'm like, why do I need LLVM for my NAS? And it's like, oh, it's because it's needed by qt6-tools, which is needed by VirtualBox, which is required for phpVirtualBox. I was like, I guess, I mean, okay, sure. So, building through the ports: if you're not really aware of what you're doing with ports, you can make some mistakes and some missteps. And because I did mention dragons: there we go, dragons. You come across the xterm source. Because you are dealing with X, if you take a misstep you can compile, was it xorg-apps, which includes xterm, which has this little snippet of source.
And that's not the only dragon encounter. This one is more playful. In the FreeBSD source there's an old console screen saver. This is unfortunately not included in XigmaNAS, but that's an alternate dragon that I did encounter. All right. So, going back to building: after all the compilation, there were other steps along the way to build the image, and finally, going through all these steps,
several days later, we produce our working 14.1 image. Yay. Perfect. I'm up to date on the latest version. I thought, okay, that was a lot of work. What do I do when the next version of FreeBSD comes out, like 15? Am I going to go through all these steps again? Is there some way I can make it automated, or faster, or do things in parallel? And it's like, I know some shell scripting; how hard can it be, right? Do I just keep this for myself? Maybe someone can benefit from this. I can upstream it, but are they going to take some patch from some rando on the internet?
So who am I? Well, by day I'm now a yoga instructor, but I used to do some PHP programming on FreeBSD, and I've been using FreeBSD on and off since the mid-to-late '90s. And I thought, okay, if I'm going to work on this project, I want to make sure I'm using it.
I can't break things, right? How do I go about doing this? So there's a bit of surgery going on here.
I want to break things apart and see where I can change things, but I don't want to actually break things. So I'm trying to build automation and speed, but there's also the social aspect. I want to make sure that the changes I'm making are safe and easily reviewable, in small chunks, because this is a project and I want to be respectful and follow their guidelines. And this came in the form of separating the main build script, make.sh, out into functions.inc,
so that make.sh basically gives me a shell that calls all the functions. Then if I want to add new functions I can add them there, and make a brand new shell script that's going to say, you know, create everything.sh.
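The shape of that refactor can be sketched as follows. This is an illustration in Python rather than the project's shell code, and every step name and menu number in it is hypothetical: the point is only that the build steps live in one shared place, the interactive menu becomes a thin dispatcher, and a second non-interactive entry point can call the same steps in order.

```python
from typing import Callable, Dict, List

# Shared step library (the role functions.inc plays in the talk).
# Each step just records its name here; a real step would do build work.
def update_sources(log: List[str]) -> None:
    log.append("update_sources")


def build_ports(log: List[str]) -> None:
    log.append("build_ports")


def build_image(log: List[str]) -> None:
    log.append("build_image")


# Hypothetical menu numbering; dicts preserve insertion order.
STEPS: Dict[str, Callable[[List[str]], None]] = {
    "1": update_sources,
    "2": build_ports,
    "3": build_image,
}


def menu_choice(choice: str, log: List[str]) -> bool:
    """Interactive entry point: run the single step the user picked."""
    step = STEPS.get(choice)
    if step is None:
        return False
    step(log)
    return True


def create_everything() -> List[str]:
    """Non-interactive entry point: run every step in menu order."""
    log: List[str] = []
    for step in STEPS.values():
        step(log)
    return log
```

Because both entry points call the same functions, a change to one step is reviewed once and the menu keeps working as before.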
And this is kind of where a big chunk of this talk diverges.
So, is anyone familiar with yak shaving?
>> Heard of this? Yeah. So, it's kind of like procrastination, or these annoying things that pop up along the way that you kind of have to do, but maybe you don't have to do: little side quests. So, originally I just SSHed in and I was using vi, and I thought, okay, scrolling through this giant file of shell code (actually I don't have the file size here), and I'm not proficient at vi, it was a lot. So I'm like, okay, I want something easier. It'd be nice if I had VS Code, and then maybe I could just do something like SFTP and move the files back and forth. But VS Code has this nice thing called VS Code Server.
Has anyone used VS Code Server? Yeah.
Okay. So, if you haven't: there's a server on the back end that VS Code installs, and it allows you to open what feels like a local terminal. You don't have to have a separate FTP or SFTP client, like Transmit on Mac, or Cyberduck, something like that. It's as though you're locally editing the files. When you click on open, you can view the files remotely.
So you don't have two different programs, your FTP or SCP client and your editor. It's all built into one. But the problem is that it expects Linux, which this is not. So what to do there? Thankfully, FreeBSD has a Linux compatibility layer. So while underneath it's FreeBSD, on top it's looking like Linux. And the Linux mascot is Tux, so I thought it'd be kind of cute to have Beastie wearing a tuxedo. AI is kind of interesting these days.
All right, so then, looking online: okay, what can I do? You can make a Linux jail, and it's not quite what you think. Maybe you thought, oh yeah, I want a VM of Linux, but it's a jail. I can just have the full thing there. I had kind of dabbled in different OSes, and on some spare computers in the basement I'd been playing with Ubuntu, so I got more familiar with debs versus RPMs. And I thought, okay, I'll create a jail, and there's the debootstrap tool that will bootstrap a directory with the files from the distribution. Then I thought, I'll do a version of Ubuntu. I don't have the version that was supported, but it's old, and the glibc provided was not what VS Code Server required. It required 2.28 at the time; I think it's newer now. And I thought, okay, while I'm on this path of figuring out how to get my distro, I want something without systemd.
So, searching for what doesn't have systemd and supports deb files: it's Devuan, but debootstrap doesn't support that. So I'm like, okay, how am I going to get my dev environment set up? Long story short, I think I took just the Docker OCI image and splatted that into the file system. So finally I set it up and, yay, we got it working. We can finally start to look at the source code. We can scroll through it, and it has nice code folding. You can see here at the bottom... oh, surprise, I'm actually running 15 already on here. But it's doing the Linux compatibility here.
So this was a VM, and I thought, okay, it's slow. I'm on a single VM. How can I make things faster? Instead of having just the one CPU with one gig of RAM and virtualized I/O, and instead of building ports with no -j flag, I want to throw all the j's at it. Let's build this in a jail; then we can use all the cores, all the RAM. I'm doing this part-time, so I don't want to dedicate all the RAM to the VM and then later on want to do something else and have no RAM because it's all in the VM.
So I ended up creating a jail. At the time I was using qjail. And let's just go through the process to build the image. The way it builds it is, it creates a memory device, it creates a file system, then it mounts it and copies the files over, right? But running in a jail, some of you may or may not know, you get this when trying to do it. It's like, oh, it's permissions. Okay, so maybe I'll just add allow.mount.ufs, which doesn't exist. And which seems kind of strange for Unix, right?
It's like, okay, I'm root and I'm not permitted. Why can't I just do it?
I have all the permissions. Just do it. And as it turns out, it's not really supported. And I thought, okay, well, you know, FreeBSD is open source. Can I just change this?
Like, just allow it. It's my own machine. I just want to build it. I was going down the forums and they're saying it's kind of a security issue. It's like, yeah, we're not going to do this. But I'm like, well, it's open source. Why don't we just do it? Long story short, I found out someone had posted how to do it. Not recommended. I thought, I don't want to maintain a separate FreeBSD patch just for this so I can build my system. So there's got to be a different way. And I thought, okay, well, it works on the base system. Why don't I just chroot in? The minor thing I thought about that was, if I have a different file system, when I SSH in, when I use my graphical manager, I don't want to have to go through this hierarchy of directories to get there. I want a jail for organization, but I want a chroot to actually run the build. So I ended up just chrooting into the directory where my jail lives. So I can SSH in to do the editing of the files, and then I'd chroot through the host system and just run the build from there. So I'm like, okay, this is a bit better. And then, alternate timeline, further down, I can't remember which podcast I was listening to, probably BSD Now, and I heard through updates on FreeBSD 15 that they had reproducible builds and you don't need root. I'm like, "Oh, perfect. Whatever they did, I want some of that. How can I build this in a jail?" Because I want to build in a jail. And it wasn't new, but definitely new to me. They used makefs and mkimg, and I'm like, okay, that's perfect. Now I can build things in a jail.
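The talk names makefs and mkimg as what made a rootless build inside a jail possible, but does not show the commands. As a rough sketch only, the old "memory device, newfs, mount, copy" sequence could be replaced by something like the following. The function name, the staging directory, and the output file names are hypothetical, and the function only prints the commands so they can be reviewed before being run on a FreeBSD host.

```shell
#!/bin/sh
# Sketch only: build a UFS image straight from a staged directory,
# with no mdconfig/mount step, so it can run in a jail without root.
# image_build_cmds, the stage directory and the output names are
# hypothetical; they are not taken from the talk.

image_build_cmds() {
    stage_dir="$1"   # directory tree to pack into the image
    out_prefix="$2"  # basename for the generated files

    # makefs(8) writes the file system image directly from the directory.
    echo "makefs -t ffs ${out_prefix}.ufs ${stage_dir}"
    # mkimg(1) wraps that file system in a GPT-partitioned disk image.
    echo "mkimg -s gpt -p freebsd-ufs:=${out_prefix}.ufs -o ${out_prefix}.img"
}
```

Calling `image_build_cmds ./stage root` prints the two commands in order. In a real rootless build, file ownership and modes would usually come from an mtree manifest handed to makefs rather than from the staged files themselves; that part is left out here.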
Everything's happy. But the problem is I'm still single-threaded, right? It's very manual, not automated. What can I do to speed things up? Because I was doing a lot of builds, I wanted to make sure that my changes didn't affect anything negatively. So for every change I did, I wanted a brand new system. I'm going to do a reinstall, and at the time my internet was extremely terrible, and I'm downloading all these packages.
So what's one thing I can do to make package installation a bit better? Okay, well, everything that's downloaded is cached.
Let's see if I can have this, and I'll just null mount my /var/cache/pkg, just so I don't have to download everything from scratch all the time.
And that's going to make things a bit faster.
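A minimal sketch of that null mount, assuming the build environment lives under a hypothetical jail root such as /usr/jails/build. The function name and the path are illustrative, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: share the host's pkg download cache with a build jail
# through nullfs, so packages are fetched once and reused.
# pkg_cache_mount_cmd and the jail root path are hypothetical.

pkg_cache_mount_cmd() {
    jail_root="$1"   # root directory of the build jail or chroot
    # /var/cache/pkg is where pkg(8) keeps downloaded packages by default.
    echo "mount -t nullfs /var/cache/pkg ${jail_root}/var/cache/pkg"
}
```

For example, `pkg_cache_mount_cmd /usr/jails/build` prints the mount command for that jail. The same mapping could also be written as an fstab entry for the jail so it is applied every time the jail starts.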
The initial setup, right? portsnap was, I think, gone, and it recommended git clone, right? Git cloning all the ports, it's like, I don't need the entire history, and again, my internet is quite slow. So I thought, okay, but I found that if you just do depth one, it's much faster. But even faster, just for getting the ports tree onto your machine, was a simple fetch, and that was faster than any other method. All right. Okay. So that improved it. What else could we do? We had to build the world and the kernel. Instead of that, even though we did freebsd-update, why can't we just use the freebsd-update process in the future and then build on that? Why can't we just use pkgbase? A lot of this is very manual compilation, which, when you're building a distro, you probably don't need to do, right? So we'll just use pkgbase for a lot of the heavy lifting. And I thought, okay, perfect. How can we make this better? We get more horses, make it more automated, right?
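The "depth one" clone mentioned here could look roughly like the sketch below. The function name and the default destination are illustrative, and the function only prints the command. The talk does not say what exactly was fetched in the faster fetch-based approach, so that part is not reproduced.

```shell
#!/bin/sh
# Sketch only: shallow-clone the ports tree so the full history is not
# downloaded over a slow link. ports_clone_cmd is a hypothetical helper.

ports_clone_cmd() {
    dest="${1:-/usr/ports}"   # where the ports tree should land
    # --depth 1 limits the clone to the most recent commit.
    echo "git clone --depth 1 https://git.FreeBSD.org/ports.git ${dest}"
}
```

Running `ports_clone_cmd` with no argument prints the command for the default /usr/ports location.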
And so it turns out that the best way to have a package compiled is to not have to compile at all. Have someone else do it for you and we're done, right? We just install the package.
That worked, but unfortunately there are some packages that were not pushed upstream to the official FreeBSD ports. And while I did migrate, I did include some of the packages, some of the files I was manually tweaking by hand.
There's a list of files for what goes in your distribution, and I had to kind of handpick them from the packages. And then in the boot sequence, what happens is we have two compressed file systems. We have our memory file system root, and then we have our /usr/local with basically all the packages installed, and they are of set sizes. So when switching to using packages, right, you can choose what options you want in your packages. With building from ports you have more flexibility, and some of the options had some extra things, so I thought it might be nice, I'll include it. Ran out of space, right? And I didn't realize that. I'm like, oh, it just built an image, perfect, let's boot it. And what happens? You get an error: not found. And I guess FreeNAS, or I guess m0n0wall, was probably one of the first distros where the config was in XML and had PHP rc scripts, right? So I did not get all the dependencies correct on PHP, so I had some problems. This was not nearly as bad as some of the other problems I booted into. And then going back to how can I make things faster, how can I do things in parallel? Well, when we're compressing our image, our rootfs image, with gzip, why don't we add more threads?
gzip, to my knowledge, doesn't support parallel threads, but there's a version called pigz, right? So I'm like, perfect, we'll use that. xz was used to compress the /usr/local, because I think FreeBSD in the embedded boot didn't support it, it only supported gzip. So I'm like, okay, we'll just take that, crank up the number of threads, and we run into a memory limit, right? So to bypass that: set compression level 9; T is the number of threads, where zero is all of them; and when you hit the memory wall, you can say, I want to use a maximum of 75%. To have these rapidly iterating tests, I decided, well, I don't need maximum compression, I just want to see if it works. So I found out that -1 is very, very low compression, super fast, awesome for these dev-test cycles. After I create this image, I want to see that it works.
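The compression settings described above can be summarized in a small sketch. The function name and the mode names (`test`, `release`, `gzip`) are hypothetical, and the exact command lines used in the talk are not shown, so these are only a plausible reading of "level 9, -T0, at most 75% of memory" and "-1 for test cycles". The function prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: pick a compression command for an image file.
# compress_cmd and its mode names are hypothetical.

compress_cmd() {
    mode="$1"   # test | release | gzip
    file="$2"   # image file to compress
    case "$mode" in
        # Fast, weak compression for quick build-and-boot test cycles.
        test)    echo "xz -1 -T0 ${file}" ;;
        # Strongest preset, all cores, compressor memory capped at 75% of RAM.
        release) echo "xz -9 -T0 --memlimit-compress=75% ${file}" ;;
        # gzip-compatible output for the root image; pigz is multi-threaded.
        gzip)    echo "pigz -9 ${file}" ;;
        *)       echo "usage: compress_cmd test|release|gzip FILE" >&2
                 return 1 ;;
    esac
}
```

For instance, `compress_cmd test usr.img` prints `xz -1 -T0 usr.img`. Keeping the mode as a single argument makes it easy for a build script to switch between the fast test path and the release path.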
I need to boot it. I'll boot it in a VM. I was using VirtualBox at the time. That's what I had. And I thought, okay, since I'm trying to improve things, maybe I can get into FreeBSD using bhyve, right?
That way I can do ZFS rollbacks.
And I can do a bit more automation with rolling back my VMs and starting them.
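A rough sketch of that rollback-and-restart loop, assuming the VM is managed with vm-bhyve and its disk lives on a ZFS dataset with a known-good snapshot. The function name, VM name, dataset, and snapshot name are all hypothetical, and the function only prints the steps.

```shell
#!/bin/sh
# Sketch only: reset a bhyve test VM to a clean ZFS snapshot and boot it.
# vm_reset_cmds and every name passed to it are hypothetical.

vm_reset_cmds() {
    vm_name="$1"    # vm-bhyve guest name
    dataset="$2"    # ZFS dataset holding the guest's disk
    snap="$3"       # snapshot taken while the guest was in a clean state

    # Force the guest off; a real script should wait until it has stopped.
    echo "vm poweroff -f ${vm_name}"
    # Discard everything written since the snapshot (-r drops newer snapshots).
    echo "zfs rollback -r ${dataset}@${snap}"
    # Boot the guest again from the restored disk.
    echo "vm start ${vm_name}"
}
```

As an example, `vm_reset_cmds testvm zroot/vm/testvm clean` prints the three steps for a guest called testvm.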
All right. So, all of this eventually... oh, hang on a sec. So as kind of mentioned before, I finally did get it working on FreeBSD 15, which was a release that had OpenZFS 2.4.0, and with a bunch of shell scripts running in tmux.
So this is a sped-up version where I'm compiling it. On the bottom, I've just restored XigmaNAS. You can see it's 14.2.0.6.
The top one is rebuilding and deploying.
It's extracting all the packages, and it's going to take what it needs from the packages, build the image, then throw it to the VM and reboot the VM. All right. XigmaNAS and FreeNAS, because they've got a web GUI, you can just select which embedded image you want to update to. It also supports command-line versions, so you can do that. And you can see now we're on, back then, 15.0.0.7, all right, for XigmaNAS. So everything which took many, many hours, was very laborious, very manual, I can get it down to four minutes and 13 seconds on my machine in the basement, right? And then I had someone ask me, great, now that you've done all this, have you updated the documentation? And no. So as I was coming here to make a presentation, I thought, oh, it would be kind of cool, what kind of presentation software can I use? Is there something that runs natively on FreeBSD? So I was looking and looking and looking, and probably most of you have heard of LibreOffice or OpenOffice or probably StarOffice. There's something called Applixware. Has anyone heard of Applixware? Yeah. Okay.
So back then, Applixware, I think Walnut Creek had it on a CD-ROM, and this is a binary that I think was made on FreeBSD 3.
>> Yeah. So I'm like, "Oh, okay. Well, it's FreeBSD. It's open source." Like, how hard could it be? Let's get to work, right? I'll install compat3x, which doesn't exist.
I'm like, oh, okay. And it's 32-bit, but I'm running FreeBSD 15, 64-bit.
So this was way beyond my scope.
So I asked AI. I found a guide on how to get it running on FreeBSD 6, 32-bit, and I thought, okay, that's going to be a good start. How can I get it going on 64-bit?
And an AI tool made a shim, because, I'm trying to think, some of the things have changed significantly. I think it was /etc/passwd, something has changed in FreeBSD 15 since, I don't know when the change was, but it was significantly different.
So I have not yet updated the documentation, because I got roped in with this side project. But then the night before traveling, I wanted to get a FreeBSD machine set up, and it turns out that my video adapter didn't work. So that's why I'm doing it on a Mac today. So after getting everything working, I thought, okay, there is a lot of room for improvement, right? Building a distro, you're taking little bits and pieces and just kind of mixing them in your giant pot.
What is some future work you can do?
What are some additional things you can add? It'd be nice, kind of taking inspiration from TrueNAS, if we could have some kind of apps or something, maybe do some work with Podman. For managing VMs, before, it was phpVirtualBox. It'd be nice if there was something to manage bhyve. I've been using vm-bhyve. It'd be nice if there was a web GUI for that. Building everything from ports: instead of that, try to use packages where possible, and upstream the packages so we can just use the ports. Maybe even having some boot environments, a modern UI.
And so one of my side experiments: there was a project, I'm not sure if I'm pronouncing it correctly, Sylve. Yeah. So I did experiment with that, and at the time, I don't know if it's still the case, but while it was great as a front end, it seemed to be lacking a CLI version.
So when you create a VM or a jail, you can do it all through a GUI, but then when you're trying to trace back to where that came from... I was accustomed to Bastille and vm-bhyve, where you have your files that would create your VM for bhyve or the jail, and it'd be nice to have something like that. So again I asked AI: oh, there's a web API, can you just make me a CLI for the web API? That way I can have a CLI version to do all this stuff. So it's a bit easier for me to understand, but if you know how to do bhyve directly, you don't need it. And that's basically my little adventure. Anybody have any questions?
>> Yep. Is there a reason you stuck to having the, like, MD root instead of just... would it make sense to use a ZFS file system?
>> Ah, yes. Yeah. So that's one of the things I wanted to do. I would think so. I wanted to add, like, ZFS boot environments as well.
>> Yeah. And I thought, okay, instead of doing all of this work on a project that's not mine, I'm just kind of, like, this drive-by giving patches.
That's why I thought, oh, it'd be better if I could just have ZFS as... exactly. Yes, I agree. And so I had ventured into doing some stuff like that as well. Yeah.
>> And have you ever looked at the tool for building ports?
>> I did, and that's where I came across... so when I was building stuff with packages, that's where I came across USE_PACKAGE_DEPENDS, which is a huge timesaver. So normally when you build something from ports, say I want to build... so no, I haven't actually looked too much.
>> Okay.
>> So you would be able to basically download FreeBSD from pkgbase and use that to get the base image, and then use the depends to get all the ports, except for any you want to compile with different default options, and then build it all into an image, all built for you.
>> Okay, amazing. Yeah, very nice.
>> It might save a lot of pain over some of the stuff you had to do.
>> Yeah, for sure. Yeah, because I was thinking, I'm not the maintainer of this. I just run it in my basement for my own personal use. So I don't mind doing it, but then I imagine people would want to do that as well. A question at the back.
>> Is XigmaNAS, can it be used as a drop-in replacement for TrueNAS, whatever the version of TrueNAS is called, I forget which one it is? Can it be used as a direct drop-in replacement, and if it can't, do you have any plans to write something for migrating, so that all the lost souls stuck on that can get an upgrade?
>> So for TrueNAS, no. There was a significant fork back in the 8.x days. Actually, they said it was a rewrite more than a fork, switching from PHP to Django, and I think also maybe the build system. I think they may have used NanoBSD. I don't know when. iXsystems, yeah, so iXsystems kind of took over that, and I think they switched the build system entirely. So this is back to the old build system, so this would be more compatible with TrueNAS... that's right, FreeNAS.
>> I'm more interested in installing it and it recognizes my datasets. Yeah. Some of the other stuff is not as concerning, because it could be reconfigured.
>> So it could import the pool.
>> Yeah.
>> And have all your files, but
>> you'd have to write something that would walk the SQLite database and pull the config out of it. So that's like, ZFS means your files are all there no matter what. Yeah. So this is a pre-iX codebase.
>> Any other questions?
>> No. Okay.
>> You're all true. I mean the way you discover everything is exactly the same.
>> Okay.
>> The same problem. just like discovering.
>> Yeah. Because you're like, I don't know. I don't know what I'm doing. I don't know what I'm trying to solve. And I'm just kind of bumbling my way into walls. Yeah.
It's great and very comforting. All right. Thank you all.