Nathaniel brilliantly bridges the gap between complex foundation models and DIY hardware by rethinking action space design for real-world robustness. It is a masterclass in turning high-level robotics theory into a practical, open-source solution for household automation.
Deep Dive
Can you run a state-of-the-art foundation model on your own 3D-printed cable robot?
Hi, I'm Nathaniel, and I created Stringman, an open-source, room-scale CDPR (cable-driven parallel robot) compatible with LeRobot and designed for picking up laundry. The hardware is stable and launched, so now I get to just drive it around without worrying about missed steps or anything.
Since then, I've been focused on finding the best combination of AI models to give Stringman a useful floor-cleaning routine. My aim with this product is to create a low-cost home organization appliance. I call it an appliance because I want to just turn it on and leave it while it does its job and turns itself back off.
So, what is the best way to make Stringman pick up laundry, toys, and trash and put them in the correct bins? I've talked about how hard it is to get an ACT model to learn complex behavior, though I have gotten it to do simple, narrow tasks. Because of that, I was relying on ACT networks for only one subtask: grasping things. I was using handwritten behavior to carry the items off to the laundry hamper and drop them. I don't think there's anything wrong with this if it's good enough, but it really wasn't good enough.
I know that there are great foundation models out there that should be able to handle this task, if only they can be fine-tuned to drive Stringman. LeRobot makes it pretty easy to try out new models: you just call them by name. I have a couple of data sets covering various tasks with Stringman, so I've tried training almost every model they have. Most of the time, though, nothing happens. The robot just hangs there.
But after I found and fixed a ridiculous bug in data collection during my egg hunt episode, I was finally able to fine-tune π0.5 and got some pretty reliable behavior out of it. It picks up items most of the time, and it does okay at centering on and picking up things that are not in the data set. But the particular data set I ran this experiment on didn't contain anything other than grasps, so I didn't expect more out of it than that.
I was encouraged by these results, and I started collecting a data set with more elaborate episodes. In this data set, each episode calls for the robot to seek out an item, grasp and lift it, and then carry it to a suitable destination and drop it there.
I collected 200 episodes of this type across two different rooms.
The model did learn to do room traversals, but it seemed to do them in an incoherent direction. It only grabbed something if it spotted it right below the gripper, and then it usually carried the thing off in another random direction and dropped it somewhere.
Amusingly lifelike, but totally useless.
In hindsight, though, it makes sense.
The action space I chose is velocity in the frame of reference of the gripper camera. So, if something's visible in that camera, the mapping is really simple. But for room traversals, the model would have to do some serious mental rotation.
There's no need to make a huge model like this learn that from data. We could just do it precisely with matrices. After all, that's what the human gets to do in the UI. There are buttons to drive from any of the camera perspectives and a few more, and if it weren't for those, it would be undrivable. I want my models to be able to drive in the same way, looking at whichever camera has the most salient spatial relationships and producing a direct and intuitive velocity vector from that. So in the next data set, I used an overspecified action space. I record the commanded velocity rotated into the frame of reference of every camera. That way, no matter which camera or cameras a target is visible in, there's a straightforward mapping to at least one component of the action space.
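As a rough illustration of the overspecified action space described above, the following Python sketch rotates one commanded velocity into several camera frames and concatenates the results. The camera names, the yaw-only orientations, and the helper functions are all assumptions made for this example; they are not taken from the Stringman code, where camera orientations would presumably come from calibration and be full 3D rotations.

```python
import math

def yaw_rotation(yaw_rad):
    """3x3 rotation about the vertical axis, as nested lists (camera frame -> world frame)."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    """Transpose of a matrix; for a rotation matrix this is also its inverse."""
    return [list(row) for row in zip(*m)]

def mat_vec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def overspecified_action(v_world, cam_to_world):
    """Express v_world in every camera frame and concatenate the copies."""
    action = []
    for rotation in cam_to_world.values():
        world_to_cam = transpose(rotation)
        action.extend(mat_vec(world_to_cam, v_world))
    return action

# Hypothetical camera orientations (assumed, yaw only, to keep the numbers readable).
CAMERAS = {
    "gripper": yaw_rotation(math.radians(90)),
    "corner_a": yaw_rotation(math.radians(0)),
    "corner_b": yaw_rotation(math.radians(180)),
}

# One commanded velocity along the world x axis (arbitrary units).
action = overspecified_action([0.1, 0.0, 0.0], CAMERAS)
print(len(action))  # three cameras times three components
```

With three cameras the action vector has nine components, and each group of three describes the same motion as seen from a different camera.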
In other words, there are three redundant copies of the commanded velocity, and the model will predict all of them at once, and we'll have to somehow recombine them. I'll just average them for starters. I wasn't able to relabel my existing data with this new action space because I didn't collect enough metadata in the first place, but it doesn't take long to collect a new data set anyway. The process is a lot easier now because you can do it entirely from the UI and use cloud-based resources to run the recording session. So, I trained a new model on this data and tested it out. Here's a time lapse of it moving.
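The averaging step mentioned above could look something like the sketch below. It assumes that each predicted copy is first rotated back into a shared world frame, since copies expressed in different camera frames cannot be averaged component by component. The function names, the yaw-only camera orientations, and the sample prediction are hypothetical and chosen only for illustration.

```python
import math

def yaw_rotation(yaw_rad):
    """3x3 rotation about the vertical axis (camera frame -> world frame)."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_vec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def recombine(action, cam_to_world_list):
    """Rotate each 3-component copy back to the world frame, then average."""
    total = [0.0, 0.0, 0.0]
    for i, rotation in enumerate(cam_to_world_list):
        v_cam = action[3 * i:3 * i + 3]
        v_world = mat_vec(rotation, v_cam)
        total = [t + w for t, w in zip(total, v_world)]
    n = len(cam_to_world_list)
    return [t / n for t in total]

# Assumed camera yaw angles, in the same order as the copies in the action vector.
rotations = [yaw_rotation(math.radians(deg)) for deg in (90, 0, 180)]

# A made-up prediction in which all three copies agree on motion along world x.
predicted = [0.0, -0.1, 0.0,
             0.1, 0.0, 0.0,
             -0.1, 0.0, 0.0]
velocity = recombine(predicted, rotations)
print(velocity)
```

A plain mean treats every camera as equally trustworthy. Weighting the copies, for example by which camera actually sees the target, would be one possible refinement, but that goes beyond what is described here.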
Unfortunately, it's really disappointing. I don't always have a happy ending for these videos, but I do have time to honestly show you what happened.
Somehow, this model did not learn how to produce the three redundant velocity outputs like I expected it to. I think maybe the input training set was too varied, or there's just a bug in my data recording.
But for now, I'm going to go back to what I know works, which is using models just for grasping and then using simple point-to-point motion for everything else.
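For readers unfamiliar with the term, simple point-to-point motion can be sketched as a straight-line move toward a target at a capped speed. The function below is a hypothetical illustration, not the Stringman implementation; the positions, speed, time step, and tolerance are made-up values.

```python
import math

def step_toward(position, target, max_speed, dt, tolerance=0.01):
    """Advance one time step along the straight line to target.

    Returns the new position and a flag that is True once the target is reached.
    """
    delta = [t - p for p, t in zip(position, target)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist <= tolerance:
        return list(target), True
    step = min(max_speed * dt, dist)  # never overshoot the target
    return [p + d / dist * step for p, d in zip(position, delta)], False

# Hypothetical gripper start and drop-off location (arbitrary units).
pos = [0.0, 0.0, 1.0]
hamper = [2.0, 1.0, 1.0]
arrived = False
steps = 0
while not arrived:
    pos, arrived = step_toward(pos, hamper, max_speed=0.25, dt=0.1)
    steps += 1
print(pos, steps)
```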
In other news, I made it possible to share robots with other logged-in users and to use cloud-based sessions for data recording and model evaluation, and I made another pass on all the documentation.
So, if you're thinking of printing and building a Stringman yourself, now is a great time to do it. Check the links in the description. Finally, I'll be hosting a live demo at the 79° West Coworking Center on May 6th at 11:00 a.m.
So, if you're local to Pittsboro, North Carolina, come on over and check it out, and you can drive a Stringman around.
Thanks for watching. See you next time.