This approach elegantly bridges the proprioceptive gap by ensuring mechanical parity between human demonstration and robotic execution. It transforms data collection from simple imitation into a high-fidelity transfer of physical intuition.
Getting a Grip on Robotic Data Collection
One of my favorite things about this particular design is how much more expressiveness we have because we have three degrees of freedom versus one. It's not just using tools: you can interact with two objects at the same time. I can pick up two separate things because of this thumb design that we've chosen.
I think we've seen some really creative applications just from our team utilizing this gripper in ways that maybe weren't intended, and I'm excited about how much more we could do just by adding these few degrees of freedom.
Hi, my name is David Watkins. I'm a research lead for the data capture team.
I'm Tim Fofanoff. I'm a mechanical lead on the data capture team.
What are the hardest problems that you see in manipulation right now for robotics?
Well, I do think about grasping a lot.
Some of the things are: how are we going to do this intricate maneuver? How are we going to be able to twist this knob or turn this nut onto a stud? How are you going to know that you've put the right amount of force on an object in a certain environment? Also, for me, I think it's important to ultimately have that robot decide what to do. How does it explore? How does it know, "I've come across this thing. Do I know it's an object? Do I know what it is? Is it classified? Can I decide what to do with it next?" Maybe something's happened that is unexpected, and I think that's the exciting part of where robotics is going to go: it'll be able to handle things that aren't predicted, something that's not expected to be there.

When I think about humans interacting with the world, there is this tendency for them to do things that are compliant.
And when I think of compliance, I mean things are moving, things are soft, things are easy to grab with our hands. So I might think of making a bed.
I might think of folding a t-shirt. Even picking up one of these grippers is compliant because it moves around. It's not stuck. It's not rigid in the scene.
Those problems are somewhat easier to model because there's this looseness to what is a valid action in the scene. There are a lot of ways to do that thing because things are mobile, in a way that's much harder if things are rigid. It seems counterintuitive, but the more rigid it is, the fewer solutions there really are to solve the problem.
And that means that your data has to really indicate very clearly that that is a unique solution.
That I can't have a wide variety of data in order to address that singular problem.
And if we're doing stuff outside, if we're looking to get robots out of the laboratory setting and truly in the wild, that's going to require being able to handle unstructured problems, apply forces onto the scene, and change the way the world looks around them live. We can't statically model the scene and solve the problem in it. We're going to have to react to a changing set of sensory information in order to accomplish certain kinds of tasks.

So ultimately, we need a lot of data, right? Making that huge pile of data, using sensors that are accurate and on a ground truth that makes sense, that can be reused by others and applied to other, totally unrelated geometry, would be ideal.

Agreed. There are all these different methods that we have for doing data collection.
There's simulation, there's teleoperation, there's watching YouTube videos, there's strapping cameras onto people, and then this UMI method comes out.
And that sort of gives us this entirely new way of thinking about data-driven manipulation.
And even with just this parallel mechanism, just this one degree of freedom, we have so much capability. So as we're thinking through these problems, we have these powerful model representations, these different methods that cover a wide gamut of different sources of data.
And we have to think very carefully about how to collect that data.
And be very intentional with the conditioning we provide the model. Simulation is a really effective way, if we can model the problem, to quickly get lots of data. It requires doing some real-to-sim, which is modeling the real world and getting it into a simulator, getting all of the different intricacies of how those objects interact, and what the masses are and what the mass distribution is, correctly modeled so that I can then apply some reinforcement learning. I can even do some teleoperation in simulation and collect diverse data where I'm changing the backgrounds, I'm changing the colors of objects, I'm changing the lens distortion. I have a lot more control, very cheaply, over the domain randomization, or the categories by which we apply augmentations to make learning easier.
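The domain randomization described here can be sketched in a few lines of Python. This is a minimal illustration, not the team's pipeline: the `SceneParams` fields, the value ranges, and the `sample_scenes` helper are all hypothetical names and numbers chosen only to mirror the things mentioned in the conversation (backgrounds, object colors, lens distortion, mass).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneParams:
    """One randomized simulation scene (all fields are illustrative)."""
    background_id: int          # index into a hypothetical set of backgrounds
    object_rgb: tuple           # object color, each channel in [0, 1]
    lens_distortion_k1: float   # radial distortion coefficient (assumed range)
    object_mass_kg: float       # object mass (assumed range)


def sample_scene(rng: random.Random, num_backgrounds: int = 10) -> SceneParams:
    """Draw one scene by sampling every parameter independently."""
    return SceneParams(
        background_id=rng.randrange(num_backgrounds),
        object_rgb=(rng.random(), rng.random(), rng.random()),
        lens_distortion_k1=rng.uniform(-0.2, 0.2),
        object_mass_kg=rng.uniform(0.05, 2.0),
    )


def sample_scenes(n: int, seed: int = 0) -> list:
    """Draw n scenes from a seeded generator so a dataset can be reproduced."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]
```

Calling `sample_scenes(1000, seed=42)` would give a reproducible list of scene configurations to hand to whatever simulator is in use. Seeding the generator is a design choice that makes it possible to regenerate the same randomized dataset later.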
But there are lots of interactions that are really hard to model in simulation.
If I want to pour out some liquids, all of a sudden the simulation becomes very expensive. If I want to render things, if I want to visually see what's going on in the simulation, it becomes much more expensive. If I want to fold a sheet, that is a challenging problem to model in simulation. And so we can do each of these things.
And certainly if we had one problem we wanted to solve, simulation would be able to solve a lot of them.
But it struggles to do this for all problems. The real world right now is still its own best simulator. And if we can reduce the overall cost of collecting in the world, that's really beneficial.

Yeah. That's definitely been something to see from the hardware side: you really do need to be careful about how the data is collected. Do you think that's going to be easier going forward?

It's difficult to get robots into certain places. It's less difficult to get humans there, and we would be able to collect in these locations. We can scale this up more easily because it's cheaper to build a bunch of handheld interfaces than it is to build a robot per teleoperator.
And I'm really excited about this new direction, which we've had for only a couple of years now, for doing a completely different kind of data-driven manipulation.
And so one of the first things we did is we designed this Spot gripper to accomplish that. Do you remember working on...

Yeah. I mean, obviously, you can really picture the user being the robot, right? Doing exactly where you'd imagine the robot to be.
It's very easy to imagine.
And it has everything you need to record where you are, and a view that gives you enough of an idea of how you approach the objects, or the scene, or whatever task you want to do.
But if we wanted to do more dextrous contact-rich manipulation, is that position enough? Are we able to address some of the problems that you've identified previously in manufacturing or assembly just with position control?
Well, there's a lot to that. There's the actual manipulation: can you turn or pinch or grab or pull a thing in the way you need to?
But then there's also the whole part of: are you pushing on the environment hard enough to accomplish the task? Even if you're trying to write with a marker, you need to apply some amount of force. You can see the feedback of perhaps the marker, but you wouldn't necessarily know that the right amount of force is applied. So certainly that would be an advantage to record, that kind of thing.

So, the thing that's interesting to me about this, if you don't mind...

Yeah, I don't mind.

We have these two force sensors just embedded into the fingers directly.
Why is this not more commonly seen in other kinds of devices? Why are people not integrating force sensors this way across the board, even on the robots themselves?

I think we're going to see that more and more. It's an engineering task, and it's probably not where to start. And the reality is the policy needs to use the data that you collect. So there's certainly going to be some work to do there to represent it, gather that data, and make sure that it's usable. There's noise that has to be handled. There are all sorts of details that need to be worked out.
And so it's, I think, a matter of where to start. And I think that UMI was a great choice in terms of where it began.
But I do think that touch and force sensors are going to be used more and more often.
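As one illustration of the noise handling mentioned above, a first step is often a simple low-pass filter over the raw force readings. The sketch below uses an exponential moving average. The conversation does not say how the team actually processes its force signals, so the function name `ema_filter`, the `alpha` value, and the example readings are assumptions made purely for illustration.

```python
def ema_filter(samples, alpha=0.2):
    """Smooth a sequence of force readings with an exponential moving average.

    alpha is the weight given to each new reading: values near 1 track the
    raw signal closely, values near 0 smooth more but respond more slowly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    state = None
    for x in samples:
        # The first reading initializes the filter; later ones are blended in.
        state = x if state is None else alpha * x + (1.0 - alpha) * state
        filtered.append(state)
    return filtered


# Hypothetical finger-force readings in newtons, with one noisy spike.
raw = [1.0, 1.1, 0.9, 4.0, 1.0, 1.05]
smoothed = ema_filter(raw, alpha=0.2)
```

The trade-off is lag: heavier smoothing hides spikes but also delays the moment a real contact shows up in the filtered signal, which matters if a policy is expected to react to contact.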
So, this is a single degree of freedom.
This is a parallel mechanism. There are a lot of things you can do with a parallel mechanism that I think would surprise people, especially when you're picking this up and getting to interact with things in the environment.
But obviously we had other ideas about how to build more capable grippers, and I think one of the first things we set off to do on this project was a different paradigm: instead of building a gripper and then building an interface to program the gripper, let's design both things at the same time.
And I think that led us to the Koala grippers.
100%, right. So this was originally laid out to mimic an existing platform that was available for a robot. But going forward, we wanted to design the gripper that would be on the robot.
And we wanted that to be agnostic to whatever robot that is. It could be an arm, it could be something like a Spot, it could be anything. But we wanted it to match what the human would be using to gather data.
So we thought hard about how many degrees of freedom a human could comfortably actuate, and tried to start in a simple, metered way. But at the same time, we wanted to be able to handle human tools, tools that are made for humans to use, and to handle a large spectrum of sizes: to be able to pick up something like a pencil or something like a 2-liter bottle, that kind of range of item. To be able to do a power grasp around an object, or also do a pinch grasp where you really want to pick something off a table or manipulate something that is very fine.
So, as we were designing Koala, what were some of the tasks that we were interested in being able to do with them? We had so many things that we could possibly choose from.
What are things that we haven't seen that we want robots to be able to do?
Well, we were thinking about human tools in kind of a lab setting, a workshop. So things like using a hammer, or threading in a screw, or sorting out bolts on a table, or perhaps using a saw or a drill, or ratcheting a bolt or a nut into place.
Some of the things that got me very excited are both using the tool in the scene and also being able to just pick up the screw itself, all in the same exact session.
The robot was able to do that: picking up the nail, singulating that object from a whole pile of them. And then one of the grippers could come in, hold it in place, and then I could hammer it with the other one.
And I didn't have to use two different kinds of grippers in order to solve that particular kind of problem.
100%. You could write with a marker on a sticky note and then peel it off and put it somewhere, and then maybe pick up a screw that's in the way, and then grab a hammer and hit a nail. So certainly being able to do all those things with the same manipulator really opens up that ability.
We were really thinking about how easy it is going to be for the human to use this device. We didn't want to overwhelm the human. We didn't want them puppeteering something that takes a lot of training. We wanted something where anybody could just walk up, use it, and within a couple of minutes know how to do it. So that was really important.
And we wanted it to be comfortable. So we ended up with three sizes for this device, so that the user could use it for a while.
We have a lot of different grippers in the field that we can purchase. What are some of the examples that come to mind of things that people typically use for research?
For us, we want the human to be able to feel the contact with the item that the gripper is in contact with, or maybe determine the pinch force that's being exerted.
So that means putting in linkages that the human can then feel through, by having a gear ratio or a mechanism that allows that.
For us, we want the human to proprioceptively feel the environment through the linkages, and so when we design our grippers, we think of that first. That really has to do with how the human hand motions can even interact in that way. Then we use that linkage and make sure that at the same time it will work in a device on a robot. So we co-design the two things together, so that we can use that linkage with the hand as well as with actuators on a robot, in a packaged way that can actually fit.

One thing I always notice is just how smooth it is to move those fingers.
Why is that important? What does that get us when we're actually going and executing things out in the wild?
Oh, I think it's enormously important. We had this early decision to do linkage-based motions. We wanted to have little backlash. We wanted to have as much proprioceptive feedback to the user as possible, meaning that we want the human to really know that contact has been made and really feel what they're doing with their hand. And that is something that we also aim to record, and that's really been the focus. We wanted everything to be going down that path, and the future version of this that we have in development has the ability to record all of that. You can get a full wrench of what's happening with this device, and you can have all of the forces that happen on every finger, so that we know what the pinch is, and you also know how much contact you've made with the environment.
You can even feel the inertia of it. You can even record that.
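To make the idea of recording a full wrench plus per-finger forces concrete, here is a minimal sketch of what one recorded sample could look like. A wrench is the standard six-value quantity of three force components and three torque components. Everything else here is an assumption for illustration: the field names, the units, and in particular `pinch_force`, which uses the smaller of two opposing finger forces as a rough stand-in for the squeeze on an object. The conversation does not describe the actual data format or how pinch is computed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Wrench:
    """Force (N) and torque (N*m) acting on the device: six values in total."""
    fx: float
    fy: float
    fz: float
    tx: float
    ty: float
    tz: float

    def force_magnitude(self) -> float:
        """Magnitude of the net force, a simple proxy for how hard the
        device is pressing on (or being pushed by) the environment."""
        return math.sqrt(self.fx ** 2 + self.fy ** 2 + self.fz ** 2)


def pinch_force(finger_a_n: float, finger_b_n: float) -> float:
    """Rough pinch estimate from two opposing finger forces (assumption:
    the object is squeezed only as hard as the weaker finger pushes)."""
    return min(abs(finger_a_n), abs(finger_b_n))


@dataclass(frozen=True)
class GripperSample:
    """One hypothetical time-stamped record from an instrumented gripper."""
    t: float                 # timestamp in seconds
    wrench: Wrench           # net wrench on the device
    finger_forces_n: tuple   # one normal-force reading per finger


sample = GripperSample(
    t=0.0,
    wrench=Wrench(fx=3.0, fy=4.0, fz=0.0, tx=0.0, ty=0.0, tz=0.1),
    finger_forces_n=(2.0, 3.5),
)
```

With a record like this, `sample.wrench.force_magnitude()` separates "how hard am I pushing on the world" from `pinch_force(*sample.finger_forces_n)`, "how hard am I squeezing the object", which are the two quantities distinguished in the conversation.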
I think of two different kinds of data collection, right? There are these demonstrations where we curate them: we think through very carefully what the person is doing and how they do it, we start and stop demonstrations and record just that one thing, and we collect a large data set of "here's hammering data, here's drilling data."
And then there's this long-running sort of unstructured data collection.
And that's where we'll just pick up the grippers and we'll just go about our day using the grippers in the wild and collect all of this diverse interaction data that is really hard to get a person to intentionally collect.
Yeah, I mean, if I tell you, "Hey, go hammer this thing 10,000 times,"
you're not necessarily going to get some of the more interesting interaction data, like leaning on a surface with the gripper, or what it means to bump into a table in the middle of the day. These are things that we do with our hands that we are not even aware of most of the time, but they are really helpful for defining the space of what it means to interact in the physical world. And we want to describe this language of physical interaction such that the robot is able to more generally understand its own existence in space-time.
Are you feeling good about bridging a number of tasks together, or combining things in a logical order?
I'm feeling great about it.
But I think it's not something that we're going to solve.
It's something that we're going to understand how to apply to a lot of different things.
I think there's this nice idea about building a general purpose robot that is going to do everything and there's some truth to doing everything, right? Like what what is the everything we care about?
And what we're doing is reducing the overall amount of time it takes to engineer one of these things.
If I want a robot to do something new, I now have a finite number of hours in which I can do a certain set of things and have that behavior.
Something that would have taken me months to model years ago is now something I can do in a few hours.
So, Tim, what's next for the hardware design for what we're doing on capture?
Well, I would like to manipulate in a more capable manner. In particular, being able to twist things, turn things, use a screwdriver: things that are actually pretty difficult, typically, with a gripper.
But then also just more data coverage. Being able to really sense touch in addition to the proprioceptive forces and the more basic wrenches that we need for the arms. We also want to have, within the mechanisms, sensors that would be able to detect how things have shifted, or maybe how they respond to your force, if you've squished them or something like that.
David, what do you think is next for the software?
I think we're going to get a lot of very interesting behaviors. What we want to see is not just being able to collect the data; we want to execute it on the robot. And I think what's coming next is a series of robust behaviors.
We're looking to improve the overall experience of collecting that data and to demonstrate that the improvement results in better robot behavior. I think a lot of what we've been doing is getting from zero to one. We want to go from idea to actual implementation before we get to scaling these devices up. And so we've been very intentional about what we've designed and how we've designed it, so that we can build more robust policies. And I'm really looking forward to forceful interactions on the robot itself.
There's this amazing set of tasks that we have, not just in laboratory settings, but in the entire world.
I could go to a beach and clean up the beach. I could go out into a forest and plant trees. I could go into urban centers and pick up trash. These are all things where we can not only collect data while simultaneously cleaning things up and doing good for the world, but also help inform robots on how to do these things.
And it's not just that that is altruistic.
It is also that it is the kind of diverse data that we're not collecting.
That's going to be messy. That's going to be incredibly diverse. We can be very prescriptive about the behavior of doing that thing, but no two demonstrations are ever going to look the same. We're going to concentrate that behavior in the wild.
And that is a completely different capability than what we are able to do just repeating things over and over again in these laboratory settings.
Thanks for chatting with me, Tim. I'm really excited about what comes next for data capture. Likewise. I can't wait to see what's next.
Thanks, David.