Transcript
Host: You have probably seen those videos of shiny metal robots doing things like folding a shirt or putting a bag of chips into a bin. It looks like the future is finally here, but if you peek behind the curtain, there's often a human being wearing a VR headset and a motion suit making every single move for them. I have been wondering why we're still using people as high-tech puppeteers instead of just writing better code. What's actually going on when a person spends eight hours a day moving a robot arm like a marionette?
Guest: It comes down to a really big gap in how we teach machines. If you want to teach a computer to write a poem or find a bug in a piece of code, you can just feed it the entire internet. There are trillions of words out there for a brain in a box to read. But if you want to teach a robot how to pick up a soft strawberry without crushing it, there's no internet for that. You can't just give a robot a book on how to be handy. You have to show it what it feels like to move through the world, and that's where the humans come in. We're basically acting as the data source for their physical common sense.
Host: But we have had robots in car factories for decades. Those things are incredibly fast and they never miss a spot. Why can we not just use the same math we used for them?
Guest: The robots in car plants are basically just on a loop. They're bolted to the floor and they move to the exact same spot in space over and over again. If you moved the car frame two inches to the left, the robot would just weld thin air. But the new robots people are building now need to work in our world. They need to work in a kitchen or a warehouse where things are messy and stuff moves around. You can't write a line of code for every possible way a bag of chips might be crumpled or every way the light might hit a countertop. Instead of trying to write a rule for every tiny thing, these companies are having humans do the task thousands of times while the robot records every bit of data. The robot is watching what the human sees through its cameras and feeling the pressure the human puts on the grip. It's learning the vibe of the movement rather than a math formula.
Host: That sounds like an incredibly boring job for the human. I can't imagine standing in a room and moving a robot arm to pick up a toy block for eight hours straight. Is it really just about doing the same thing over and over?
Guest: It's actually a bit more complex than that. The people doing this work, who are often called robot trainers or data pilots, have to show the robot how to handle mistakes too. If the robot's hand slips, or if a door it's trying to open is stuck, the human shows it how to adjust. We have this deep, quiet knowledge in our bodies that we don't even think about. Like, when you pick up a heavy milk jug, you tense your arm before you even lift it. A robot doesn't know to do that. By puppeting the machine, we're giving it a map of all those tiny, split-second choices. Some companies have rows of people in VR gear doing everything from sorting laundry to unscrewing jars. They're building a massive library of human touch.
Host: I still struggle to see how this scales up. If you need a human to teach every single robot how to do every single task, we're not really gaining much. It feels like we're just moving the labor from the warehouse floor to a VR office. Surely there's a point where the machine has to take off the training wheels?
Guest: That's the big gamble. The hope is that once a robot has seen ten thousand people pick up ten thousand different objects, it starts to understand the general idea of picking things up. It's like how a child learns. You don't explain the physics of a ball to a kid; they just play with it until their brain clicks. Once the robot gets that click, it can go into a simulator and practice a million more times on its own at lightning speed. But it needs that initial human spark to know what it's even trying to do. Without us showing them what a smooth, safe movement looks like, the robots just flail around or move in ways that would break their own gears.
Host: So we're essentially teaching them the feel of being alive in a physical body. But what happens when the robot's body is different from ours? Most of these robots don't have five fingers or a human shoulder. Does it not get confusing for the machine when it's trying to copy a person who has a totally different shape?
Guest: That's a huge hurdle. It's a problem of translation. When a person moves their arm, the software has to figure out how to map that onto a metal limb that might only have three joints instead of a whole ball-and-socket setup. It's like trying to play a piano piece on a flute. You have to change the notes but keep the soul of the music. The robot trainers have to learn how to move in a way that the robot can actually follow. If they move too fast or in a way the robot's motors can't handle, the data is useless. So the humans are actually training themselves to be more like robots while they teach the robots to be more like humans. It's this weird loop where both sides are meeting in the middle.
Host: The robot trainer ends up becoming a bridge between our messy world and the cold logic of the machine.
Guest: Exactly, and the wild part is that we still don't know if this will be the one way we finally get robots into our homes, or if we'll find a way for them to learn just by watching videos of us on the internet.
Host: The person in the VR headset is the only reason that metal hand knows not to crush the strawberry we're asking it to hold.
Guest: That person is giving the machine the only thing it can't find in a textbook, which is the simple, heavy weight of reality.
Host: The milk jug and the laundry pile are still the hardest things for a computer to master, even if it can write a symphony in seconds.