Transcript
Host: I was walking down the street the other day and saw one of those self-driving cars with the spinning bucket on top. It's wild to think that thing is basically screaming light at everything around it just to figure out where the sidewalk is. I have been wondering how a computer takes all that reflected light and actually makes sense of a messy city street.
Host: How does the car take those fast flashes of light and turn them into a world it can actually drive through?
Guest: It's a lot like how a bat sees in the dark, but instead of sound, it uses lasers. That spinning thing on the roof is called Lidar. It shoots out millions of tiny bursts of light every single second in every direction. These bits of light hit things like trees, dogs, or other cars and bounce back to a sensor. Since the car knows exactly how fast light travels, it can time those trips to see how far away everything is down to a fraction of an inch. But when it gets all that data back, it doesn't look like a photo or a movie. It's just a massive, swirling swarm of millions of tiny dots hanging in empty space. We call it a point cloud. To the car, the whole world starts out as this ghostly, glowing fog of points.
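The timing arithmetic described here can be sketched in a few lines. This is a minimal illustration, not how any particular sensor works internally: the function names and the beam-angle convention are assumptions made for the example. The only physics involved is that the pulse travels out and back, so the one-way distance is half the round trip.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_seconds: float) -> float:
    """One-way distance in meters for a pulse that went out and came back."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def to_point(range_m: float, azimuth_rad: float, elevation_rad: float):
    """Turn one range reading plus the beam direction into an (x, y, z) dot.

    Assumed convention for this sketch: azimuth is measured in the
    horizontal plane from the x axis, elevation is measured up from
    that plane.
    """
    horizontal = range_m * math.cos(elevation_rad)
    x = horizontal * math.cos(azimuth_rad)
    y = horizontal * math.sin(azimuth_rad)
    z = range_m * math.sin(elevation_rad)
    return (x, y, z)
```

With these definitions, a round trip of 200 nanoseconds corresponds to a target roughly 30 meters away. Repeating the conversion for every returned pulse is what produces the point cloud.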
Host: But a fog of dots sounds like a mess. If I'm looking at a million points of light, I don't see a person or a bike. I just see a blur. How does the car know that one cluster of dots is a kid chasing a ball and another is just a fire hydrant?
Guest: That's where the brain of the car has to do some heavy lifting. It looks for gaps. If it sees a bunch of dots that are all very close to each other, it assumes they belong to the same object. It basically plays a high-speed game of connect the dots. It draws a box around a cluster and says, okay, this group of dots is one solid thing. But then it has to figure out what that thing is. It compares the shape of that box to a library of shapes it already knows. It knows a tall, thin box moving slowly is usually a person walking, and a long, low box moving fast is a car. It's not looking at the color of your shirt or the brand of your shoes. It's just looking at the size and the outline.
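The "connect the dots, then draw a box" idea can be sketched as a simple distance-based grouping. This is a deliberately naive version under stated assumptions: the `max_gap` threshold and both function names are invented for the example, and every pair of points is compared directly, which would be far too slow for a real point cloud.

```python
from collections import deque
from math import dist


def cluster_points(points, max_gap):
    """Group point indices so that neighbors closer than max_gap share a cluster.

    Points are chained together: A and C land in the same cluster if
    each is within max_gap of some B, even when A and C are far apart.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        members = [seed]
        while queue:
            i = queue.popleft()
            # Collect neighbors first so the set is not modified mid-iteration.
            near = [j for j in unvisited if dist(points[i], points[j]) <= max_gap]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        clusters.append(sorted(members))
    return clusters


def bounding_box(points, members):
    """Axis-aligned box (min corner, max corner) around one cluster."""
    xs, ys, zs = zip(*(points[i] for i in members))
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))
```

The size of each resulting box (tall and thin versus long and low) is the kind of outline the guest describes comparing against known shapes. The order in which clusters come back is not fixed, so sort them if a stable order matters.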
Host: That seems like it would be easy to fool. If I'm carrying a large piece of plywood or wearing a weirdly shaped costume, does the car just see a big floating rectangle and freak out?
Guest: It might get confused for a split second, but it has a trick to handle that. It doesn't just look at a single snapshot. It looks at how those dots move over time. If that big rectangular shape is moving at three miles an hour and it's on the sidewalk, the car thinks, well, that's probably a person or maybe a slow bike. It tracks the path. It looks at the history of those dots. If a group of dots was ten feet away a second ago and now it's five feet away, the car can tell exactly how fast it's coming. It also looks for things like the way a shape bobs up and down. A person walks with a certain rhythm that a rolling trash can doesn't have. The car picks up on those tiny clues to make a better guess about what it's dealing with.
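The ten-feet-to-five-feet example is a plain rate calculation, sketched below. The function names are hypothetical, and the sketch assumes the two distance readings belong to the same object and are measured along the same line toward the car.

```python
def closing_speed_ft_per_s(previous_ft: float, current_ft: float, elapsed_s: float) -> float:
    """Positive when the object is getting closer, negative when it is moving away."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return (previous_ft - current_ft) / elapsed_s


def ft_per_s_to_mph(speed_ft_per_s: float) -> float:
    """Convert feet per second to miles per hour (5280 ft per mile, 3600 s per hour)."""
    return speed_ft_per_s * 3600.0 / 5280.0
```

Going from ten feet to five feet in one second gives a closing speed of 5 feet per second, or about 3.4 miles per hour, which is in the walking-pace range mentioned above.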
Host: I still find it hard to believe it can be that sure. I mean, what if it's pouring rain? Does the car think every single raindrop is a tiny little obstacle it needs to swerve around?
Guest: That's actually one of the biggest hurdles. To the laser, a thick raindrop or a snowflake is a solid object. It hits the water and bounces back, which could make the car think the air is full of tiny walls. To get around this, the car uses filters. It knows that a raindrop is very small and disappears almost instantly, while a parked car stays in the same spot and has a much larger, steadier group of dots. The computer basically ignores the noise. It looks for the dots that stay consistent. It also uses other tools like cameras and radar to double check what the lasers are seeing. If the laser says there's a wall of water but the camera sees the road clearly, the car can decide to trust the camera more in that moment.
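One way to picture "keep the dots that stay consistent" is a persistence filter over a few recent scans. The sketch below is an assumption-laden toy, not a description of any production filter: it chops space into small cubes (voxels), counts how many scans put a dot in each cube, and keeps only dots from the latest scan whose cube was hit often enough. The names `voxel_size` and `min_hits` are invented for the example, and the scans are assumed to be already expressed in one fixed frame of reference.

```python
from collections import Counter
from math import floor


def voxel_of(point, voxel_size):
    """Index of the small cube that contains this point."""
    return tuple(floor(coord / voxel_size) for coord in point)


def persistent_points(frames, voxel_size=0.2, min_hits=2):
    """Keep points from the latest frame whose voxel was occupied in
    at least min_hits of the supplied frames.

    frames: list of scans, oldest first; each scan is a list of (x, y, z).
    """
    hits = Counter()
    for frame in frames:
        # A set, so several dots in one voxel count once per frame.
        hits.update({voxel_of(p, voxel_size) for p in frame})
    latest = frames[-1]
    return [p for p in latest if hits[voxel_of(p, voxel_size)] >= min_hits]
```

A raindrop that shows up in one scan and is gone in the next never reaches the hit threshold, while a parked car keeps filling the same cubes scan after scan.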
Host: So it's really more like the car is building a 3D model in its head that it constantly updates. It's not just seeing; it's predicting.
Guest: That's the best way to put it. It's building a live map of the world that changes every millisecond. The car is always asking, where was this dot a moment ago and where will it be a second from now? It's even smart enough to fill in the blanks. If a person walks behind a parked van, the laser can no longer see them because the van is blocking the light. But the car doesn't just forget the person exists. It remembers that a person-shaped box went behind the van, and it expects that box to pop out on the other side. It's basically dreaming up a version of the street so it can stay safe even when it can't see everything perfectly.
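The "remember the box behind the van" behavior can be sketched as a track that coasts on its last known velocity whenever no detection arrives. This is a minimal constant-velocity illustration; the `Track` class and `step` function are hypothetical, and real trackers typically blend prediction and measurement more carefully (a Kalman filter is a common choice).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Track:
    x: float  # position in meters
    y: float
    vx: float  # velocity in meters per second
    vy: float
    missed_frames: int = 0  # consecutive scans with no matching detection


def step(track: Track, dt: float,
         detection: Optional[Tuple[float, float]] = None) -> Track:
    """Advance a track by dt seconds.

    With no detection (for example, the object is hidden), the track
    coasts in a straight line at its last velocity. With a detection,
    the position snaps to it and the velocity is re-estimated from the
    displacement since the previous (possibly predicted) position.
    """
    if detection is None:
        return Track(track.x + track.vx * dt, track.y + track.vy * dt,
                     track.vx, track.vy, track.missed_frames + 1)
    new_x, new_y = detection
    return Track(new_x, new_y,
                 (new_x - track.x) / dt, (new_y - track.y) / dt, 0)
```

Calling `step` with no detection for several scans moves the predicted box along its path behind the obstruction, and `missed_frames` records how long the object has gone unseen, which a system could use to decide when to stop trusting the prediction.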
Host: The car is really just tracking the ghost of where we used to be to figure out where we're going next.
Guest: The street looks less like a row of buildings and more like a moving dance of shapes that the car has to keep in its memory to stay on the road.
Host: That spinning bucket on the roof is really just a way to make sure the car never loses its place in the story of the street.