Why AI chatbots confidently make things up

Technology · 5 min listen

Transcript

Host: It's so strange when you're talking to a chatbot and it tells you something that sounds totally right, but then you look it up and it's just plain wrong. It feels like the bot is looking you right in the eye and lying, even though we know it doesn't have eyes or a brain. I have been thinking about why these smart tools can be so wrong and so sure of themselves at the same time. Why does a tool built for information end up making things up?

Guest: It helps to start with what these bots are actually doing when we type a prompt. We call it talking, but for the bot, it's more like a very high-stakes game of fill in the blanks. Think about the way your phone tries to guess the next word you're going to type in a text message. If you type "see you," it might guess "soon" or "later." These chatbots are doing that, just on a much bigger scale. They have read almost everything on the open web, so they know which words usually go together. When you ask a question, the bot isn't looking at a list of facts. It's just picking the next word that sounds most likely to follow the words before it, based on all the patterns it has seen.
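The phone-keyboard comparison can be sketched in a few lines of Python. This is a deliberately tiny illustration, not how a real chatbot is built: it only counts which word follows which in a made-up corpus, while real systems use neural networks trained on far more text and take the whole preceding passage into account. The names `CORPUS`, `train_bigrams`, and `guess_next` are invented for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only).
CORPUS = [
    "see you soon",
    "see you soon",
    "see you later",
    "the sky is blue",
]

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def guess_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

model = train_bigrams(CORPUS)
print(guess_next(model, "you"))  # soon
print(guess_next(model, "is"))   # blue
```

The model answers "blue" after "is" only because that pairing appears in its data. Nothing in the code knows what a sky is.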

Host: So it's just a guessing game. But that feels like such a huge gap between what it looks like and what's actually happening. If it's just guessing words, why does it feel like it's trying to teach me something?

Guest: Because the patterns it learned are very good. It has seen millions of pages of people being helpful, writing news stories, or explaining science. So it copies that tone. The problem is that the bot doesn't have a map of the real world. When you ask me what color the sky is, I think of the sky and say blue. When you ask a bot, it just knows that in its massive pile of data, the word "sky" is often followed by the word "blue." It doesn't know what a sky is or what blue looks like. It just knows the words fit. So when it gets to a niche topic where it doesn't have a clear pattern to follow, it doesn't stop. It just keeps on guessing the next most likely word. Since it was trained on confident-sounding text, it keeps that confident tone even when it's totally off the rails.

Host: That sounds like a big problem if we want to use these for anything serious. Is there a way to give it a sense of what's real? Like, can we just give it a big book of facts to check before it speaks?

Guest: People are working on that right now. One of the main ways we try to fix this is by giving the bot an open book test. Instead of letting it just guess from memory, we tell it to search a specific set of trusted files first. We call this tying the bot down to a source. It finds a piece of text that has the answer, and then its job is to put that answer into a nice sentence for you. It helps a lot because the bot has a reference point. But even then, things can go wrong. The bot might find the right fact but still trip up when it tries to turn that fact into a sentence. It might swap a name or a date because, at the end of the day, it's still just a word guesser. It doesn't actually understand that a date being wrong by one year makes the whole thing a lie.
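The open book test can be sketched roughly as follows. Everything here is an assumption made for illustration: the file names and their contents are invented, and the word-overlap scoring stands in for whatever search method a real system would use. The sketch looks up the best-matching passage in a small set of trusted files and declines to answer when nothing matches well enough.

```python
import re

# Hypothetical "trusted files"; names and contents are made up.
TRUSTED_DOCS = {
    "returns.txt": "Items can be returned within 30 days of delivery.",
    "shipping.txt": "Standard shipping takes 5 business days.",
}

def tokenize(text):
    """Lowercase the text and return its set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, min_overlap=2):
    """Return (name, passage) sharing the most words, or None if too few."""
    q = tokenize(question)
    best_name, best_score = None, 0
    for name, passage in docs.items():
        score = len(q & tokenize(passage))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_overlap:
        return None
    return best_name, docs[best_name]

def answer(question, docs):
    """Answer only from a retrieved passage; otherwise decline."""
    hit = retrieve(question, docs)
    if hit is None:
        return "I could not find that in the trusted files."
    name, passage = hit
    # Quote the passage verbatim instead of rewording it.
    return f"According to {name}: {passage}"

print(answer("How many days does standard shipping take?", TRUSTED_DOCS))
print(answer("Who won the chess final?", TRUSTED_DOCS))
```

This sketch quotes the source word for word, which sidesteps the rewording step where a name or date can get swapped. A real chatbot would rephrase the passage, and that step is exactly where the errors described above can creep back in.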

Host: It seems like we're trying to put a leash on something that was designed to run wild. If the whole system is built on guessing, can we ever truly trust it not to make things up? Or is that just a part of the deal we make when we use it?

Guest: That's the big debate in the field right now. Some experts think we can keep adding more layers of fact checking until the errors are so rare they don't matter. Others think this tendency to make things up is actually the same thing as the bot being creative. The reason a bot can write a poem or a funny story is that it can put words together in new ways. If you make it so it can only say things that are 100 percent proven, you might end up with a bot that's very boring or one that refuses to answer most questions. We're trying to find a middle ground where it can be smart and creative without telling you that a fake war happened in 1945. It's a tough balance because the very thing that makes them sound like humans is the same thing that leads them to make mistakes.

Host: I guess it's like having a friend who's a great storyteller but gets the details wrong every time he tells a joke.

Guest: That's a perfect way to put it. We just have to decide if we want a friend who tells great stories or a boring one who never gets a single fact wrong.

Host: The more we use these tools, the more we have to remember that they're built to sound right, not necessarily to be right.

Guest: One of the biggest steps for us is just learning to treat these bots more like a brainstorm partner and less like a search bar that can't fail.

Host: Maybe the real fix isn't in the code at all, but in how we choose to listen to what the machine says.

Made with Wander