How a small open-source AI model rattled the industry

Technology · 5 min listen

Transcript

Host: It feels like for a long time, the story of big tech was always about who had the most money and the biggest sheds full of humming computers. But then this one small file got out onto the web and suddenly the giants seemed a lot less sure of themselves. Why did this one little model cause such a massive stir?

Guest: It really comes down to the fact that it broke the main rule we all believed in. For years, the thought was that to make a smart tool, you had to make it huge. You needed billions of dollars and more power than a small city. Then a model called Llama got out. It was small enough to run on a normal computer you could buy at a store. People realized that if the recipe is good enough, you don't need a giant kitchen. Within weeks, people at home were taking this base model and tweaking it to do things that the big companies thought would take them years to figure out. It proved that the smarts were in the math and the data, not just the size of the bill.

Host: But if a company like Google or OpenAI has billions to spend, why would they care if a few people are messing around with a smaller version on their laptops?

Guest: They care because it meant they lost their wall. In business, people talk about a moat, which is just a big gap that keeps rivals away. Their moat was the cost of the gear. If it costs a hundred million dollars to train a brain, then only a few players can play the game. But when this small model showed up, that moat dried up overnight. An engineer at one of those big firms even wrote a note that leaked out. He said they had no moat, and neither did their rivals. He saw that people working for free in their spare time were doing things faster and better than the teams with the big budgets. When you give the tools to everyone, the pace of new ideas just explodes. You go from a few labs doing tests to millions of people trying things all at once.

Host: I find it hard to believe that some guy in his basement can actually beat a team of a thousand experts just because he has the same recipe. Is it really as good as the giant ones?

Guest: Well, the giant ones are still the kings for the really hard stuff, but the gap is closing fast. Here is why. Think of a giant model like a massive library that has every book ever written, but most of them are junk. A small model is like a tiny shelf of just the best books. If you train the small one on really high-quality data, it can punch way above its weight. People found that if you take a small model and give it very specific, very clean information, it can act almost as smart as the big ones for most tasks we actually do every day. Plus, because it's small, it's fast. You can talk to it and it answers back instantly. You don't have to wait for a signal to go to some huge data center and back. That speed and the fact that it stays on your own device is a huge win for things like privacy.

Host: So is the era of these massive, secret models basically over then?

Guest: Not quite over, but the focus has shifted. The big guys are now trying to prove that their size still gives them an edge in deep reasoning or planning. But they're also being forced to release their own small versions now. They realized that if they don't give the builders a small version to play with, the builders will just use the free ones. It has turned into a race to be the most helpful, rather than just the biggest. And the weirdest part is that the open versions, the ones where the recipe is public, are getting better at a rate we have never seen. People are finding ways to shrink these models even more, squeezing them into things like phones or even tiny chips in a car. It's not just about a chat box anymore. It's about putting a little bit of smarts into every single object we own.

Host: It's a bit scary though, right? If anyone can grab the code and change it, do we lose control over how these things are used?

Guest: That's the big worry. When a model is closed and secret, the company can put guardrails on it. They can try to stop it from saying mean things or giving out dangerous help. When it's open, those locks can be picked. Someone can just go into the code and strip the safety parts away. So there's this tension. On one hand, you have this amazing burst of new ideas because everyone can help build it. On the other hand, you have no way to pull it back once it's out there. It's like the invention of the printing press. You can use it to spread great books, or you can use it to spread lies. Once the tech is small and easy to share, you can't put the genie back in the bottle.

Host: The whole thing makes the future feel a lot more crowded with these little brains.

Guest: These small models prove that the most powerful thing in the world isn't a giant pile of chips, but a clever idea that anyone can run.

Host: That old computer on my desk might just be the most powerful tool I own after all.

Made with Wander
