How modern codecs shrink video without hurting quality

Technology · 6 min listen

Transcript

Host: We spend hours every day watching video on our phones and laptops without ever really thinking about the plumbing behind it. It feels like a bit of a miracle that a sharp, clear movie can just play right away over a shaky wireless signal.

Host: What's actually happening to those huge video files to make them small enough to fly through the air?

Guest: It's all about being a very smart cheat. If we tried to send every single dot of light for every single frame of a movie, the whole internet would basically break. To give you an idea, one second of a high-end movie in its raw form would be huge. It would be like trying to download hundreds of big photos every single second. Your phone just couldn't keep up with that much data. So we use what we call a codec. It's just a bit of code that shrinks the file on one end and grows it back on your screen. The way it works is by throwing away almost everything and only keeping the parts your brain actually cares about.
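
To put a rough number on the guest's point, here is a minimal sketch of the arithmetic. The resolution, frame rate, and bytes per pixel are assumptions chosen for illustration, not figures from the episode.

```python
# Hypothetical figures, not from the episode: 1080p, 24 frames per second,
# 3 bytes per pixel (one byte each for red, green, and blue).
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 3
FRAMES_PER_SECOND = 24


def raw_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bytes needed to send every pixel of every frame with no compression."""
    return width * height * bytes_per_pixel * fps


raw = raw_bytes_per_second(WIDTH, HEIGHT, BYTES_PER_PIXEL, FRAMES_PER_SECOND)
raw_megabits = raw * 8 / 1_000_000

print(f"{raw / 1_000_000:.1f} MB per second, {raw_megabits:.0f} Mbit/s")
# prints: 149.3 MB per second, 1194 Mbit/s
```

Under these assumptions, a single second of uncompressed video is well over a gigabit, far more than a shaky wireless link can carry.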

Host: But if you're throwing away almost everything, shouldn't it end up looking terrible? I have seen those old videos online that look like they're made of big, ugly squares.

Guest: That happens when the code is pushed too far or the internet is too slow. But when it's done well, it uses some clever tricks to hide the gaps. The first big trick is just not repeating yourself. Think about a video of a person standing in front of a brick wall. From one frame to the next, the wall doesn't move at all. The person might move their head a little, but most of the picture stays exactly the same. The code is smart enough to see that. Instead of sending the whole picture again, it just says, hey, use the wall from the last frame and only update the bits where the person moved. We're basically reusing the past to save space in the now.
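
The "reuse the wall" idea can be sketched in a few lines. This is a toy illustration, not how any particular codec is implemented: frames are small grids of numbers, and the block size and function names (`changed_blocks`, `apply_updates`) are hypothetical.

```python
BLOCK = 2  # hypothetical block size in pixels


def changed_blocks(prev, curr, block=BLOCK):
    """Return {(row, col): pixels} for every block that differs from prev."""
    updates = {}
    for r in range(0, len(curr), block):
        for c in range(0, len(curr[0]), block):
            old = [row[c:c + block] for row in prev[r:r + block]]
            new = [row[c:c + block] for row in curr[r:r + block]]
            if new != old:
                updates[(r, c)] = new
    return updates


def apply_updates(prev, updates):
    """Rebuild the current frame from the previous frame plus the updates."""
    frame = [row[:] for row in prev]
    for (r, c), pixels in updates.items():
        for dr, row in enumerate(pixels):
            frame[r + dr][c:c + len(row)] = row
    return frame


wall = [[7] * 4 for _ in range(4)]   # a static "brick wall"
moved = [row[:] for row in wall]
moved[0][0] = 9                      # one pixel where the person moved

updates = changed_blocks(wall, moved)
rebuilt = apply_updates(wall, updates)

print(len(updates), "of 4 blocks sent")  # prints: 1 of 4 blocks sent
```

Only the one block that changed is sent; the receiver copies the other three from the frame it already has.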

Host: That makes sense for a wall, but what about when there's a lot of action? If a car zooms across the screen, the whole view is changing. Does the code just give up then?

Guest: Not at all. It gets even craftier there. It uses something called motion prediction. The code looks at a block of pixels, like a car door, and sees it moving from right to left. Instead of drawing that car door over and over, it just tells your phone, take that block of pixels we already have and slide it over there. It's like a puppet show where you move the same cutout across the stage instead of painting a whole new scene for every tiny movement. If the car door changes a little because of a reflection or a shadow, the code just sends a tiny "fix" to go on top. It only sends the difference between what it guessed and what's actually happening.
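
Here is a minimal sketch of the "slide the block, then send a fix" idea, assuming a brute-force search over nearby positions scored by the sum of absolute differences. The frame contents, search range, and function names are hypothetical, and real encoders use far more elaborate searches.

```python
def get_block(frame, top, left, size):
    """Cut a size-by-size block out of a frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_motion_vector(prev, curr, top, left, size, search=2):
    """Find the (dy, dx) offset into prev that best predicts the block in curr."""
    target = get_block(curr, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            src_top, src_left = top + dy, left + dx
            if src_top < 0 or src_left < 0:
                continue
            if src_top + size > len(prev) or src_left + size > len(prev[0]):
                continue
            cost = sad(target, get_block(prev, src_top, src_left, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best[1]


def residual(prev, curr, top, left, size, mv):
    """The 'fix': actual block minus the block predicted by sliding prev."""
    predicted = get_block(prev, top + mv[0], left + mv[1], size)
    actual = get_block(curr, top, left, size)
    return [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(actual, predicted)]


# Previous frame: a 2x2 "car door" on the right side of a 4x6 frame.
prev = [[0] * 6 for _ in range(4)]
prev[1][4:6] = [50, 60]
prev[2][4:6] = [70, 80]

# Current frame: the door has slid two pixels left, with one pixel
# slightly brighter (a "reflection").
curr = [[0] * 6 for _ in range(4)]
curr[1][2:4] = [50, 65]
curr[2][2:4] = [70, 80]

mv = best_motion_vector(prev, curr, top=1, left=2, size=2)
fix = residual(prev, curr, top=1, left=2, size=2, mv=mv)

print(mv)   # prints: (0, 2)
print(fix)  # prints: [[0, 5], [0, 0]]
```

The vector `(0, 2)` says "fetch the block from two pixels to the right in the previous frame", and the fix is almost entirely zeros, which is what makes it cheap to send.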

Host: Wait, it sounds like the computer is doing a lot of guessing. Is that why sometimes if a video glitches, you see a weird smear of colors that looks like a ghost moving across the screen?

Guest: Yeah, that's exactly what you're seeing! That happens when your phone loses the "key" frame, which is the full picture the code was using as a base. If the phone misses that base picture, it keeps trying to move those old pixels around based on the instructions it's getting. It's trying to move a car door that isn't there anymore, so it smears whatever was left over. It shows you just how much of what we watch is actually a guess.
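
A toy, one-row sketch of why a missed key frame smears the picture. It assumes a made-up instruction format (slide the whole row left by some amount, then add a fix); the pixel values and the name `decode_delta` are hypothetical.

```python
def decode_delta(reference, shift, fix):
    """Predict a row by sliding the reference left by `shift`, then add the fix."""
    last = len(reference) - 1
    predicted = [reference[min(i + shift, last)] for i in range(len(reference))]
    return [p + f for p, f in zip(predicted, fix)]


# The key frame the encoder based its instructions on: a "car" near the right.
key_frame = [0, 0, 0, 0, 90, 90, 0, 0]
# What the phone still holds if that key frame never arrived: an older scene.
stale_frame = [30, 30, 30, 40, 40, 40, 0, 0]

# The instruction that follows: slide two pixels left, brighten one pixel.
shift = 2
fix = [0, 0, 5, 0, 0, 0, 0, 0]

good = decode_delta(key_frame, shift, fix)
smeared = decode_delta(stale_frame, shift, fix)

print(good)     # prints: [0, 0, 95, 90, 0, 0, 0, 0]
print(smeared)  # prints: [30, 40, 45, 40, 0, 0, 0, 0]
```

The same instructions produce the intended picture from the right base and a dragged copy of the old scene from the wrong one.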

Host: That's a bit wild to think about. So, we're reusing old parts of the picture and guessing where things move. Are there any other ways it trims the fat?

Guest: The biggest way is by tricking your eyes. Human eyes are actually kind of lazy. We're very good at seeing how bright or dark something is, but we're much worse at seeing fine detail in colors. If you have a sharp line between light and dark, you notice it right away. But if I slightly blur the colors in a busy scene, your brain usually won't even catch it. So, before the video even gets to you, the code throws away about half of the color data. It keeps the brightness sharp so the edges look clean, but it smudges the color underneath. Since your brain fills in the gaps, you think you're seeing a perfect, rich picture, but you're actually seeing a bit of a lie.
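
The color trick is usually called chroma subsampling. The sketch below assumes a layout where each horizontal pair of color samples is averaged into one (often written 4:2:2), which matches the "about half" figure; streaming video commonly goes further and also halves the color vertically (4:2:0). The sample values and function names are hypothetical.

```python
def halve_chroma(plane):
    """Average each horizontal pair of color samples into one stored sample."""
    return [
        [(row[c] + row[c + 1]) // 2 for c in range(0, len(row), 2)]
        for row in plane
    ]


def stretch_chroma(plane):
    """Repeat each stored color sample twice to fill the original width."""
    return [[v for v in row for _ in range(2)] for row in plane]


def sample_count(plane):
    """Number of values stored in a plane."""
    return sum(len(row) for row in plane)


# Brightness is kept at full resolution, so sharp edges survive untouched.
luma = [[16, 235, 16, 235],
        [235, 16, 235, 16]]

# One color plane with gentle variation, as in most natural images.
chroma = [[100, 104, 120, 124],
          [90, 94, 60, 64]]

stored = halve_chroma(chroma)
shown = stretch_chroma(stored)
max_error = max(
    abs(a - b) for ra, rb in zip(chroma, shown) for a, b in zip(ra, rb)
)

print(stored)                                      # prints: [[102, 122], [92, 62]]
print(sample_count(stored), sample_count(chroma))  # prints: 4 8
print(max_error)                                   # prints: 2
```

Half the color samples are gone, yet each displayed color is within two levels of the original in this example, while the brightness plane is not touched at all.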

Host: I don't know if I believe that. I feel like I would notice if half the color was gone. I like to think I have pretty good eyes for that kind of thing.

Guest: Most people think that, but the math proves otherwise. We have done tests where we show people a full color version and a version with the color data cut in half, and almost nobody can tell them apart at normal speeds. It's the same for very fine details in the corners or in dark shadows. The code knows that if something is moving fast or if a part of the screen is very dark, your eye won't be able to pick up the small stuff. So it saves its energy for the parts you're actually looking at, like a person's face. It's a constant game of deciding what's important enough to keep and what's okay to toss in the bin.
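
One way to picture "saving energy for the face" is a rounding step that changes by region: fine steps where viewers look, coarse steps in dark corners. This is a simplified sketch that rounds pixel values directly; real codecs apply the idea to transformed data rather than raw pixels, and the step sizes and values here are hypothetical.

```python
def quantize(values, step):
    """Snap each value to the nearest multiple of step; a bigger step keeps less detail."""
    return [(v + step // 2) // step * step for v in values]


face = [120, 123, 127, 131]   # a region the viewer is likely to watch
shadow = [2, 4, 5, 7]         # a very dark region

face_kept = quantize(face, step=2)       # fine step: detail mostly survives
shadow_kept = quantize(shadow, step=16)  # coarse step: detail is flattened

print(face_kept)    # prints: [120, 124, 128, 132]
print(shadow_kept)  # prints: [0, 0, 0, 0]
```

The face keeps four distinct levels while the shadow collapses to one, and a run of identical values takes very little space to store.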

Host: It's funny to think that the reason we can stream movies at all is that we just aren't as good at seeing as we think we are.

Guest: Even the most beautiful movie you stream is mostly just a very smart set of guesses about what the next frame should look like.

Host: Those movies on the bus only look so sharp because the code knows exactly which parts of the screen it can get away with ignoring.

Made with Wander
