The last bastion of easy-to-believe evidence — Video — is about to become as untrustworthy as photographs and text.
We already know that just because we read “The President said blah blah blah,” the President might not actually have said it. It’s easy to fake text: all it takes is a keyboard and a fibber with an agenda. BAM! Fake News, in your face. Nobody automatically trusts that text is evidence of anything.
Photography, once worth a thousand words, is now worth nothing when it comes to evidence. Pyramids are rearranged, women become inhumanly thin and long, and people’s faces often appear atop the wrong bodies. As Jon Stewart once said, “Yeah, we have Photoshop.” Bzzzz, fake news!
We have, however, as a society, come to trust video. If we can see it happen, we believe that it happened. Sure, there’s a little blurring around the edges: the camera angle might hide some of the action, the quality might obscure some detail. But in general, seeing is believing.
That trust is about to die.
So far, creating convincing fake video has been difficult, expensive, and slow. It takes a Hollywood special-effects house and millions of dollars to make a scene that looks, feels, and sounds real. As humans, we’re good at reading human facial expressions, and we have low tolerance for RobotFace. We’re good at recognizing voices, and we know an impostor when we hear one. When we watch an approximation of a real-life scene in a movie, we marvel when it gets close.
In a year or two, creating fake video will become quick, cheap, and easy. How?
Here’s one way: Adobe is working on a project called VoCo. It can take 20 minutes of an audio track of a person speaking, and create a voice model based on it. Then you can type in arbitrary text, and hear the person speak your text. In a believable fashion. It sounds good, and will undoubtedly get much better before its near-future launch.
Here’s another: a group of university researchers has demonstrated a system that maps your face, captured from a webcam, onto another face. Any facial expression, tic, or emotion that crosses your face is reflected on the other. In this demonstration of Face2Face, they show an everyday joe driving The Donald’s face with his own. And it looks really, really good.
These technologies will be finding their way into consumer products, for fun, gaming, ebook-reading, live-mediated acting, and thousands of other applications. As humans, we’ve always enjoyed mimicry and identity-bending, and these technologies will be all the rage at a Christmas not too far from now.
But what does this mean for recordings of news? The next time you see one man killing another in a video, will you believe what you’re seeing? If you see a politician make a speech and say something controversial, will you wonder if it’s her, or will you assume that it is some kid with Adobe’s Creative Cloud?
As a society, we’ve developed ways of dealing with easy-to-fake evidence. We’ve evolved “authority” — voices that we trust like the Police, News organizations, the Government. At least — our parents used to trust them. We’ve been raised to question everything we’re told, whether it’s news about the President or science about the weather. Many people find it easier to believe completely falsified “news” if it confirms their existing biases and worldview, regardless of the truth. We’ve left “Truth” behind and become a “Truthy” culture, in which nobody can convince anybody of the truth of anything.
Even our major tool in convincing others — Evidence — is about to become “Evidency.”
And down the rabbit-hole we go.