Slashdot Mirror


Researchers Have Figured Out How To Fake News Video With AI (qz.com)

An anonymous reader quotes a report from Quartz: A team of computer scientists at the University of Washington has used artificial intelligence to render visually convincing videos of Barack Obama saying things he's said before, but in a totally new context. In a paper published this month, the researchers explained their methodology: Using a neural network trained on 17 hours of footage of the former U.S. president's weekly addresses, they were able to generate mouth shapes from arbitrary audio clips of Obama's voice. The shapes were then textured to photorealistic quality and overlaid onto Obama's face in a different "target" video. Finally, the researchers retimed the target video to move Obama's body naturally to the rhythm of the new audio track. In their paper, the researchers pointed to several practical applications of being able to generate high-quality video from audio, including helping hearing-impaired people lip-read audio during a phone call or creating realistic digital characters in the film and gaming industries. But the more disturbing consequence of such a technology is its potential to proliferate video-based fake news. Though the researchers used only real audio for the study, they were able to skip and reorder Obama's sentences seamlessly and even use audio from an Obama impersonator to achieve near-perfect results. The rapid advancement of voice-synthesis software also provides easy, off-the-shelf solutions for compelling, falsified audio. You can view the demo here: "Synthesizing Obama: Learning Lip Sync from Audio"

2 of 87 comments

  1. Lipreader says no. by mykro76 · Score: 5, Informative

    I'm deaf and have been lipreading for more than 40 years. I can confirm these videos are not lip-readable: many words are only half formed (one syllable where there should be two) and the mouth transitions are too jerky. It's a good attempt and I'm positive the tech will just keep getting better, but it's not there yet.

  2. Autist says no. by Anonymous Coward · Score: 5, Informative

    Being on the autism spectrum, I have a tendency to focus on people's mouths when they speak. I would characterize the quality of the generated content as abysmal.

    One of the major giveaways is that phonemes which involve the lips interacting with teeth are way off. Just not even close. The word "visited" looks for all the world like what's being said is "dizited". How can they generate a 'd' motion for a 'v' sound and still have the balls to publish their paper, let alone make any sort of claims that it's believable? It's absolutely galling, precisely because it's only accurate enough to fool people who want to be fooled, leaving those of us who know better shouting weakly from the proverbial back of the room.