Deepfake Arms Race

Woman hiding behind white mask © iStock / EPFL 2021

Stories of fakes, forgeries, fortunes and folly have intrigued people throughout the ages, from the Athenian Onomacritus, who around 500 BC was said to have been a forger of old oracles and poems, to Shaun Greenhalgh, who between 1978 and 2006 infamously created hundreds of Renaissance, Impressionist and other art forgeries, amassing more than a million euros and ultimately spending time in prison.

At the beginning of the digital twenties, with the AI and machine learning behind deepfakes ever easier to access, EPFL professor Touradj Ebrahimi says, “we are at a tipping point. We have democratized forgery, and once that has happened, trust disappears.”

Deepfakes are synthetic media in which a person or thing in an existing image, video or other medium is replaced with someone or something else’s likeness to create fake content. They are produced using deep learning methods, typically by training generative neural network architectures such as autoencoders or GANs (generative adversarial networks).
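
To make that adversarial training concrete, here is a minimal sketch of a GAN in PyTorch. The toy architectures and sizes are assumptions for illustration, not any production deepfake model; the point is the loop in which the generator and the discriminator improve against each other.

```python
# Minimal GAN training sketch in PyTorch (toy architectures, for illustration only).
# The generator learns to map random noise to images; the discriminator learns to
# tell generated images from real ones, and each improves against the other.
import torch
import torch.nn as nn

latent_dim, image_dim = 100, 64 * 64  # assumed toy sizes, not a real deepfake model

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images):
    """real_images: (batch, image_dim), values in [-1, 1]."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator to separate real images from generated ones.
    noise = torch.randn(batch, latent_dim)
    fake_images = generator(noise).detach()  # detach: don't update G in this step
    loss_d = bce(discriminator(real_images), real_labels) + \
             bce(discriminator(fake_images), fake_labels)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator.
    noise = torch.randn(batch, latent_dim)
    loss_g = bce(discriminator(generator(noise)), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

Deepfake pipelines build far more elaborate models, such as StyleGAN2, on this same principle.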

Deepfakes in the garden of good and evil

Despite exploding onto the scene only four years ago, deepfakes are now notorious for their use in nonconsensual celebrity and revenge pornography, fake news, hoaxes and fraud. But there is a positive side too.

The technology has been used in everything from public health campaigns to education and cultural installations. In late 2020 the former professional footballer David Beckham was digitally transformed into a 70-year-old man for the Malaria Must Die So Millions Can Live campaign. Historical figures have been brought back to life in museums – Salvador Dalí, for example, has “appeared” at the Salvador Dalí Museum in St. Petersburg, Florida.

In entertainment, deepfake technology is used to create locations and to enable “ghost or hologram acting.” In the 2019 film Star Wars: The Rise of Skywalker, for example, Carrie Fisher was featured as Princess Leia three years after the actor’s death.

The end of the supermodel

In another corner of EPFL, Sabine Süsstrunk, professor and head of the Image and Visual Representation Lab in the School of Computer and Communication Sciences, demonstrates her latest work.

“We took the pretrained StyleGAN2 model and found the semantic vectors that create the eyes or mouth or nose, refining them so we could edit locally. Say you are creating a fake image and you like it, but you don’t like the eyes. You can use another fake reference image and start changing them. Now we can even change mouths and eyes and ears without needing a reference image. I can easily modify a face from serious to a smile, from big eyes to small, nose up, nose down.”
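
A hedged sketch of the kind of latent-space editing Süsstrunk describes, in PyTorch. The generator below is a stand-in nn.Sequential and the “smile” direction is a random placeholder; in her lab’s work the generator is a pretrained StyleGAN2 and the semantic vectors are found by analyzing its latent space, with edits further localized to regions such as the eyes or mouth.

```python
# Conceptual sketch of semantic editing in a GAN's latent space. The generator G
# and the "smile" direction are placeholders for illustration; the actual method
# uses a pretrained StyleGAN2 and learned semantic vectors, and localizes edits.
import torch
import torch.nn as nn

latent_dim, image_dim = 512, 64 * 64
G = nn.Sequential(nn.Linear(latent_dim, 1024), nn.ReLU(),
                  nn.Linear(1024, image_dim), nn.Tanh())  # stand-in for StyleGAN2

# A semantic direction is a vector in latent space along which one attribute
# (say, "smile") varies while others stay roughly fixed. Random here, purely
# as a placeholder; in practice it is discovered by analyzing the latent space.
smile_direction = torch.randn(latent_dim)
smile_direction /= smile_direction.norm()

def edit_face(w: torch.Tensor, direction: torch.Tensor, strength: float) -> torch.Tensor:
    """Shift a latent code along a semantic direction and re-synthesize the image."""
    return G(w + strength * direction)

w = torch.randn(latent_dim)                   # latent code of one fake face
serious = edit_face(w, smile_direction, -2.0)  # e.g. toward "serious"
smiling = edit_face(w, smile_direction, +2.0)  # e.g. toward "smiling"
```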

A key potential use of these deepfakes is advertising. As Süsstrunk says, it might be the end of the supermodel. “These are fake people pretending to be a fake something. You have no copyright issues, no photographer, no actor, no model. We can’t do the body yet, but that’s just a matter of time.”

It’s these kinds of images that Ebrahimi’s research is targeting. As head of the Multimedia Signal Processing Laboratory in the School of Engineering, he has worked in compression, media security and privacy throughout his career. Four years ago he also began focusing on a new problem – how AI can be used to breach security in general. Deepfakes are a clear example of this.

A game of cat and mouse

“As the problem is caused by AI, I wondered whether AI can also be part of the solution. Can you fight fire with fire?” he says. “We create deepfakes and detect them, making the algorithms challenge each other, getting better in what they do. But it’s an arms race, or a game of cat and mouse. And when you’re in that game, you want to make sure you’re not the mouse. Unfortunately, we are the mice and this game is not winnable beyond the short term.”
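
On the detection side of this cat-and-mouse game, the detector is itself a learned model. Below is a minimal sketch of one as a binary real-versus-fake classifier in PyTorch; the architecture and input size are toy assumptions, not Ebrahimi’s actual system. Because such a detector is trained on the fakes available today, each new generation of generators can force it to be retrained, which is the arms race he describes.

```python
# Minimal sketch of a deepfake detector: a binary classifier labeling images
# real (1.0) or fake (0.0). Toy CNN for 64x64 RGB input, for illustration only.
import torch
import torch.nn as nn

detector = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 1),  # one logit: real vs. fake
)
opt = torch.optim.Adam(detector.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """images: (batch, 3, 64, 64); labels: (batch, 1), 1.0 = real, 0.0 = fake."""
    logits = detector(images)
    loss = loss_fn(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```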

In addition to detection, Ebrahimi has also started working on the idea of provenance, the issue that brought down master forger Shaun Greenhalgh in the early 2000s. In digital media it’s an approach in which metadata is embedded in content when it’s created, certifying its source and history. One industry initiative is the Coalition for Content Provenance and Authenticity (C2PA), led by Adobe, Microsoft and the BBC. In parallel, Ebrahimi is working with the JPEG Committee to develop a universal, open-source standard under the International Organization for Standardization (ISO). Provenance won’t prevent manipulation, but it should transparently provide end users with information about the status of any digital content they encounter.
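
The core idea behind signed provenance can be sketched briefly. The following is a simplified illustration in the spirit of C2PA, not its actual manifest format: a claim about the content’s origin is cryptographically bound to the content bytes, so altering either the content or the metadata breaks verification. This speaks to Ebrahimi’s later point that unsecured metadata can be forged as easily as the content. It uses the pyca/cryptography package, with key management deliberately omitted; the creator string is just an illustrative label.

```python
# Simplified provenance sketch: bind an origin claim to the content bytes with a
# digital signature, so tampering with either content or metadata is detectable.
import json
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

def make_claim(content: bytes, creator: str, private_key) -> dict:
    claim = {
        "creator": creator,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    return {"claim": claim, "signature": private_key.sign(payload).hex()}

def verify_claim(content: bytes, record: dict, public_key) -> bool:
    payload = json.dumps(record["claim"], sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(record["signature"]), payload)
    except Exception:
        return False  # forged or altered metadata
    # Signature is valid; now check the content itself hasn't been swapped out.
    return record["claim"]["content_sha256"] == hashlib.sha256(content).hexdigest()

key = ed25519.Ed25519PrivateKey.generate()
record = make_claim(b"...image bytes...", "example-creator", key)
assert verify_claim(b"...image bytes...", record, key.public_key())
```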

Technology versus society

Süsstrunk agrees that detection is a short-term game, and supports provenance, adding that her most recent deepfakes would be undetectable because the digital assets contain no artifacts. She would also like to see the conversation focused as much on the philosophical as on the technical.

“We need to get more sophisticated in explaining what digital and AI actually mean. There is no intelligence in artificial intelligence – I’m not saying we won’t get there, but at this point in time we are misusing the terminology. If somebody creates a deepfake, there’s no computer system trying to screw with you. There’s a person with either good or bad intent behind it. I truly believe that education is the answer – this technology is not going away.”

“Often these deepfakes will be shared in closed social media groups that we don’t have access to. There’s a whole closed world that is a conduit for any kind of fake information that the rest of us will know nothing about. That is not a technical discussion anymore, but one that includes societal values and the regulation of tech companies.”

Looking ahead, Ebrahimi is concerned about a lack of provenance or standardization activity beyond visual information. “Recently, we were asked by Swiss television to create a deepfake of the Swiss President Guy Parmelin, and those who detected that it was a deepfake did so from the audio, not the video. Even if you have perfect audio and a perfect video, the synchronization between the two is extremely difficult to handle. I want to deal with deepfakes in a multimodal way, to address the relationship between audio and video. I’ll also be working on the tools for the security of provenance – if you can forge the content, you can forge the metadata. So, this will be critical to making that approach work.”
