Deepfakes as a security risk: what's behind them?

Jan Tissler

"Deepfakes" take deception to a new level: the voice and even the face of a person can be automatically imitated. This has consequences for security measures and, not least, enables new forms of phishing.

Many of us have already learned not to trust photos. They are too easy to edit and falsify. A similar trend is now developing in audio and video. One particularly astonishing category here goes by the name of "deepfake": the face and voice of one person are automatically replaced by those of another. This means you can make a person appear to say whatever you want, in both audio and video.

The word deepfake is made up of "deep learning", a method in the field of artificial intelligence (AI), and fake. This means that instead of painstakingly creating the fake manually, an AI is used to help. Above all, it needs appropriate material to learn from, i.e. as many video and audio templates of the target person as possible.

The phenomenon is not entirely new. But the tools used to implement such forgeries are becoming more powerful. What a few years ago was more of a fun trick that was easy to see through is becoming increasingly precise. At the same time, suitable tools are available to anyone who is interested. The startup Deeptrace has determined that the number of deepfakes on the internet increased by 330% between October 2019 and June 2020.

And this has consequences not least for various security measures and technologies.

Example of a deepfake: The fake Tom Cruise

A recent example of an amazing deepfake is the TikTok account @deeptomcruise. In several short clips, you see the actor Tom Cruise large as life, or at least you think you do. It's hard to believe that these videos are actually fake. You have to look very closely to spot the signs.

The creator behind these viral clips, Belgian video effects specialist Chris Ume, has now explained how they were created. He enlisted the help of a professional Tom Cruise lookalike: Miles Fisher. Fisher brought two things to the project: a helpful basic resemblance to the actor and the ability to imitate his facial expressions, gestures and voice.

In the next step, the software was responsible for modifying the face so that Miles Fisher was completely transformed into his role model Tom Cruise. For simple cases, this can now be done automatically: applications with the right artificial intelligence can analyze two faces and voices, find their special features and then transform one into the other.

However, the work on the fake Tom Cruise took months and required a lot of fine-tuning. A fake as convincing as this one therefore cannot be produced casually, even today.

But it still shows what is already possible. Tomorrow, the technology will be even more advanced. And a fake does not always have to be this perfect to be effective.

Dangers from deepfakes

One area in which such false videos can cause a stir is "fake news" and disinformation campaigns. It is still possible to debunk deepfakes in many cases. But we know how quickly sensational reports spread online and how stubbornly lies persist despite all attempts to correct them.

At the same time, deepfakes can have an impact on security mechanisms. Think of video identification procedures. Today, the technology is not yet advanced enough to generate a sufficiently convincing forgery live, and technical countermeasures are available. But how much longer can we assume that the person on the screen is really the person sitting in front of the camera?

Another possible point of attack is simple facial recognition methods. Apple's "Face ID" in iPhones and iPads, for example, is not one of them because it not only evaluates the camera image, but also relies on other specialized sensors. A photo or video is therefore not enough to fool it. It is even said to be able to recognize masks.

However, not every method is this secure, especially when the necessary hardware is missing. A recent study by South Korea's Sungkyunkwan University, for example, shows that commercially available facial recognition services from providers such as Microsoft and Amazon are susceptible to deepfake attacks. In some cases, the service even found the fake to be more convincing than the original.

The good news from the study: existing detection mechanisms for deepfakes worked well in the researchers' tests. If they are placed upstream of the recognition services, those services are significantly less vulnerable.
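The idea of placing a detector upstream of a recognition service can be pictured as a simple gate. The following Python code is only a minimal sketch and not the setup used in the study: the `detector` and `recognizer` callables, the `GateResult` type and the threshold of 0.5 are hypothetical placeholders for whatever detection model and facial recognition service are actually in use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GateResult:
    accepted: bool
    identity: Optional[str]
    reason: str


def gated_recognition(
    image: bytes,
    detector: Callable[[bytes], float],             # hypothetical: returns a fake score in [0, 1]
    recognizer: Callable[[bytes], Optional[str]],   # hypothetical: returns an identity or None
    threshold: float = 0.5,                         # assumed cut-off, needs tuning in practice
) -> GateResult:
    """Run the deepfake detector first; only clean inputs reach the recognizer."""
    fake_score = detector(image)
    if fake_score >= threshold:
        # The suspicious input never reaches the recognition service.
        return GateResult(False, None, f"rejected: fake score {fake_score:.2f}")

    identity = recognizer(image)
    if identity is None:
        return GateResult(False, None, "no matching identity")
    return GateResult(True, identity, "accepted")
```

With this ordering, an input that the detector flags is rejected before the recognition service is ever called, which is what "upstream" means here. How well such a gate works depends entirely on the quality of the detector and on the chosen threshold.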

Moreover, the companies concerned are not idle: As part of a "Deepfake Detection Challenge", Amazon, Microsoft and Facebook have joined forces with several universities to research detection methods.

The danger in the audio sector is currently more pressing, as voices can already be imitated much better than faces in videos. This opens up new avenues for phishing attacks, for example, which have so far mainly relied on email. And this is not just speculation: the manager of a British energy company was apparently tricked into transferring 220,000 euros to a Hungarian supplier. He thought his German superior had asked him to do so in a phone call. The fraudsters had convincingly imitated not only the voice, but also the superior's typical manner of speaking, including the accent. And there are other examples.

Closing words

As explained in another article, the first line of defense against such attacks is to be aware that they are possible at all. Phishing often exploits our familiarity with a person to weaken our internal alarm systems. And with deepfakes, this no longer applies only to emails, but also to phone calls and, in future, even video calls.

In an interview with The Verge, video effects specialist Chris Ume compares it to Photoshop: "20 years ago, only a few people knew what photo manipulations were possible. Today it is common knowledge. The same will happen with deepfakes."