From deepfakes to Safe Fakes

August 31, 2021

These AI researchers use deepfakes to anonymize images of people, paving the way for privacy-friendly training of face detection algorithms.

Deepfakes, images or sounds generated by artificial intelligence, have a bad reputation. Still, there are useful applications for this ‘fake’ content, such as in the film industry, the medical world and in art. A new application, developed by researchers at Eindhoven University of Technology (TU/e) in collaboration with the University of Amsterdam, could well be the solution for AI research that uses privacy-sensitive images to train algorithms. Due to European privacy regulations, this research is becoming increasingly difficult. "With safe fakes, we are able to train algorithms with images of people that are not identifiable." 

For Sander Klomp, a researcher in the VCA Research group at TU/e who also works at TU/e spin-off ViNotion, it began as a very practical problem.

"At ViNotion, we create smart algorithms that, for example, allow local authorities to monitor intersections where different traffic flows converge. To train those algorithms, an awful lot of images of vehicles, cyclists and pedestrians are needed. The new EU privacy rules have made this a lot harder, as the faces can be traced back to real people. According to the EU this is in principle only allowed with their explicit consent."

The solution may seem simple: you anonymize the images, for example by making the faces blurry or grainy, or by adding a black bar (see image). As opposed to facial recognition, which requires information about somebody’s individual facial features, person detection is not interested in somebody’s looks. It is sufficient that the algorithm recognizes that it is a human being.
 

Three ways to anonymize a human face: by blurring, blocking, and by using deepfakes. Note that in this case the face also changes gender.
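
To give an idea of what the first two, traditional forms of anonymization look like in practice, here is a minimal sketch using OpenCV. The file name and the face bounding box are made-up placeholders; in a real pipeline the box would come from a face detector.

import cv2

image = cv2.imread("street_scene.jpg")          # hypothetical input image
x, y, w, h = 120, 80, 64, 64                    # hypothetical face bounding box

# Option 1: blur the face region (the kernel size controls how unrecognizable it gets)
face = image[y:y+h, x:x+w]
image[y:y+h, x:x+w] = cv2.GaussianBlur(face, (31, 31), 0)

# Option 2: block the face with a black bar instead
# image[y:y+h, x:x+w] = 0

cv2.imwrite("street_scene_anonymized.jpg", image)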

ImageNet

This is also the path chosen by the likes of ImageNet, which with over 14 million images is the largest and most widely used image database for AI research. In March 2021 it decided to blur all faces.

"But for us it's different", says Klomp. "We actually use those images to train our algorithms how people look. If their faces are always blurred, the algorithm will conclude that this is what people always look like. You can imagine what that does to the accuracy of your AI systems. There's got to be a better way!"

Deepfakes

Klomp and his colleagues immediately thought of deepfakes, artificial images or voices generated by artificial intelligence. "By replacing the faces on our images with random fake faces, you can protect the privacy of the people involved, and at the same time train your detectors."

That solution is not entirely new; the first attempt to anonymize faces using deepfakes was made in 2018, but the results at the time were quite disappointing. In the meantime, however, the algorithms that generate realistic artificial faces, so-called GANs, have gotten a lot better.

Generative Adversarial Networks, a type of machine learning model, consist of two neural networks that play a game with each other. On one side you have a generator, which generates faces at random, and on the other side a discriminator, which determines whether that face is sufficiently 'real'.

This process repeats itself countless times, until finally the generator has become so smart that it can create faces that are indistinguishable from a real face. The images can then be used to train a face detector.
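
For readers who want to see the idea in code, the sketch below shows a bare-bones adversarial training step in PyTorch. The tiny fully connected networks, image size and optimizer settings are illustrative placeholders only; the GANs that generate convincing faces are far larger and more sophisticated.

import torch
import torch.nn as nn

latent_dim, img_dim = 100, 64 * 64 * 3          # noise size, flattened RGB image size

# Generator: turns random noise into a (flattened) fake face image
generator = nn.Sequential(
    nn.Linear(latent_dim, 512), nn.ReLU(),
    nn.Linear(512, img_dim), nn.Tanh(),
)
# Discriminator: outputs the probability that an image is a real face
discriminator = nn.Sequential(
    nn.Linear(img_dim, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_faces):
    # real_faces: (batch, img_dim) tensor of real face images scaled to [-1, 1]
    batch = real_faces.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator to tell real faces from generated ones
    fake_faces = generator(torch.randn(batch, latent_dim)).detach()
    loss_d = bce(discriminator(real_faces), real_labels) + \
             bce(discriminator(fake_faces), fake_labels)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator to fool the discriminator
    loss_g = bce(discriminator(generator(torch.randn(batch, latent_dim))), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

Repeated over many batches, the two losses push each network to improve against the other, which is the 'game' described above.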

The face of Barack Obama with five keypoints.

DeepPrivacy

The researchers tested several GANs for their research. In the end, DeepPrivacy, which at the time of the study was the most advanced algorithm for generating fake faces, proved to be the best. It outperformed not only traditional ways of anonymizing faces (like blurring), but also other GANs.

"We see in our tests that detectors trained with DeepPrivacy's fakes achieve a detection score of around 90 percent. That's only 1 to 3 percentage points less than detectors trained on non-anonymized data. Still very good, especially when you consider the alternative: not being able to use data at all because of privacy regulations."

The reason that training on DeepPrivacy images works so much better than training on images from older GANs is that DeepPrivacy requires fewer keypoints (see image).

Klomp: "Keypoints are points that are characteristic of the face, such as the position of the eyes or the ears. The anonymizer detects these for each face, and then blacks out the rest of the face so that it becomes unidentifiable. Then the generator "sticks" a new fictional face on it. DeepPrivacy uses only seven keypoints in total, which means it is able anonymize faces as small as 10 pixels."

Practice

The researchers are very pleased with the result, because they are the first to show that you can train good face detectors on images anonymized with deepfakes.

The next step is to use DeepPrivacy, or possible successors to this GAN, to anonymize the data of ViNotion.

Klomp: "The nice thing is that our method can also be used by researchers who are not at all interested in face detection. Think of a camera on a self-driving car that has to be able to recognize other cars. These images often show identifiable people. You can, of course, anonymize them by blurring, as ImageNet does, but then you lose precision. Our method works better."

More info

Sander Klomp, Matthew van Rijn, Rob Wijnhoven, Cees G.M. Snoek and Peter de With, Safe Fakes: Evaluating Face Anonymizers for Face Detectors. The paper has been accepted at the IEEE International Conference on Automatic Face and Gesture Recognition 2021 and will be published by the end of this year. The preprint can be found here.

Sander Klomp

Sander Klomp is a PhD student in professor Peter de With's research group Video Coding and Architectures in the department of Electrical Engineering. He is currently researching the use of artificial intelligence in detecting roadside bombs. He hopes to receive his PhD in 2022. Like the research on privacy-friendly algorithms, his work on roadside bomb detection has great social relevance. "I think that's important," Klomp says.

In addition to working at TU/e, Klomp also works at ViNotion, an Eindhoven-based spin-off of TU/e founded in 2007 by Egbert Jaspers. ViNotion develops smart algorithms that are used to automatically map traffic or visitor flows. Customers include municipalities and organizers of music festivals.

TU/e has a great track record when it comes to developing smart image recognition software. Among other things, Peter de With's research group has developed algorithms that help detect esophageal cancer, automatically drive cars and design 3D models to detect calamities in an urban environment. The researchers are working closely with TU/e's AI institute, EAISI.

Media contact

Henk van Appeven
(Communications Adviser)
