South Korean researchers have created a novel framework for detecting deepfakes — images or videos that deceptively substitute one person's face for another's — as well as an accompanying dataset of fake images, generated by experts using Adobe Photoshop.
In the study, published March 9 in Applied Soft Computing, the team introduced Shallow-FakeFaceNet, a model that can detect fake images produced by humans or computers. Neural networks, of course, need to be trained on good data. Seeing a need for more samples of human-made fake images, the researchers recruited several skilled Adobe Photoshop users to produce more than 1,500 fake images, which were sorted into three difficulty levels, ranging from obviously fake to troublingly convincing. This was the group's Handcrafted Facial Manipulation dataset.
Shallow-FakeFaceNet, the researchers' classifier, was part of their end-to-end pipeline for spotting deepfakes. After cropping the faces from images, the pipeline enlarged any very small images using a method known as facial super-resolution. The team compared that technique with nearest neighbor upscaling, another way of enlarging very small images, and found that facial super-resolution avoided the unwanted pixelation the simpler method causes.
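The pixelation the researchers describe follows directly from how nearest neighbor upscaling works: each source pixel is simply copied into a block of identical pixels. A minimal pure-Python sketch (not the paper's code) of the technique, using a tiny grayscale grid for illustration:

```python
def upscale_nearest(pixels, factor):
    """Nearest-neighbor upscaling: copy each source pixel into a
    factor x factor block. The repeated blocks are what produce the
    blocky 'pixelated' look on enlarged low-resolution faces."""
    return [
        [pixels[y // factor][x // factor]
         for x in range(len(pixels[0]) * factor)]
        for y in range(len(pixels) * factor)
    ]

# A toy 2x2 grayscale "face crop" (values are illustrative).
crop = [[10, 200],
        [60, 120]]

big = upscale_nearest(crop, 3)
print(len(big), len(big[0]))  # 6 6 — every pixel became a 3x3 block
```

Super-resolution methods instead use a neural network to estimate plausible new detail between pixels, which is why, as Lee notes, they cost more compute but avoid these hard block edges.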
"Since the Facial Super Resolution technique is based on deep neural networks, it uses more computational power and takes more time than [nearest neighbor method]," Sangyup Lee, a researcher at Sungkyunkwan University and the lead author of the paper, told The Academic Times. "Because both methods [generate] the pixels when upscaling, it is an estimation of the image."
The researchers' pipeline also incorporated data augmentation. By presenting each image to the model in several transformed forms, they effectively multiplied the size of their dataset, letting them train on the same data multiple times. "For example, one image can be shown as six different images if we are using six different augmentations," Lee said.
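Lee's example can be sketched with six simple geometric transforms; the specific augmentations below (flips, rotations, a transpose) are illustrative assumptions, not the ones the paper used:

```python
# Six simple geometric augmentations of one image (a 2-D pixel grid).
hflip = lambda im: [row[::-1] for row in im]          # mirror left-right
vflip = lambda im: im[::-1]                           # mirror top-bottom
rot90 = lambda im: [list(r) for r in zip(*im[::-1])]  # rotate 90 degrees

augmentations = [
    hflip,
    vflip,
    rot90,
    lambda im: rot90(rot90(im)),                      # rotate 180 degrees
    lambda im: rot90(rot90(rot90(im))),               # rotate 270 degrees
    lambda im: [list(r) for r in zip(*im)],           # transpose
]

image = [[1, 2],
         [3, 4]]
views = [aug(image) for aug in augmentations]
print(len(views))  # 6 — one image yields six training views
```

Real pipelines typically also vary color, brightness and crops, but the principle is the same: the network never sees an exact copy of the dataset twice.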
Importantly, their approach used RGB channel information about the image rather than metadata, which is vulnerable to manipulation. "Metadata is relatively easy to change or can be erased," Lee explained. "On the other hand, RGB information is the same information that people use to observe images, and it is altered when an image is forged." Their strategy was also designed to detect fake images with professionally smoothed-out edges, which might fool existing algorithms.
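The asymmetry Lee describes can be illustrated with a toy sketch (the dictionary layout and field names here are hypothetical, not the paper's data format): stripping metadata leaves the pixel data untouched, while any forgery necessarily changes the pixels themselves.

```python
import hashlib

# Toy "image": a pixel grid plus a metadata dict (names are illustrative).
image = {
    "meta": {"camera": "X100", "edited_with": None},
    "pixels": [[120, 121], [119, 122]],
}

def pixel_fingerprint(img):
    """Hash only the pixel (RGB) data, ignoring metadata entirely."""
    flat = ",".join(str(v) for row in img["pixels"] for v in row)
    return hashlib.sha256(flat.encode()).hexdigest()

original = pixel_fingerprint(image)

# Erasing metadata is trivial and leaves the pixel data unchanged...
image["meta"].clear()
assert pixel_fingerprint(image) == original

# ...but a forgery must alter pixels, which any pixel-based detector can see.
image["pixels"][0][0] = 240
print(pixel_fingerprint(image) != original)  # True
```

Shallow-FakeFaceNet of course learns far subtler pixel-level cues than a hash comparison, but the design rationale is the same: evidence in the RGB channels cannot be erased the way metadata can.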
With the help of their Handcrafted Facial Manipulation dataset and other data, the authors compared their approach to other methods for pinpointing fake images. It performed very strongly, detecting fake images from different age groups and races as well as faces with beards, sunglasses and heavy makeup, among other challenging characteristics.
Notably, their model used a shallow network, which they found to be better than deeper networks at classifying very small images from a relatively large dataset. Yet Lee acknowledges that deeper networks may be useful when detecting more complicated deepfakes.
"Deeper networks can capture more features of complex images and better differentiate between more classes," said Lee. "If bigger fake images have a lot of classes or complex structures (e.g., more than three color channels), it might be better to utilize a deeper network."
While their technique is good at spotting very small fake images, the authors noted that it may not work as well with tiny, low-resolution faces in images with several other people. Although they hope to ensemble, or combine, multiple detectors in the future, that too may come with a downside. "If we use the ensemble method, there is an optimization problem, and [it] takes a long time to train and test compared to using only one model," Lee said.
Lee and his colleagues first became interested in deepfakes when several such fake videos went viral online. The accompanying program code was simple enough for non-specialists to use. "We thought that if these deepfake videos are used maliciously, it can cause huge social problems, including fake identification and defamation," he explained. "So, we have decided to build a robust and efficient detection mechanism utilizing powerful deep learning-based methods."
In the past, marginalized groups have fought to be able to use aliases or other alternatives to their legal names on social media. In 2014, Facebook apologized to members of the drag queen community whose accounts had been labeled fake because they were based on the users' preferred names. Yet Lee does not expect those groups would be harmed by his team's contribution.
"Detecting deepfakes can stop propagating fake profiles on social media by warning the user or the media system itself," Lee said. "We think that these politically marginalized people have less access to the mainstream media, and they mostly communicate over social media such as Facebook or Twitter. If the attacker makes a fake profile of them, it might be more impactful."
For now, the researchers are working to apply their findings in the real world. This is all the more crucial given that deepfakes have been labeled an "emerging threat" by the cybersecurity service NortonLifeLock and a potential source of disinformation by Georgetown University's Center for Security and Emerging Technology.
"Our lab is currently doing research on Deepfake-in-the-Wild detection utilizing different methods," Lee noted.
The paper, "Detecting handcrafted facial image manipulations and GAN-generated facial images using Shallow-FakeFaceNet," published in Applied Soft Computing on March 9, was authored by Sangyup Lee, Shahroz Tariq, and Simon S. Woo, Sungkyunkwan University; and Youjin Shin, State University of New York Korea.