Implicit Identity Leakage:
The Stumbling Block to Improving Deepfake Detection Generalization
Shichao Dong1,*, Jin Wang1,*, Renhe Ji1,†, Jiajun Liang1, Haoqiang Fan1, Zheng Ge1
1MEGVII Technology
{dongshichao,wangjin,jirenhe,liangjiajun,fhq,gezheng}@megvii.com
Abstract
In this paper, we analyse the generalization ability of binary classifiers for the task of deepfake detection. We find that the stumbling block to their generalization is the unexpectedly learned identity representation on images. Termed the Implicit Identity Leakage, this phenomenon has been qualitatively and quantitatively verified among various DNNs. Furthermore, based on such understanding, we propose a simple yet effective method named the ID-unaware Deepfake Detection Model to reduce the influence of this phenomenon. Extensive experimental results demonstrate that our method outperforms the state-of-the-art in both in-dataset and cross-dataset evaluation. The code is available at https://github.com/megvii-research/CADDM.
1. Introduction
Recently, face-swap abusers have used various face manipulation methods [23,36,40,78] to generate fake images/videos. These images/videos are then used to spread fake news, make malicious hoaxes, and forge judicial evidence, which has caused severe consequences. To alleviate this situation, an increasing number of deepfake detection methods [17,20,51,71,72,76] have been proposed to filter out manipulated images/videos from massive online media resources, ensuring that the remaining images/videos are genuine and reliable.
Previous methods usually dealt with the task of deepfake detection with binary classifiers [1,6,14,56,62]. These methods have achieved great accuracy in detecting the seen attacks learned from the training datasets (i.e., the in-dataset evaluations). However, when confronted with media generated by newly proposed deepfake methods (i.e., the cross-dataset evaluations), these methods often suffer significant performance drops. Though plenty of researchers have designed effective methods [41,88,89] to improve the generalization of deepfake detection models, a thorough analysis of why binary classifiers fail to perform well on the cross-dataset evaluation is still lacking.

*Equal contribution. †Corresponding author.

[Figure 1: (a) a source image, target image, fake image and new target image, with the fake image carrying both ID-1 (source) and ID-2 (target) information; (b) an identity boundary separating genuine and fake identities, in-dataset and cross-dataset.]

Figure 1. The Implicit Identity Leakage phenomenon. Since the fake image retains some features of its source image, its identity should not be completely regarded as that of its target image. As a consequence, there exists an implicit gap between genuine identities and fake identities in the training set, which is unintentionally captured by binary classifiers. When confronted with images manipulated by unseen face-swap methods, the classifier tends to misuse identity information and make false predictions.
In this paper, given well-trained binary classifiers for deepfake detection, we find that the stumbling block for their generalization ability is the mistakenly learned identity representation on images. As shown in Fig. 1(a), a deepfake image is usually generated by replacing the face of the source image with the face of the target image. However, we notice that the procedure of synthesizing the fake image [9,23,36] may cause a loss of ID information: the identity of the fake image cannot be considered the same as that of either its target image or its source image. In particular, when the face of the target image is swapped back with the face of the fake image, it is noticeable that the identity of the target image is altered.
In this way, as shown in Fig. 1(b), when learning a deepfake detection model, there exists an implicit decision boundary between fake images and genuine images based on identities. During the training phase, binary classifiers may accidentally consider certain groups of identities as genuine identities and other groups of identities as fake identities. When tested on the cross-dataset evaluation, such biased representations may be mistakenly used by binary classifiers, causing false judgments based on the facial appearance of images. In this paper, we qualitatively and quantitatively verify this phenomenon (termed the Implicit Identity Leakage) in binary classifiers of various backbones. Please see Sec. 3 and Sec. 5.2 for analyses.
Furthermore, based on such understanding, we propose a simple yet effective method named the ID-unaware Deepfake Detection Model to reduce the influence of Implicit Identity Leakage. Intuitively, by forcing models to focus only on local areas of images, less attention will be paid to the global identity information. Therefore, we design an anchor-based detector module, termed the Artifact Detection Module, to guide our model to focus on local artifact areas. Such a module is expected to detect artifact areas on images with multi-scale anchors, each of which is assigned a binary label to indicate whether an artifact exists. By localizing artifact areas and classifying multi-scale anchors, our model learns to distinguish the differences between local artifact areas and local genuine areas at a finer level, thus reducing the misuse of the global identity information.
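To make the above concrete, the following is a minimal sketch of how an anchor-based artifact detection head could be wired up in PyTorch. It is only an illustrative sketch under stated assumptions (the channel sizes, number of anchors per location, and pyramid inputs are hypothetical), not the actual Artifact Detection Module from our released code.

```python
# Sketch of an anchor-based artifact detection head (hypothetical sizes, not CADDM itself).
# Each pyramid level predicts, for every anchor, a binary "artifact present" logit and a box offset.
import torch
import torch.nn as nn

class ArtifactDetectionHead(nn.Module):
    def __init__(self, in_channels=(128, 256, 512), anchors_per_loc=3):
        super().__init__()
        self.cls_heads = nn.ModuleList(
            nn.Conv2d(c, anchors_per_loc, kernel_size=3, padding=1) for c in in_channels
        )
        self.reg_heads = nn.ModuleList(
            nn.Conv2d(c, anchors_per_loc * 4, kernel_size=3, padding=1) for c in in_channels
        )

    def forward(self, pyramid_feats):
        # pyramid_feats: feature maps at different scales (e.g., strides 8/16/32).
        cls_logits, box_deltas = [], []
        for feat, cls_head, reg_head in zip(pyramid_feats, self.cls_heads, self.reg_heads):
            n = feat.shape[0]
            cls_logits.append(cls_head(feat).permute(0, 2, 3, 1).reshape(n, -1))      # (N, H*W*A)
            box_deltas.append(reg_head(feat).permute(0, 2, 3, 1).reshape(n, -1, 4))   # (N, H*W*A, 4)
        return torch.cat(cls_logits, dim=1), torch.cat(box_deltas, dim=1)

# Toy usage with three hypothetical pyramid levels.
feats = [torch.randn(2, c, s, s) for c, s in [(128, 28), (256, 14), (512, 7)]]
cls_logits, box_deltas = ArtifactDetectionHead()(feats)   # (2, 3087) and (2, 3087, 4)
```

During training, each anchor would be assigned a binary artifact label (e.g., by its overlap with annotated artifact areas) and optimized with a classification loss plus a localization loss on the positive anchors.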
Extensive experimental results show that our model accurately predicts the position of artifact areas and learns generalized artifact features across face manipulation algorithms, successfully outperforming the state-of-the-art. Contributions of the paper are summarized as follows:
• We discover that deepfake detection models supervised only by binary labels are very sensitive to the identity information of images, which is termed the Implicit Identity Leakage in this paper.
• We propose a simple yet effective method termed the ID-unaware Deepfake Detection Model to reduce the influence of the ID representation, successfully outperforming other state-of-the-art methods.
• We conduct extensive experiments to verify the Implicit Identity Leakage phenomenon and demonstrate the effectiveness of our method.
2. Related Work
With the development of Generative Adversarial Network (GAN) [12,25,33,34,47] techniques, forged images/videos have become more realistic and indistinguishable. To deal with attacks based on different face manipulation algorithms, researchers have tried to improve their deepfake detectors [30,54,61] from different perspectives, such as designing different loss functions [7], extracting richer features [21,83], and analyzing the continuity between consecutive frames [29,57]. Most of these deepfake detection methods can be roughly summarized into two categories.
2.1. Binary Classifiers
Many researchers [1,6,14,56,62] treated the deepfake detection task as a binary classification problem. They used a backbone encoder to extract high-level features and a classifier to detect whether the input image has been manipulated. Durall et al. [22] first proposed a model analyzing the frequency domain for face forgery detection. Masi et al. [52] used a two-branch recurrent network to extract high-level semantic information in original RGB images and their frequency domains at the same time, by which the model achieved good performance on multiple public datasets. Li et al. [39] designed a single-center loss to compress the real-sample classification space to further improve the detection rate of forged samples. Binary classifiers achieved high detection accuracy on the in-dataset evaluation, but they could not maintain good performance when facing unseen forged images.
2.2. Hand-crafted Deepfake Detectors
Many works attempted to improve the generalization capability of deepfake detectors by modeling specific hand-crafted artifacts shared among different face manipulation methods. Li et al. [42] believed that some physical characteristics of a real person cannot be manipulated in fake videos; they designed an eye-blinking detector to identify the authenticity of a video through the frequency of eye blinking. Since 3D data cannot be reversely generated from a fake image, Yang et al. [84] approached the face forgery detection task from the perspective of non-3D-projection-generated samples. Sun et al. [74] and Li et al. [41] focused on precise geometric features (face landmarks) and blending artifacts, respectively, when detecting forged images. Liu et al. [46] equipped the model with frequency-domain information, since the frequency domain is very sensitive to the up-sampling operations that are common in face manipulation methods, and used a shallow network to extract rich local texture information, enhancing the model's generalization and robustness.
In summary, hand-crafted deepfake detectors guided the model to capture specific artifact features and indicated manipulated images/videos by responding to these features. However, these methods have a common limitation: when forgeries do not contain the specific artifacts that are introduced in the training phase, they often fail to work well.
3. Implicit Identity Leakage
The Implicit Identity Leakage denotes that the ID representation in the deepfake dataset is captured by binary classifiers during the training phase. Although such identity information enhances the differences between real and fake images when testing the model on the in-dataset evaluation, it tends to mislead the model on the cross-dataset evaluation. In this section, we conduct thorough experiments to verify this hypothesis. First, we conduct the ID linear classification experiment to verify that binary classifiers capture identity information during the training phase. Second, we quantify the influence of such ID representation on the in-dataset evaluation and the cross-dataset evaluation respectively, to verify its effect on the task of deepfake detection.

Figure 2. ID linear classification on frozen features of binary classifiers, evaluated on (a) Celeb-DF, (b) FF++ and (c) LFW. Results show that binary classifiers of different backbones learned ID representations of images, even without explicit supervision of identity labels.
3.1. Verifying the Existence of ID Representation
Hypothesis 1: The ID representation in the deepfake dataset is accidentally captured by binary classifiers during the training phase, even without explicit supervision.
In this section, we perform the ID linear classification experiment to verify that binary classifiers accidentally learn the ID representation on images.
Inspired by previous unsupervised pre-training methods [11,27], we trained linear classifiers on the frozen features extracted from binary classifiers to evaluate the generalization of the learned ID representation. Given a binary classifier trained on FF++ [67], we measured the linear classification accuracy of identities on features extracted from the classifier for FF++ [67], Celeb-DF [45] and the face recognition dataset LFW [32]. To be specific, we froze the features input to the last linear layer of ResNet-18/34/50 [28], Xception [13] and EfficientNet-b3 [75] to demonstrate the universality of this phenomenon.
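For reference, the following is a minimal linear-probing sketch of this protocol, assuming a frozen binary deepfake classifier wrapped so that it returns its penultimate features, and data loaders yielding (image, identity label) pairs; the names (`backbone`, `train_loader`, etc.) are hypothetical and this is not the code used for the paper's measurements.

```python
# Linear probing of identity information in frozen deepfake-detector features (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def extract_features(backbone, loader):
    """Run the frozen detector and collect its penultimate features with identity labels."""
    backbone.eval()
    feats, ids = [], []
    for images, identity_labels in loader:
        feats.append(backbone(images))      # assumed to return (N, D) penultimate features
        ids.append(identity_labels)
    return torch.cat(feats), torch.cat(ids)

def linear_probe(backbone, train_loader, test_loader, num_identities, epochs=100, lr=0.1):
    """Train only a linear layer on frozen features and report identity classification accuracy."""
    x_tr, y_tr = extract_features(backbone, train_loader)
    x_te, y_te = extract_features(backbone, test_loader)
    probe = nn.Linear(x_tr.shape[1], num_identities)
    opt = torch.optim.SGD(probe.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):                 # full-batch updates, for brevity
        opt.zero_grad()
        F.cross_entropy(probe(x_tr), y_tr).backward()
        opt.step()
    return (probe(x_te).argmax(dim=1) == y_te).float().mean().item()
```

A high probe accuracy indicates that the frozen detector features encode identity information, even though the detector was never given identity labels.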
Fig. 2 shows that linear classification on features of different classifiers converged to varying degrees and achieved varying degrees of accuracy for identity classification. Such results also indicate that although the classifiers were never trained on Celeb-DF and LFW before, they still extracted substantial information about identities from images, especially with strong backbones (e.g., EfficientNet-b3). In other words, deepfake detectors accidentally learned the ID representation of images, even without explicit supervision.

Datasets  | ResNet-18 | ResNet-34 | ResNet-50 | Xception | EfficientNet-b3
FF++      | 81.53     | 89.77     | 99.58     | 97.32    | 94.87
Celeb-DF  | 46.88     | 47.22     | 49.47     | 47.23    | 44.43

Table 1. Quantifying the influence of the ID representation on the task of deepfake detection. Results show that although the ID representation could boost the performance of the in-dataset evaluation (i.e., FF++), it hinders improvements on the cross-dataset evaluation (i.e., Celeb-DF).
3.2. Quantifying the Influence of ID Representation
Hypothesis 2: Although the accidentally learned ID representation may enhance the performance on the in-dataset evaluation, it tends to mislead the model on the cross-dataset evaluation.
After verifying the existence of the ID representation in features of binary classifiers, we performed another experiment to verify its effect on deepfake detection.
The key challenge is how to attribute the output of the binary classifier to the ID representation of the input image quantitatively. Intuitively, the identity of an image is not decided by each image region individually, e.g., mouths, eyes and noses. Instead, these regions usually collaborate with each other to form a certain pattern, e.g., the identity of the input image. Thus, we used the multivariate interaction metric [85] to quantify the influence of the ID representation. Such a metric can be considered as the attribution score disentangled from the output score of the input image, which is assigned to the interaction of multiple units.
Let $N=\{1,2,3,\ldots,n\}$ denote all the units of an input image. The multivariate interaction caused by the subset of units $S \subseteq N$ is calculated as
$$I([S]) = \phi([S] \mid N_{[S]}) - \sum_{i \in S} \phi(i \mid N_i), \qquad (1)$$
where $\phi([S] \mid N_{[S]})$ denotes the Shapley value [69] of the coalition $[S]$, which indicates the contribution of $[S]$ to the output score, and $\phi(i \mid N_i)$ denotes the Shapley value of the unit $i$, which indicates the contribution of the unit $i$; here $N_{[S]} = N \setminus S \cup \{[S]\}$ and $N_i = N \setminus \{i\} \cup \{i\}$.
In practice, to reduce the computational cost, we sampled 5 frames from each video and divided the input image into 16 × 16 grids. $S$ was set as $S = N$ in experiments, since the input faces are usually cropped and aligned to span the whole image, as a common protocol for deepfake detection [88,89]. In this way, we used $I([N])$ as the quantitative measure of the influence of the ID representation on the output of the binary classifier.
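For illustration, the sketch below shows how Eq. (1) could be estimated with Monte Carlo sampling of Shapley values, treating the 16 × 16 grid cells of a cropped face as the units in N and the whole set S = N as a single coalition player. The `value_fn` (e.g., the classifier's output when only the selected grid cells are kept) and all other names are hypothetical; this follows the general technique of [69,85] rather than reproducing our exact measurement code.

```python
# Monte Carlo estimate of the multivariate interaction I([S]) in Eq. (1) (illustrative sketch).
import numpy as np

def shapley_value(player, players, value_fn, n_samples=200, rng=None):
    """Sampling-based Shapley value of `player` in the cooperative game over `players`."""
    rng = np.random.default_rng(0) if rng is None else rng
    others = [p for p in players if p != player]
    total = 0.0
    for _ in range(n_samples):
        k = int(rng.integers(0, len(others) + 1))       # random coalition size
        idx = rng.permutation(len(others))[:k]          # random coalition of that size
        coalition = [others[i] for i in idx]
        total += value_fn(coalition + [player]) - value_fn(coalition)
    return total / n_samples

def multivariate_interaction(S, N, value_fn, n_samples=200):
    """I([S]) = phi([S] | N_[S]) - sum_{i in S} phi(i | N_i), as in Eq. (1)."""
    token = "[S]"                                       # all units in S act as one player
    N_S = [u for u in N if u not in S] + [token]

    def merged_value_fn(subset):
        # Expand the coalition token back into its member units before scoring.
        units = []
        for u in subset:
            units.extend(S if u == token else [u])
        return value_fn(units)

    phi_coalition = shapley_value(token, N_S, merged_value_fn, n_samples)
    phi_individuals = sum(shapley_value(i, N, value_fn, n_samples) for i in S)
    return phi_coalition - phi_individuals

# Toy usage: 16x16 grid cells as units; a placeholder value_fn stands in for the classifier score.
N = list(range(16 * 16))
value_fn = lambda units: 0.01 * len(units)              # additive toy game, so I([N]) is near 0
print(multivariate_interaction(S=N, N=N, value_fn=value_fn))
```

In such a sketch, masking a grid cell typically means replacing its pixels with a baseline value before feeding the image to the classifier; the baseline and the number of samples are implementation choices rather than part of Eq. (1).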