Implicit Identity Leakage:
The Stumbling Block to Improving Deepfake Detection Generalization
Shichao Dong1,*, Jin Wang1,*, Renhe Ji1,†, Jiajun Liang1, Haoqiang Fan1, Zheng Ge1
1MEGVII Technology
{dongshichao,wangjin,jirenhe,liangjiajun,fhq,gezheng}@megvii.com
Abstract
In this paper, we analyse the generalization ability of binary classifiers for the task of deepfake detection. We find that the stumbling block to their generalization is the unexpectedly learned identity representation on images. Termed the Implicit Identity Leakage, this phenomenon has been qualitatively and quantitatively verified among various DNNs. Furthermore, based on such understanding, we propose a simple yet effective method named the ID-unaware Deepfake Detection Model to reduce the influence of this phenomenon. Extensive experimental results demonstrate that our method outperforms the state-of-the-art in both in-dataset and cross-dataset evaluation. The code is available at https://github.com/megvii-research/CADDM.
1. Introduction
Recently, face-swap abusers have used various face manipulation methods [23,36,40,78] to generate fake images/videos. These images/videos are then used to spread fake news, make malicious hoaxes, and forge judicial evidence, which has caused severe consequences. To alleviate this situation, an increasing number of deepfake detection methods [17,20,51,71,72,76] have been proposed to filter out manipulated images/videos from massive online media resources, ensuring that the remaining images/videos are genuine and reliable.
Previous methods usually dealt with the task of deepfake detection with binary classifiers [1,6,14,56,62]. These methods have achieved great accuracy in detecting the seen attacks learned from the training datasets (i.e., the in-dataset evaluations). However, when confronted with media generated by newly proposed deepfake methods (i.e., the cross-dataset evaluations), these methods often suffer significant performance drops. Though plenty of researchers have designed effective methods [41,88,89] to improve the generalization of deepfake detection models, a thorough analysis of why binary classifiers fail to perform well on the cross-dataset evaluation is still lacking.

*Equal contribution. †Corresponding author.

[Figure 1: (a) a source image, target image, fake image and new target image, with the fake image carrying both ID-1 (source) and ID-2 (target) information; (b) an identity boundary separating genuine and fake identities, in-dataset and cross-dataset.]

Figure 1. The Implicit Identity Leakage phenomenon. Since the fake image retains some features of its source image, its identity should not be completely regarded as that of its target image. As a consequence, there exists an implicit gap between genuine identities and fake identities in the training set, which is unintentionally captured by binary classifiers. When confronted with images manipulated by unseen face-swap methods, the classifier tends to misuse identity information and make false predictions.
In this paper, given well-trained binary classifiers for deepfake detection, we find that the stumbling block for their generalization ability is the mistakenly learned identity representation on images. As shown in Fig. 1(a), a deepfake image is usually generated by replacing the face of the source image with the face of the target image. However, we notice that the procedure of synthesizing the fake image [9,23,36] may cause a loss of ID information: the identity of the fake image cannot be considered the same as that of either its target image or its source image. In particular, when the face of the target image is swapped back with the face of the fake image, it is noticeable that the identity of the target image is altered.
In this way, as shown in Fig. 1(b), when learning a deepfake detection model, there exists an implicit decision boundary between fake images and genuine images based on identities. During the training phase, binary classifiers may accidentally consider certain groups of identities as genuine identities and other groups of identities as fake identities. When tested on the cross-dataset evaluation, such biased representations may be mistakenly used by binary classifiers, causing false judgments based on the facial appearance of images. In this paper, we qualitatively and quantitatively verify this phenomenon (termed the Implicit Identity Leakage) in binary classifiers of various backbones. Please see Sec. 3 and Sec. 5.2 for analyses.
Furthermore, based on such understanding, we propose a simple yet effective method named the ID-unaware Deepfake Detection Model to reduce the influence of Implicit Identity Leakage. Intuitively, by forcing models to focus only on local areas of images, less attention will be paid to the global identity information. Therefore, we design an anchor-based detector module, termed the Artifact Detection Module, to guide our model to focus on local artifact areas. Such a module is expected to detect artifact areas on images with multi-scale anchors, each of which is assigned a binary label to indicate whether an artifact exists. By localizing artifact areas and classifying multi-scale anchors, our model learns to distinguish the differences between local artifact areas and local genuine areas at a finer level, thus reducing the misuse of the global identity information.
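To make the above concrete, the following is a minimal sketch of how an anchor-based artifact detection head could be wired up in PyTorch. It is only an illustrative sketch under stated assumptions (the channel sizes, number of anchors per location, and pyramid inputs are hypothetical), not the actual Artifact Detection Module from our released code.

```python
# Sketch of an anchor-based artifact detection head (hypothetical sizes, not CADDM itself).
# Each pyramid level predicts, for every anchor, a binary "artifact present" logit and a box offset.
import torch
import torch.nn as nn

class ArtifactDetectionHead(nn.Module):
    def __init__(self, in_channels=(128, 256, 512), anchors_per_loc=3):
        super().__init__()
        self.cls_heads = nn.ModuleList(
            nn.Conv2d(c, anchors_per_loc, kernel_size=3, padding=1) for c in in_channels
        )
        self.reg_heads = nn.ModuleList(
            nn.Conv2d(c, anchors_per_loc * 4, kernel_size=3, padding=1) for c in in_channels
        )

    def forward(self, pyramid_feats):
        # pyramid_feats: feature maps at different scales (e.g., strides 8/16/32).
        cls_logits, box_deltas = [], []
        for feat, cls_head, reg_head in zip(pyramid_feats, self.cls_heads, self.reg_heads):
            n = feat.shape[0]
            cls_logits.append(cls_head(feat).permute(0, 2, 3, 1).reshape(n, -1))      # (N, H*W*A)
            box_deltas.append(reg_head(feat).permute(0, 2, 3, 1).reshape(n, -1, 4))   # (N, H*W*A, 4)
        return torch.cat(cls_logits, dim=1), torch.cat(box_deltas, dim=1)

# Toy usage with three hypothetical pyramid levels.
feats = [torch.randn(2, c, s, s) for c, s in [(128, 28), (256, 14), (512, 7)]]
cls_logits, box_deltas = ArtifactDetectionHead()(feats)   # (2, 3087) and (2, 3087, 4)
```

During training, each anchor would be assigned a binary artifact label (e.g., by its overlap with annotated artifact areas) and optimized with a classification loss plus a localization loss on the positive anchors.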
Extensive experimental results show that our model accurately predicts the position of artifact areas and learns generalized artifact features across face manipulation algorithms, successfully outperforming the state-of-the-art. Contributions of the paper are summarized as follows:
• We discover that deepfake detection models supervised only by binary labels are very sensitive to the identity information of images, which is termed the Implicit Identity Leakage in this paper.
• We propose a simple yet effective method termed the ID-unaware Deepfake Detection Model to reduce the influence of the ID representation, successfully outperforming other state-of-the-art methods.
• We conduct extensive experiments to verify the Implicit Identity Leakage phenomenon and demonstrate the effectiveness of our method.
2. Related Work
With the development of Generative Adversarial Network (GAN) [12,25,33,34,47] techniques, forged images/videos have become more realistic and indistinguishable. To deal with attacks based on different face manipulation algorithms, researchers have tried to improve their deepfake detectors [30,54,61] from different perspectives, such as designing different loss functions [7], extracting richer features [21,83], and analyzing the continuity between consecutive frames [29,57]. Most of these deepfake detection methods can be roughly summarized into two categories.
2.1. Binary Classifiers
Many researchers [1,6,14,56,62] treated the deepfake detection task as a binary classification problem. They used a backbone encoder to extract high-level features and a classifier to detect whether the input image has been manipulated. Durall et al. [22] first proposed a model analyzing the frequency domain for face forgery detection. Masi et al. [52] used a two-branch recurrent network to extract high-level semantic information in original RGB images and their frequency domains at the same time, by which the model achieved good performance on multiple public datasets. Li et al. [39] designed a single-center loss to compress the real-sample classification space to further improve the detection rate of forged samples. Binary classifiers achieved high detection accuracy on the in-dataset evaluation, but they could not maintain good performance when facing unseen forged images.
2.2. Hand-crafted Deepfake Detectors
Many works attempted to improve the generalization capability of deepfake detectors by modeling specific hand-crafted artifacts shared among different face manipulation methods. Li et al. [42] believed that some physical characteristics of a real person cannot be manipulated in fake videos; they designed an eye-blinking detector to identify the authenticity of a video through the frequency of eye blinking. Since 3D data cannot be reversely generated from a fake image, Yang et al. [84] approached the face forgery detection task from the perspective of non-3D-projection-generated samples. Sun et al. [74] and Li et al. [41] focused on precise geometric features (face landmarks) and blending artifacts, respectively, when detecting forged images. Liu et al. [46] equipped the model with frequency-domain information, since the frequency domain is very sensitive to the up-sampling operations that are common in face manipulation methods, and used a shallow network to extract rich local texture information, enhancing the model's generalization and robustness.
In summary, hand-crafted deepfake detectors guided the model to capture specific artifact features and indicated manipulated images/videos by responding to these features. However, these methods have a common limitation: when forgeries do not contain the specific artifacts that are introduced in the training phase, they often fail to work well.
3. Implicit Identity Leakage
The Implicit Identity Leakage denotes that the ID representation in the deepfake dataset is captured by binary classifiers during the training phase. Although such identity information enhances the differences between real and fake images when testing the model on the in-dataset evaluation, it tends to mislead the model on the cross-dataset evaluation. In this section, we conduct thorough experiments to verify this hypothesis. First, we conduct the ID linear classification experiment to verify that binary classifiers capture identity information during the training phase. Second, we quantify the influence of such ID representation on the in-dataset evaluation and the cross-dataset evaluation respectively, to verify its effect on the task of deepfake detection.

Figure 2. ID linear classification on frozen features of binary classifiers, evaluated on (a) Celeb-DF, (b) FF++ and (c) LFW. Results show that binary classifiers of different backbones learned ID representations of images, even without explicit supervision of identity labels.
3.1. Verifying the Existence of ID Representation
Hypothesis 1: The ID representation in the deepfake dataset is accidentally captured by binary classifiers during the training phase, even without explicit supervision.
In this section, we perform the ID linear classification experiment to verify that binary classifiers accidentally learn the ID representation on images.
Inspired by previous unsupervised pre-training methods [11,27], we trained linear classifiers on the frozen features extracted from binary classifiers to evaluate the generalization of the learned ID representation. Given a binary classifier trained on FF++ [67], we measured the linear classification accuracy of identities on features extracted from the classifier for FF++ [67], Celeb-DF [45] and the face recognition dataset LFW [32]. To be specific, we froze the features input to the last linear layer of ResNet-18/34/50 [28], Xception [13] and EfficientNet-b3 [75] to demonstrate the universality of this phenomenon.
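For reference, the following is a minimal linear-probing sketch of this protocol, assuming a frozen binary deepfake classifier wrapped so that it returns its penultimate features, and data loaders yielding (image, identity label) pairs; the names (`backbone`, `train_loader`, etc.) are hypothetical and this is not the code used for the paper's measurements.

```python
# Linear probing of identity information in frozen deepfake-detector features (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def extract_features(backbone, loader):
    """Run the frozen detector and collect its penultimate features with identity labels."""
    backbone.eval()
    feats, ids = [], []
    for images, identity_labels in loader:
        feats.append(backbone(images))      # assumed to return (N, D) penultimate features
        ids.append(identity_labels)
    return torch.cat(feats), torch.cat(ids)

def linear_probe(backbone, train_loader, test_loader, num_identities, epochs=100, lr=0.1):
    """Train only a linear layer on frozen features and report identity classification accuracy."""
    x_tr, y_tr = extract_features(backbone, train_loader)
    x_te, y_te = extract_features(backbone, test_loader)
    probe = nn.Linear(x_tr.shape[1], num_identities)
    opt = torch.optim.SGD(probe.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):                 # full-batch updates, for brevity
        opt.zero_grad()
        F.cross_entropy(probe(x_tr), y_tr).backward()
        opt.step()
    return (probe(x_te).argmax(dim=1) == y_te).float().mean().item()
```

A high probe accuracy indicates that the frozen detector features encode identity information, even though the detector was never given identity labels.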
Fig. 2 shows that linear classification on features of different classifiers converged to varying degrees and achieved varying degrees of accuracy for identity classification. Such results also indicate that although the classifiers were never trained on Celeb-DF and LFW before, they still extracted substantial information about identities from images, especially with strong backbones (e.g., EfficientNet-b3). In other words, deepfake detectors accidentally learned the ID representation of images, even without explicit supervision.

Datasets  | ResNet-18 | ResNet-34 | ResNet-50 | Xception | EfficientNet-b3
FF++      | 81.53     | 89.77     | 99.58     | 97.32    | 94.87
Celeb-DF  | 46.88     | 47.22     | 49.47     | 47.23    | 44.43

Table 1. Quantifying the influence of the ID representation on the task of deepfake detection. Results show that although the ID representation could boost the performance of the in-dataset evaluation (i.e., FF++), it hinders improvements on the cross-dataset evaluation (i.e., Celeb-DF).
3.2. Quantifying the Influence of ID Representation
Hypothesis 2: Although the accidentally learned ID representation may enhance the performance on the in-dataset evaluation, it tends to mislead the model on the cross-dataset evaluation.
After verifying the existence of the ID representation in features of binary classifiers, we performed another experiment to verify its effect on deepfake detection.
The key challenge is how to attribute the output of the binary classifier to the ID representation of the input image quantitatively. Intuitively, the identity of an image is not decided by each image region individually, e.g., mouths, eyes and noses. Instead, these regions usually collaborate with each other to form a certain pattern, e.g., the identity of the input image. Thus, we used the multivariate interaction metric [85] to quantify the influence of the ID representation. Such a metric can be considered as the attribution score disentangled from the output score of the input image, which is assigned to the interaction of multiple units.
Let $N=\{1,2,3,\ldots,n\}$ denote all the units of an input image. The multivariate interaction caused by the subset of units $S \subseteq N$ is calculated as
$$I([S]) = \phi([S] \mid N_{[S]}) - \sum_{i \in S} \phi(i \mid N_i), \qquad (1)$$
where $\phi([S] \mid N_{[S]})$ denotes the Shapley value [69] of the coalition $[S]$, which indicates the contribution of $[S]$ to the output score, and $\phi(i \mid N_i)$ denotes the Shapley value of the unit $i$, which indicates the contribution of the unit $i$; here $N_{[S]} = N \setminus S \cup \{[S]\}$ and $N_i = N \setminus \{i\} \cup \{i\}$.
In practice, to reduce the computational cost, we sampled 5 frames from each video and divided the input image into 16 × 16 grids. $S$ was set as $S = N$ in experiments, since the input faces are usually cropped and aligned to span the whole image, as a common protocol for deepfake detection [88,89]. In this way, we used $I([N])$ as the quantitative measure of the influence of the ID representation on the output of the binary classifier.
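For illustration, the sketch below shows how Eq. (1) could be estimated with Monte Carlo sampling of Shapley values, treating the 16 × 16 grid cells of a cropped face as the units in N and the whole set S = N as a single coalition player. The `value_fn` (e.g., the classifier's output when only the selected grid cells are kept) and all other names are hypothetical; this follows the general technique of [69,85] rather than reproducing our exact measurement code.

```python
# Monte Carlo estimate of the multivariate interaction I([S]) in Eq. (1) (illustrative sketch).
import numpy as np

def shapley_value(player, players, value_fn, n_samples=200, rng=None):
    """Sampling-based Shapley value of `player` in the cooperative game over `players`."""
    rng = np.random.default_rng(0) if rng is None else rng
    others = [p for p in players if p != player]
    total = 0.0
    for _ in range(n_samples):
        k = int(rng.integers(0, len(others) + 1))       # random coalition size
        idx = rng.permutation(len(others))[:k]          # random coalition of that size
        coalition = [others[i] for i in idx]
        total += value_fn(coalition + [player]) - value_fn(coalition)
    return total / n_samples

def multivariate_interaction(S, N, value_fn, n_samples=200):
    """I([S]) = phi([S] | N_[S]) - sum_{i in S} phi(i | N_i), as in Eq. (1)."""
    token = "[S]"                                       # all units in S act as one player
    N_S = [u for u in N if u not in S] + [token]

    def merged_value_fn(subset):
        # Expand the coalition token back into its member units before scoring.
        units = []
        for u in subset:
            units.extend(S if u == token else [u])
        return value_fn(units)

    phi_coalition = shapley_value(token, N_S, merged_value_fn, n_samples)
    phi_individuals = sum(shapley_value(i, N, value_fn, n_samples) for i in S)
    return phi_coalition - phi_individuals

# Toy usage: 16x16 grid cells as units; a placeholder value_fn stands in for the classifier score.
N = list(range(16 * 16))
value_fn = lambda units: 0.01 * len(units)              # additive toy game, so I([N]) is near 0
print(multivariate_interaction(S=N, N=N, value_fn=value_fn))
```

In such a sketch, masking a grid cell typically means replacing its pixels with a baseline value before feeding the image to the classifier; the baseline and the number of samples are implementation choices rather than part of Eq. (1).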