
or directly integrated into the video-call client¹. First, we
briefly display a distinct pattern, referred to as the probing
pattern, on the shared screen during an ongoing video call. The
attendant's face is captured by the camera, and we focus on the
cornea areas. Since the attendant sits in front of the camera
during a video call and the human cornea is mirror-like and
highly reflective, the probing pattern on the screen leaves a
reflected image on the cornea that can subsequently be extracted
from the face image and compared with the probing pattern. We
provide an automatic pipeline to display the probing pattern,
capture the face image, extract the corneal reflections, and
compare them with the original probing pattern. Our experiments
with several state-of-the-art real-time DeepFake synthesis
models show that they fail entirely to recreate the probing
pattern in the synthesized cornea region across a variety of
real-world settings. Compared with the work in [7], our active
detection method is less limited by the lighting environment. In
addition, our method does not rely on complicated trained
models, which makes it easy to use in a real-time
video-conferencing environment. Moreover, our method can
reliably extract and compare probing patterns to authenticate
real persons under a range of imaging scenarios, validating this
approach.
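For concreteness, the probing pattern can be rendered as a simple black geometric shape on a white background. The sketch below uses a centered cross; the shape, size, and line thickness are illustrative choices of ours, not values fixed by the method:

```python
import numpy as np

def make_probing_pattern(size=400, thickness=40):
    """Render an illustrative probing pattern: a black cross centered
    on a white background, giving the high contrast the method needs."""
    img = np.full((size, size), 255, dtype=np.uint8)   # white background
    c, t = size // 2, thickness // 2
    img[c - t:c + t, :] = 0                            # horizontal bar
    img[:, c - t:c + t] = 0                            # vertical bar
    return img

pattern = make_probing_pattern()
```

In practice, the host would display such an image full-screen on the shared screen for a brief interval during the call.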
2. RELATED WORKS
Real-time DeepFake Synthesis. In recent years, DeepFakes have
been adapted for real-time synthesis. DeepFaceLive [2] was
proposed to perform DeepFake face swapping in real
video-conferencing scenarios. It achieves high visual quality at
real-time speed, making it usable in practice. With
DeepFaceLive, users can swap their faces from a real webcam feed
using trained face-swapping models in real time. The generated
fake screen in the DeepFaceLive software can be passed to the
video-conferencing software via virtual camera software (e.g.,
OBS-VirtualCam [2]). For example, in the Zoom software [8], the
host can select a virtual camera instead of the actual camera to
display the fake screen from DeepFaceLive in the Zoom meeting.
Examples of running DeepFaceLive in a Zoom meeting are shown in
Fig. 2.
DeepFake Detection Using Eye Biometrics. Biometric cues from the
eyes have been used for the detection of GAN-generated still
images [9, 10, 11, 12, 13, 14]. The work in [10] uses the
inconsistency of corneal specular highlights between the two
eyes to identify AI-synthesised faces. More recently, the work
in [11] spots AI-synthesised faces by detecting inconsistencies
in pupil shapes. These methods are further extended in [12]
with an attention-based robust deep network, in which the
inconsistent components and artifacts in the iris region of a
GAN-generated face are clearly highlighted in the attention maps.
Although effective in exposing GAN-generated faces in
high-resolution still images in a passive setting, these methods
may not work for catching real-time DeepFake videos used in
video conferences.

¹ We assume that consent can be obtained from the attendants to
use their imagery for authentication purposes without privacy
issues. This is the same agreement required when the video call
is recorded live.

Fig. 2: Examples of video-conferencing DeepFakes using
DeepFaceLive [2]. For each pair, Left: the template face; Right:
the DeepFake.
Active Detection of DeepFakes. Active detection of DeepFakes
differs from existing detection methods [15] in that it
interferes with the generation process to make detection easier.
Early work in [6] obstructs DeepFake generation by attacking a
key step of the generation pipeline, i.e., facial landmark
extraction. The method generates adversarial perturbations [16]
to disrupt facial landmark extraction, such that DeepFake models
cannot locate the real face to swap. Active illumination has
also been studied for exposing DeepFakes. For example, the work
in [17] shows that the correspondence between the brightness of
the facial appearance and different active illumination can be
used as a signal for active DeepFake detection. Motivated by
this work, [7] proposed a new active method for
video-conferencing DeepFake detection using active illumination.
3. METHOD
The overall process of our method is shown in Fig. 3. In a
standard video-conference setting, a person sits in front of a
laptop computer, and her eyes are captured by the webcam,
Fig. 3(a). To verify whether an attendant is indeed a real
person rather than a synthesis from a real-time DeepFake model,
the host briefly puts up the probing pattern on the shared
screen. The probing pattern is a simple geometric shape on a
white background, chosen for good contrast. We expect a real
attendant's eyes to reflect the probing pattern, while a
real-time DeepFake's will not. We first capture an image of the
attendant's face, then run a face detector and extract facial
landmarks using Dlib [18], Fig. 3(b). From the facial landmarks,
we localize the eye region, Fig. 3(c), and then segment out the
iris using an edge detector and the Hough transform, as in [10],
Fig. 3(d). The segmented iris images are then passed to the
template matching step for automatic DeepFake detection,
Fig. 3(e), where we compare the corneal reflection with the
probing pattern. A match indicates a real person, while the lack
of a match suggests a possible real-time DeepFake impersonation.