DEEPFAKE CLI: Accelerated Deepfake Detection using FPGAs

Omkar Bhilare*¹,², Rahul Singh*¹,², Vedant Paranjape*¹,², Sravan Chittupalli*¹,², Shraddha Suratkar¹,², and Faruk Kazi¹,²

¹ Department of Electrical Engineering, V.J.T.I, Mumbai, India.
² {oabhilare_b19, rsingh_b18, vvparanjape_b18, schittupalli_b18}@el.vjti.ac.in, sssuratkar@ce.vjti.ac.in, fskazi@el.vjti.ac.in

Abstract. Because of the availability of larger datasets and recent improvements in generative models, more realistic deepfake videos are being produced each day. People consume around one billion hours of video on social media platforms every day, which makes it important to stop the spread of fake videos, as they can be damaging, dangerous, and malicious. Deepfake classification has improved significantly, but deepfake detection and inference remain difficult tasks. To address this problem, we propose a novel DEEPFAKE C-L-I (Classification – Localization – Inference), in which we explore accelerating quantized deepfake detection models on FPGAs because of their high parallelism and energy efficiency compared to general-purpose GPUs. We use a light MesoNet with the EFF-YNet structure and accelerate it on the VCK5000 FPGA, powered by the state-of-the-art VC1902 Versal architecture, which uses AI, DSP, and Adaptable Engines for acceleration. We benchmark our inference speed against other state-of-the-art inference nodes and achieve 316.8 FPS on the VCK5000 while maintaining 93% accuracy.

Keywords: Generative Models · Deepfake Detection · Deepfake Classification · Machine Learning · Quantized · FPGAs · MesoNet · EFF-YNet · VCK5000 · VC1902 · Versal Architecture · AI · DSP · Adaptable Engines

1 Introduction

A deepfake is artificially created media in which a frame is synthesized using someone else's features, such as face, structure, and lip movements. Deepfakes are usually created by leveraging a Generative Adversarial Network to produce a picture or video that looks realistic enough to deceive a viewer. While deepfakes were initially created to prank individuals, they gained attention because of their use in illegal activities such as celebrity pornographic videos, fake news, and bullying. Hence, detecting deepfakes has become a very important issue in recent years.

* These authors contributed equally to this work.

arXiv:2210.14743v1 [cs.AR] 26 Oct 2022

After reviewing the relevant literature and existing methods of deepfake classification, we observed that it is essentially an image classification problem. The catch is that the pathological differences between real and fake images are quite small, so existing CNN models need to be modified and tuned to detect these minute differences.

MesoNet is a CNN architecture used to detect Face2Face and Deepfake manipulations accurately [1]. It operates on cropped faces from videos and analyses mid-level features, focusing on the right level of detail by using an architecture with a small number of layers. XceptionNet is a more complex model, originally designed for 2D images, that uses depthwise separable convolutional layers with residual connections [2]; as a result, it gives higher performance than MesoNet [1], but it takes a long time to train. EfficientNets, a relatively newer family of CNN models, aim to use resources efficiently by balancing model width, depth, and resolution, and they outperform models with a similar number of parameters [3].
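The balancing of width, depth, and resolution mentioned above can be illustrated with the compound scaling rule from the EfficientNet paper [3]. The snippet below is a minimal sketch, not part of this work: the base coefficients (1.2, 1.1, 1.15) are the values reported in [3], while the function names `compound_scale` and `flops_factor` are hypothetical helpers introduced here for illustration.

```python
# Base coefficients reported for EfficientNet in [3] (assumed here, not taken from this paper).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate growth in FLOPs: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


if __name__ == "__main__":
    for phi in range(4):
        print(phi, compound_scale(phi), round(flops_factor(phi), 3))
```

Because the coefficients are chosen so that the product of depth, squared width, and squared resolution is close to 2, each increment of `phi` roughly doubles the compute budget while keeping the three dimensions in proportion.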

Researchers have gone a step further and proposed classifying each pixel of the image as real or fake. The U-Net architecture addresses this task by employing an encoder-decoder network with skip connections [4]. To improve classification accuracy, Eff-YNet combines an EfficientNet encoder with a classification branch and a segmentation branch [5]. It is designed to classify an image and also to find the regions where the image is real and where it is fake. The segmentation task helps train the classifier while also producing useful segmentation masks.

The inference of neural networks is usually slow on general-purpose GPUs [6]. One way to accelerate inference is to use soft cores emulated on FPGAs, which lets the user exploit task-level and data-level parallelism to approach the performance of ASIC implementations with reduced design time [7]. It is also possible to build RNN- or CNN-specific hardware architectures. In [8], the authors accelerated a Deep Recurrent Neural Network (DRNN) on a hardware accelerator running on a Xilinx ZYNQ FPGA. ZYNQ boards are heterogeneous, which means they have a hard-core CPU alongside the FPGA. In [9], researchers mapped BNNs (binarized neural networks) onto the FPGA while mapping FP32 networks to the CPU; this hybrid mapping increases overall neural network efficiency while maintaining inference speed. Researchers have found that these ZYNQ boards perform better than CPUs and GPUs [10].

One paper transforms the model into FP8¹ format for a speed-up [11]. After evaluating this approach, we decided to use INT8² in our implementation and quantized our model to INT8 precision. This means that our model, which was originally trained at FP32³ precision, was converted to INT8 precision.

¹ FP8 means 8-bit floating-point representation.
² INT8 means 8-bit integer representation.
³ FP32 means 32-bit floating-point representation.


Reducing the precision reduces the number of bits required to store the model parameters, so the model needs less memory and fewer clock cycles to transfer data between memory and the accelerator over the PCIe bus.
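The memory saving from moving FP32 parameters to INT8 can be seen in a small sketch. The code below is an illustration under stated assumptions, not the quantization flow used in this work: it applies simple symmetric per-tensor quantization to a handful of made-up weights, and the helper names `quantize_int8` and `dequantize` are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    quantized = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Made-up example weights stored as 32-bit floats.
weights_fp32 = array("f", [0.5, -1.0, 0.25, 0.0, 0.75])
weights_int8, scale = quantize_int8(weights_fp32)

fp32_bytes = len(weights_fp32) * weights_fp32.itemsize  # 5 values * 4 bytes
int8_bytes = len(weights_int8) * weights_int8.itemsize  # 5 values * 1 byte
max_error = max(
    abs(a - b) for a, b in zip(weights_fp32, dequantize(weights_int8, scale))
)
```

Each parameter shrinks from four bytes to one, so the same transfer moves a quarter of the data, at the cost of a rounding error of at most half a quantization step (`scale / 2`) per weight.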

We propose U-YNet, a combined segmentation and classification model. It consists of a UNet encoder and decoder, which produce a segmentation map showing the altered regions in the deepfake content, and a classification branch at the end of the UNet encoder, which classifies the media content as real or deepfake. The segmentation map is a novel way to identify the regions of a face that have been manipulated to create the deepfake, giving insight into how a deepfake is constructed and paving the way for models that reverse a deepfake as well. This model runs on a Deep Learning Processing Unit (DPU)⁶ present in the AMD-Xilinx VCK5000 Versal FPGA device. It contains dedicated processing units, such as a hardware-accelerated convolution engine, which enable convolution-based models like UNet (and U-YNet) to run at higher inference speeds, enabling real-time classification of deepfake content.
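The Y-shaped data flow described above (a shared encoder, a decoder with skip connections for the segmentation map, and a classification branch at the end of the encoder) can be sketched as follows. This is a toy illustration only, not the authors' U-YNet: it has no learned weights, uses 2x2 average pooling as a stand-in for the encoder and nearest-neighbour upsampling for the decoder, and every function name is hypothetical. It assumes a single-channel input whose height and width are divisible by 4.

```python
import math


def avg_pool2(img):
    """Encoder step: halve the resolution by averaging each 2x2 block."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, len(img[0]), 2)]
        for r in range(0, len(img), 2)
    ]


def upsample2(feat):
    """Decoder step: double the resolution by repeating each value."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add(a, b):
    """Skip connection: element-wise sum of two equally sized feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_ynet(img):
    """Return (fake_probability, segmentation_mask) for a 2D list of floats."""
    # Shared encoder: two downsampling stages.
    e1 = avg_pool2(img)
    e2 = avg_pool2(e1)
    # Classification branch at the end of the encoder: global average, then sigmoid.
    flat = [v for row in e2 for v in row]
    fake_probability = sigmoid(sum(flat) / len(flat))
    # Decoder: upsample back to input size, merging encoder features via skips.
    d1 = add(upsample2(e2), e1)
    d0 = add(upsample2(d1), img)
    mask = [[sigmoid(v) for v in row] for row in d0]
    return fake_probability, mask
```

Calling `toy_ynet` on a 4x4 input returns one scalar score and a 4x4 per-pixel mask, which mirrors how a single forward pass yields both the real/fake decision and the localization map.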

2 Motivation and Background

Improvements in generative models and the abundance of datasets have led to models that can generate realistic-looking deepfake videos that deceive both the human eye and machines (Fig. 1). Malicious actors have huge potential to spread deepfake videos for their own gain, as more than 100 million hours of video content are watched every day on social media.

Fig. 1: Completely believable deepfakes can be generated with ease nowadays.

From a computational perspective, faster computation and easily available compute resources mean that deepfake videos can
