Ensemble Learning using Transformers and Convolutional Networks for Masked Face Recognition

Mohammed R. Al-Sinan, Aseel F. Haneef, and Hamzah Luqman
Information and Computer Science Department, King Fahd University of Petroleum and Minerals
SDAIA-KFUPM Joint Research Center for Artificial Intelligence, Dhahran 31261, Saudi Arabia.
Email: {g201354590, g201565430, hluqman}@kfupm.edu.sa
Abstract—Wearing a face mask is one of the adjustments we
had to follow to reduce the spread of the coronavirus. Having
our faces covered by masks constantly has driven the need
to understand and investigate how this behavior affects the
recognition capability of face recognition systems. Current face
recognition systems have extremely high accuracy when dealing
with unconstrained general face recognition cases but do not
generalize well with occluded masked faces. In this work, we
propose a system for masked face recognition. The proposed
system comprises two Convolutional Neural Network (CNN)
models and two Transformer models. The CNN models are
fine-tuned from the pre-trained FaceNet model. We combine
the predictions of the four models by majority voting
to identify the person wearing the mask. The proposed
system has been evaluated on a synthetically masked LFW
dataset created in this work. The ensemble achieves the best
accuracy, 92%, outperforming the other evaluated models,
which indicates that the proposed system is robust
for recognizing masked faces. The code and data are available
at https://github.com/Hamzah-Luqman/MFR.
Index Terms—Masked Face Recognition, Face Recognition,
Face De-occlusion, Transformer, Ensemble Learning, LFW
dataset
I. INTRODUCTION
Coronavirus or COVID-19 is a global pandemic that has
affected more than 227 countries and territories [1]. This
disease has led to a serious negative impact on people’s health
and the global economy. Wearing face masks has become a
necessity in our daily lives as a preventive measure to avoid
the disease. According to the Centers for Disease Control and
Prevention (CDC), the most effective way to avoid spreading
the disease or being infected with it is to practice social
distancing and wear face masks [2].
Face recognition systems have been extensively used during
this pandemic. Coronavirus can be transmitted quickly between
people via surfaces. This forced several organizations
and entities to avoid touch-based authentication methods
such as fingerprint and password-based security systems.
These measures increased the dependency on systems that
avoid unnecessary contact with surfaces. Face recognition
is one such contactless approach to user authentication.
Face recognition systems are also used for security purposes
that involve recognizing and verifying people.
However, wearing masks has driven the need to understand
and investigate how these masks affect the existing digital
systems such as face detection systems, face recognition
systems, and face verification systems. According to Noa
et al. [3], face masks interfere with the basic mechanisms of
face recognition, reducing accuracy in identifying facial
identity, gender, age, and emotion. For example, face masks can
cause recognition systems to misinterpret disgusted faces as angry
faces. Another study, conducted by the National
Institute of Standards and Technology (NIST) [4], evaluated
several commercial facial recognition systems on masked face
images. The study reported error rates of 5-50% when these
systems recognized faces with masks applied digitally to
unmasked face images.
The failure of currently available face recognition
systems to recognize masked faces can be attributed to
several reasons. The primary reason is the lack of adequate
visual and identity cues, because the facial mask covers
almost half of the face. This occlusion removes a large
share of the facial features [5] and makes it harder
for recognition models to identify masked faces.
Several techniques have been proposed
to address this problem. Some of these techniques rely
on the un-occluded regions of the face to identify the person,
while other approaches use the full masked face for
recognition. Still other approaches in the literature tackle this
problem by reconstructing the occluded regions of the face
and then recognizing the whole face.
arXiv:2210.04816v1 [cs.CV] 10 Oct 2022
Many of the current face recognition methods depend on
deep learning models. These models have been shown to
achieve very high accuracy, even beyond human
recognition capability, on non-occluded faces [6], [7]. However,
few works have targeted masked face recognition. In this work,
we propose a system for recognizing the identity of a person
wearing a facial mask. Five models are proposed in this
work. Three of them are CNN-based models fine-tuned
from different pre-trained models. We also use the state-of-the-art
Transformer model for masked face recognition. To exploit
the features of both the fine-tuned CNN models and the Transformer,
we ensemble two CNN models and two Transformer models and
apply majority voting for the final decision. The
proposed techniques have been evaluated on the LFW dataset
and on a masked version of LFW created in this work. The
obtained results show that the ensemble outperforms
the other models.
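
The majority voting step described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the function name, the label strings, and the tie-breaking rule (the earliest model in the list wins) are assumptions, since the text does not state how ties among the four models are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the identity label predicted by the most models.

    `predictions` holds one label per model. Ties are broken in
    favour of the earliest model in the list (an assumption; the
    paper does not specify a tie-breaking rule).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the predictions in model order so that the first model
    # naming a top-voted label decides any tie.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of two CNN models and two Transformer models
# for a single masked face image.
votes = ["person_17", "person_17", "person_04", "person_17"]
print(majority_vote(votes))
```

With an even number of voters a 2-2 split is possible, so any practical implementation needs some explicit tie rule; ordering the models by validation accuracy before voting is one option.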
This paper is organized as follows: Section II reviews the
related work. Section III presents the proposed approach.
Section IV describes the experimental work and the obtained
results. Finally, the conclusions and future work are presented
in Section V.
II. RELATED WORK
Several approaches have been proposed for masked face
recognition. They can be categorized into techniques that
recognize the masked face without performing de-occlusion
and techniques that perform de-occlusion before
recognizing the face.
A. Masked Face Recognition Without De-occlusion
Several approaches have been proposed for recognizing
masked faces without reconstructing the areas under
the mask. CNNs have been used extensively and are part of
the state-of-the-art architectures for unconstrained general face
recognition. However, when the subject’s face is occluded
by objects such as facial masks, scarves, or
sunglasses, the accuracy of CNN models drops significantly
[8]. This drop in performance occurs mostly when the
models are trained on unconstrained face images and tested on
occluded ones [9]. Therefore, some researchers trained their
models on a mix of both kinds of images to boost accuracy.
However, Song et al. [10] argued that adding a large number
of partially occluded images is not enough because the learned
features of two faces with different occlusion conditions are
still inconsistent. Therefore, they introduced a method that
discards the facial mask and focuses on the features extracted
from other face regions. This approach was evaluated on the AR
dataset and achieved an accuracy of 99.03%.
Li et al. [11] used the Convolutional Block Attention Module
(CBAM) [12] for masked face recognition. The authors fed the
model with the subject’s eye region, extracted using different
cropping approaches. The proposed approach was evaluated
on the Masked-LFW [13] dataset and obtained an accuracy of
82.86%. They also tested their model on the masked-Webface
dataset and achieved 91.525% accuracy, compared
to 88.01% and 87.906% with the ArcFace [6] and CosFace [7]
methods, respectively. A similar approach was followed by
Hariri [14] for masked face recognition. The author discarded
the occluded portion of the face and kept only the area
around the eyes. The pre-trained VGG16 model was used to
extract features from the segmented eye region. This approach was
evaluated on the Real-World-Masked-Face-Dataset [15] and
an accuracy of 91.3% was reported using 10-fold cross-validation.
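
The idea of discarding the masked part of the face and keeping only the area around the eyes can be illustrated with a simple crop. This is only a sketch under stated assumptions: the image is taken to be an aligned face stored as a list of pixel rows, and the hypothetical `keep_fraction` of 0.5 stands in for the landmark-based cropping that the cited works actually use.

```python
def crop_upper_face(image, keep_fraction=0.5):
    """Keep the top rows of an aligned face image.

    `image` is a list of pixel rows. `keep_fraction` is an
    illustrative assumption: 0.5 keeps the upper half, which in an
    aligned face roughly covers the forehead and eyes.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_rows = max(1, int(len(image) * keep_fraction))
    # Copy the rows so the caller's image is not modified later.
    return [row[:] for row in image[:n_rows]]


# A toy 4x3 "image": the two bottom rows play the role of the mask.
face = [[1, 1, 1], [2, 2, 2], [9, 9, 9], [9, 9, 9]]
print(crop_upper_face(face))
```

The cropped region would then be passed to a feature extractor such as a pre-trained CNN in place of the full face.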
Wan et al. [16] proposed a deep trainable model, MaskNet,
that learns image features while ignoring the distortions
caused by occlusion. The authors claimed that MaskNet can be
incorporated into CNN architectures with minimal identity labels
and little additional computation. A verification accuracy of 96.4% was
reported on the LFW dataset when the face is randomly
occluded with a square of size 40. However, this accuracy
decreases as the size of the occlusion block increases.
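
The random square occlusion used in that evaluation can be sketched as follows. This is an illustrative reconstruction, not the cited authors' protocol: the fill value of 0, the uniform placement of the block, and treating the size as a side length in pixels are all assumptions.

```python
import random


def occlude_square(image, size=40, fill=0, rng=None):
    """Return a copy of `image` with one random size x size block filled.

    `image` is a list of pixel rows. The block position is drawn
    uniformly so that the block lies fully inside the image.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("occlusion block is larger than the image")
    top = rng.randint(0, height - size)    # randint is inclusive
    left = rng.randint(0, width - size)
    occluded = [row[:] for row in image]   # leave the input untouched
    for r in range(top, top + size):
        for c in range(left, left + size):
            occluded[r][c] = fill
    return occluded


# A blank 100x100 image with a 40x40 block occluded.
blank = [[255] * 100 for _ in range(100)]
masked = occlude_square(blank, size=40, rng=random.Random(0))
print(sum(row.count(0) for row in masked))
```

Sweeping `size` upward in such a helper is one way to reproduce the kind of accuracy-versus-occlusion-size curve the paragraph refers to.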
Other approaches tried to improve masked face
recognition accuracy by minimizing the intra-class and maximizing
the inter-class distances using different loss functions. Early
approaches used loss functions such as triplet loss [17] and
N-pairs [18] to optimize these distances, while recent techniques
use loss functions such as ArcFace [6] and CosFace [7].
SFace was proposed by Zhong et al. [19] to minimize the
distance between a face with and without a mask by altering the
softmax loss function. SFace addresses overfitting
to low-quality training images and noisy labels by introducing
the Sigmoid-constrained Hypersphere loss function, which
re-scales the intra-class and inter-class gradients
accordingly. SFace was evaluated on multiple benchmark
datasets and achieved verification accuracies of 99.82% and
90.63% on the LFW and masked-LFW [13] datasets, respectively.
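
As a concrete instance of a loss that pulls same-identity embeddings together and pushes different identities apart, the commonly used form of the triplet loss can be sketched as below. The embeddings are plain lists of floats and the margin of 0.2 is an illustrative assumption; the cited works train this kind of objective on deep network outputs over mini-batches.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (in squared distance).
    """
    d_pos = squared_distance(anchor, positive)  # intra-class distance
    d_neg = squared_distance(anchor, negative)  # inter-class distance
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings: same identity close together, impostor far away.
anchor, positive = [0.0, 0.0], [0.1, 0.0]
print(triplet_loss(anchor, positive, negative=[1.0, 0.0]))  # well separated
print(triplet_loss(anchor, positive, negative=[0.2, 0.0]))  # too close
```

In the masked setting, a natural choice is to take the anchor from an unmasked image and the positive from a masked image of the same person, so that the loss directly penalizes the mask-induced distance.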
B. Masked Face Recognition with De-occlusion
A common approach to masked face recognition is
to restore the area covered by the mask [20]. Several
approaches recognize masked faces through
face restoration. One of them extracts
the key facial features with the help of pre-trained models.
The restored face is then matched to the original face to
recognize the person. The quality of the restored region plays
an important role in masked face recognition. Iizuka et al. [21]
proposed a generative model for face restoration. The proposed
model employed an adversarial training approach using global
and local context discriminators. The global discriminator
assesses the entire image, while the local discriminator looks
at a small area in the completed region to ensure the consistency
of the generated patches. An improvement to this approach was
proposed by Yu et al. [22], who split
the image completion network into a coarse network and a
refinement network. The refinement network takes the initial
coarse prediction and produces refined results. The authors
used Wasserstein GAN (WGAN) [23] in their network to
improve the results.
In contrast to many image inpainting cases, where the
missing part is small and simple in shape, a facial mask
covers a large region of the face, which makes this task more
challenging. Din et al. [5] proposed a generative network
consisting of one generator and two discriminators that learn
the general face shape. This approach was capable of removing the
facial mask using the binary map and synthesizing the missing
regions while keeping the initial face structure. The proposed
mask extraction encoder uses five blocks of convolution layers.
The decoder has the same architecture as the
encoder, except that the convolution layers are replaced
with deconvolution layers. This approach was evaluated on the
CelebA dataset and a structural similarity (SSIM) of 0.864 was
reported.
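
To make the SSIM figure easier to interpret, the sketch below computes a single-window (global) SSIM between two images given as flat lists of pixel intensities. It is a simplification: SSIM is normally computed over local windows and averaged, so this function will not reproduce the score reported above. The constants k1 = 0.01 and k2 = 0.03 are the commonly used defaults, and the 8-bit data range is an assumption.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized pixel lists."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and equally sized")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilise the ratio for flat regions.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [10, 50, 90, 130, 170, 210]
restored = [12, 48, 95, 128, 160, 215]
print(round(global_ssim(original, restored), 4))
```

A value of 1 means the two inputs are identical, and lower values indicate larger differences in luminance, contrast, or structure between the restored and the ground-truth face.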
Yu et al. [24] used a gated convolutional network that
provides a learnable dynamic feature selection mechanism