Deep Learning for Iris Recognition: A Survey
KIEN NGUYEN, Queensland University of Technology, Australia
HUGO PROENÇA, University of Beira Interior, IT: Instituto de Telecomunicações, Portugal
FERNANDO ALONSO-FERNANDEZ, Halmstad University, Sweden
ABSTRACT
In this survey, we provide a comprehensive review of more than 200 papers, technical reports, and GitHub
repositories published over the last 10 years on the recent developments of deep learning techniques for iris
recognition, covering broad topics on algorithm designs, open-source tools, open challenges, and emerging
research. First, we conduct a comprehensive analysis of deep learning techniques developed for two main
sub-tasks in iris biometrics: segmentation and recognition. Second, we focus on deep learning techniques for
the robustness of iris recognition systems against presentation attacks and via human-machine pairing. Third,
we examine deep learning techniques for forensic applications, especially post-mortem iris recognition.
Fourth, we review open-source resources and tools in deep learning techniques for iris recognition. Finally,
we highlight the technical challenges, emerging research trends, and outlook for the future of deep learning
in iris recognition.
CCS Concepts: • Security and privacy → Biometrics.
Additional Key Words and Phrases: Iris Recognition, Deep Learning, Neural Networks
ACM Reference Format:
Kien Nguyen, Hugo Proença, and Fernando Alonso-Fernandez. 2022. Deep Learning for Iris Recognition: A
Survey. 1, 1 (October 2022), 35 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
The human iris is a sight organ that controls the amount of light reaching the retina by changing the size of the pupil. The texture of the iris is fully developed before birth; its minutiae do not depend on genotype, it stays relatively stable across the lifetime (except for disease- and normal aging-related biological changes), and it may even be used for forensic identification shortly after a subject's death [36, 110, 170].
In terms of its information theory-related properties, the iris texture is extremely random yet stable (permanent) over time, providing an exceptionally high entropy per mm² that justifies its higher discriminating power when compared to other biometric modalities (e.g., face or fingerprint). The iris' collectability is another feature of interest and has been the subject of discussion over the last years: while it can be acquired using commercial off-the-shelf (COTS) hardware, either handheld or stationary, data can even be collected at a distance, up to tens of meters away from the subjects [111]. Even though commercial visible-light (RGB) cameras are able to image the iris, near infrared-based (NIR) sensing dominates in most applications, due
Authors’ addresses: Kien Nguyen, Queensland University of Technology, Australia, nguyentk@qut.edu.au; Hugo Proença,
University of Beira Interior, IT: Instituto de Telecomunicações, Portugal, hugomcp@di.ubi.pt; Fernando Alonso-Fernandez,
Halmstad University, Sweden, feralo@hh.se.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
©2022 Association for Computing Machinery.
XXXX-XXXX/2022/10-ART $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn
, Vol. 1, No. 1, Article . Publication date: October 2022.
arXiv:2210.05866v1 [cs.CV] 12 Oct 2022
to the better visibility of the iris texture in darker eyes, rich in melanin pigment, which is characterized by lower light absorption in the NIR spectrum compared to shorter wavelengths. In addition, NIR wavelengths are barely perceivable by the human eye, which augments users' comfort and avoids the pupil contraction/dilation that would appear under visible light.
A seminal work by John Daugman brought to the community the Gabor filtering-based approach that became the dominant approach for iris recognition [34, 35, 37]. Even though subsequent solutions to iris image encoding and matching appeared, the IrisCode approach is still dominant due to its ability to effectively search massive databases with a minimal probability of false matches, at extreme time performance. By considering binary words, pairs of signatures are matched using XOR parallel-bit logic at lightning speed, enabling millions of comparisons per second per processing core. Also, most of the methods that outperformed the original technique in terms of effectiveness do not work under the one-shot learning paradigm, assume multiple observations of each class to obtain appropriate decision boundaries, and, most importantly, have encoding/matching steps whose time complexity forbids their use in large environments (in particular, in all-against-all settings).
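The word-parallel matching logic described above can be sketched in a few lines. The helper name `masked_hamming` and the 8-bit toy codes and occlusion masks below are illustrative only, far smaller than real 2,048-bit IrisCodes:

```python
def masked_hamming(code_a: int, code_b: int,
                   mask_a: int, mask_b: int) -> float:
    """Fractional Hamming distance between two binary iris codes.

    Codes are packed into Python integers, so the XOR and AND below act
    on all bits in parallel, mirroring the word-level logic of fast
    matchers. Mask bits flag usable (noise-free) positions; only bits
    valid in both codes are compared.
    """
    valid = mask_a & mask_b                   # jointly unoccluded bits
    disagreeing = (code_a ^ code_b) & valid   # XOR flags differing bits
    n_valid = bin(valid).count("1")
    if n_valid == 0:
        raise ValueError("no jointly unoccluded bits to compare")
    return bin(disagreeing).count("1") / n_valid

# Toy 8-bit codes: 1 of the 5 jointly valid bits disagrees.
a, b = 0b10110100, 0b10010110
ma, mb = 0b11111100, 0b11110111
print(masked_hamming(a, b, ma, mb))  # 0.2
```

Because the distance is a ratio of two popcounts over machine words, a single processing core can sustain millions of such comparisons per second, which is what makes all-against-all search tractable for IrisCodes but not for slower learned matchers.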
In short, Daugman's algorithm encodes the iris image into a binary sequence of 2,048 bits by filtering the iris image with a family of Gabor kernels. The varying pupil size is rectified by a Cartesian-to-polar coordinate system transformation, yielding an image representation of canonical size and guaranteeing an identical structure of the iris code independently of the iris and pupil size. This makes it possible to use the Hamming Distance (HD) to measure the similarity between two iris codes [37]. Its low false match rate at acceptable false non-match rates is the key factor behind the success of global-scale iris recognition deployments, such as the Aadhaar national person identification and border security program in India (with over 1.2 billion pairs of irises enrolled) [174], the Homeland Advanced Recognition Technology (HART) in the US (up to 500 million identities) [128], or the NEXUS system, designed to speed up border crossings for low-risk, pre-approved travelers moving between Canada and the US.
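The Cartesian-to-polar rectification step can be sketched as follows. This is a minimal nearest-neighbour version of Daugman's "rubber sheet" model; the function and parameter names are hypothetical, and the grid sizes are toy values far smaller than real systems use:

```python
import math

def rubber_sheet(image, pupil, iris, n_radial=8, n_angular=16):
    """Cartesian-to-polar unwrapping of the iris ring.

    `image` is a 2-D list of grey values; `pupil` and `iris` are
    (cx, cy, r) circles. Each output row samples one radius between the
    pupillary and limbic boundaries, so the result has a fixed, canonical
    size regardless of pupil dilation.
    """
    (pcx, pcy, pr), (icx, icy, ir) = pupil, iris
    h, w = len(image), len(image[0])
    out = []
    for j in range(n_radial):
        r = j / (n_radial - 1)        # 0 = pupil boundary, 1 = iris boundary
        row = []
        for k in range(n_angular):
            t = 2 * math.pi * k / n_angular
            # linear interpolation between the two (possibly non-concentric) circles
            x = (1 - r) * (pcx + pr * math.cos(t)) + r * (icx + ir * math.cos(t))
            y = (1 - r) * (pcy + pr * math.sin(t)) + r * (icy + ir * math.sin(t))
            xi = min(max(int(round(x)), 0), w - 1)
            yi = min(max(int(round(y)), 0), h - 1)
            row.append(image[yi][xi])
        out.append(row)
    return out

# A flat 64x64 image: any pupil size yields the same canonical 8x16 grid.
img = [[128] * 64 for _ in range(64)]
polar = rubber_sheet(img, pupil=(32, 32, 6), iris=(32, 32, 24))
print(len(polar), len(polar[0]))  # 8 16
```

It is this canonical-size polar grid that the Gabor kernels are applied to, which is why two iris codes are always structurally aligned and directly comparable bit by bit.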
Deep learning-based methods, in particular those using various Convolutional Neural Network (CNN) architectures, have been driving remarkable improvements in many computer vision applications over the last decade. In terms of biometric technologies, it is not surprising that iris recognition has also seen an increasing adoption of purely data-driven approaches at all stages of the recognition pipeline: from preprocessing (such as off-axis gaze correction) and segmentation to encoding and matching. Interestingly, however, the impact of deep learning on the various stages of the iris recognition pipeline is uneven. One of the primary goals of this survey is to assess where deep learning has helped in achieving higher-performing and more secure systems, and which procedures have not benefited from more complex modeling.
The remainder of the paper is structured as follows. Sections 2 and 3 review the application of deep learning in the two main stages of the recognition pipeline: segmentation and recognition (encoding and comparison). Sections 4 and 5 analyze the state of the art of deep learning-based approaches in two applications: Presentation Attack Detection (PAD) and forensics. Section 6 investigates how humans and machines can pair to improve deep learning-based iris recognition. Section 7 focuses on approaches to iris and periocular analysis in less controlled environments. Section 8 reviews public resources and tools available in the deep learning-based iris recognition domain. Section 9 focuses on the future of deep learning for iris recognition, discussing emerging research directions in different aspects of iris analysis. The paper is concluded in Section 10.
2 DEEP LEARNING-BASED IRIS SEGMENTATION
The segmentation of the iris is seen as an extremely challenging problem. As illustrated in Fig. 1, segmenting the iris essentially involves three tasks: detecting and parameterizing the inner (pupillary) and outer (scleric) biological boundaries of the iris, and locally discriminating between the noise-free and noisy regions inside the iris ring, which are subsequently used in the feature encoding and matching processes.
This problem has motivated numerous research works for decades. From the pioneering integro-differential operator [34] up to subsequent handcrafted techniques based on active contours and neural networks (e.g., [68], [133], [147] and [175]), a long road has been traveled on this problem. Despite an obvious evolution in the effectiveness of such techniques, they all face particular difficulties in the case of heavily degraded data. Images are frequently motion-blurred, poorly focused, partially occluded, and off-angle. Additionally, in the case of visible-light data, severe reflections from the environments surrounding the subjects are visible, and further augment the difficulties of the segmentation task.
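To make the contrast with the handcrafted baselines concrete, the integro-differential operator can be approximated as below. This is a deliberately simplified stand-in (no Gaussian smoothing, centre assumed known, synthetic image), and all function names are hypothetical rather than taken from any surveyed implementation:

```python
import math

def circular_mean(image, cx, cy, r, n_samples=64):
    """Mean intensity along a circle (the contour integral, discretized)."""
    h, w = len(image), len(image[0])
    total = 0.0
    for k in range(n_samples):
        t = 2 * math.pi * k / n_samples
        x = min(max(int(round(cx + r * math.cos(t))), 0), w - 1)
        y = min(max(int(round(cy + r * math.sin(t))), 0), h - 1)
        total += image[y][x]
    return total / n_samples

def best_radius(image, cx, cy, r_min, r_max):
    """Radius with the sharpest jump in mean circle intensity: a discrete
    stand-in for Daugman's integro-differential boundary search."""
    means = [circular_mean(image, cx, cy, r) for r in range(r_min, r_max + 1)]
    jumps = [abs(means[i + 1] - means[i]) for i in range(len(means) - 1)]
    return r_min + jumps.index(max(jumps))

# Synthetic eye: dark disc (pupil, radius ~10.5) on a bright background.
img = [[40 if (x - 32) ** 2 + (y - 32) ** 2 <= 110 else 200
        for x in range(64)] for y in range(64)]
print(best_radius(img, 32, 32, 5, 20))  # 10
```

On a clean synthetic disc this works perfectly; the degradations listed above (blur, occlusions, off-angle gaze, reflections) are exactly what breaks such intensity-jump criteria and motivates the learned segmentation models surveyed next.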
Recently, as in many other computer vision tasks, DL-based frameworks have been advocated as providing consistent advances over the state of the art for the iris segmentation problem, with numerous models being proposed. A cohesive perspective of the most relevant recent DL-based methods is given in Table 1, with the techniques appearing in chronological (and then alphabetical) order. The type of data each model aims to handle is given in the "Data" column, along with the datasets where the corresponding experiments were carried out and a summary of the main characteristics of each proposal ("Features" column). Here, considering that the models were empirically validated in completely heterogeneous ways and using very different metrics, we decided not to include the summary performance of each model/solution.
Fig. 1. Three main tasks typically associated with iris segmentation: 1) parameterization of the pupillary (inner) boundary; 2) parameterization of the scleric (outer) boundary; and 3) discrimination between the unoccluded (noise-free) and occluded (noisy) regions inside the iris ring. These pieces of information are further used to obtain dimensionless polar representations of the iris texture, on which feature extraction methods typically operate.
Schlett et al. [144] provided a multi-spectral analysis to improve iris segmentation accuracy in visible wavelengths by preprocessing data before the actual segmentation phase, extracting multiple spectral components in the form of RGB color channels. Even though this approach does not propose a DL-based framework, the different versions of the input could easily be used to feed DL-based models and augment their robustness to non-ideal data. Chen et al. [22] used CNNs that include dense blocks, referred to as a dense-fully convolutional network (DFCN), where the encoder part consists of dense blocks and the decoder counterpart obtains the segmentation masks via transpose convolutions. Hofbauer et al. [72] parameterize the iris boundaries based on segmentation maps yielded by a CNN, using a cascaded architecture with four RefineNet units, each directly connected to one residual net. Huynh et al. [76] discriminate between three
distinct eye regions with a DL model, and remove incorrect areas with heuristic filters. The proposed architecture is based on the encoder-decoder model, with depth-wise convolutions used to reduce the computational cost. Roughly at the same time, Li et al. [94] described the Interleaved Residual U-Net model for semantic segmentation and iris mask synthesis. In this work, unsupervised techniques (K-means clustering) were used to create intermediary pictorial representations of the ocular region, from which saliency points deemed to belong to the iris boundaries were found.
Kerrigan et al. [85] assessed the performance of four different convolutional architectures designed for semantic segmentation. Two of these models were based on dilated convolutions, as proposed by Yu and Koltun [188]. Wu and Zhao [186] described the Dense U-Net model, which adds dense layers to the U-Net network. The idea is to take advantage of the reduced set of parameters of the dense U-Net, while keeping the semantic segmentation capabilities of U-Net. The proposed model integrates dense connectivity into the U-Net contraction and expansion paths. Compared with traditional CNNs, this model is claimed to reduce learning redundancy and enhance information flow, while keeping the number of parameters of the model under control. Wei et al. [205] suggested performing dilated convolutions, which is claimed to obtain more consistent global features. In this setting, convolutional kernels are not contiguous, with zero values artificially inserted between each non-zero position, increasing the receptive field without augmenting the number of parameters of the model.
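The zero-insertion idea can be illustrated with a one-dimensional toy convolution; `dilated_conv1d` is a hypothetical helper written for this survey, not code from any of the models above:

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1-D 'valid' convolution with a dilation factor.

    A dilation of d makes the kernel read every d-th sample, which is
    equivalent to inserting d-1 zeros between its taps: the receptive
    field grows to k + (k-1)*(d-1) samples while the number of trainable
    weights stays at k.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective receptive field
    return [sum(kernel[j] * signal[i + j * dilation] for j in range(k))
            for i in range(len(signal) - span + 1)]

sig = [0, 0, 0, 1, 0, 0, 0, 0, 0]
edge = [1, 0, -1]                          # 3 weights in both cases

print(dilated_conv1d(sig, edge, dilation=1))  # kernel spans 3 samples
print(dilated_conv1d(sig, edge, dilation=2))  # same 3 weights span 5 samples
```

Stacking layers with growing dilation rates therefore enlarges the context each output pixel sees, which is the "more consistent global features" claim made for these segmentation networks.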
More recently, Ganeva and Myasnikov [55] compared the effectiveness of three convolutional neural network architectures (U-Net, LinkNet, and FC-DenseNet), determining the optimal parameterization for each one. Jalilian et al. [79] introduced a scheme to compensate for the texture deformations caused by off-angle distortions, re-projecting the off-angle images back to a frontal view. The architecture used is a variant of RefineNet [96], which provides high-resolution predictions while preserving boundary information (required for parameterization purposes).
The idea of interactive learning for iris segmentation was suggested by Sardar et al. [142], describing an interactive variant of U-Net that includes Squeeze-Expand modules. Trokielewicz et al. [172] used DL-based iris segmentation models to extract highly irregular iris texture areas in post-mortem iris images. They used a pre-trained SegNet model, fine-tuned with a database of cadaver iris images. Wang et al. [178] (further extended in [179]) described a lightweight deep convolutional neural network specifically designed for iris segmentation of degraded images acquired by handheld devices. The proposed approach jointly obtains the segmentation mask and the parameterized pupillary/limbic boundaries of the iris.
Observing that edge-based information is extremely hard to obtain reliably in degraded data, Li et al. [7] presented a hybrid method that combines edge-based information with deep learning frameworks. A compact Faster R-CNN-like architecture was used to roughly detect the eye and define the initial region of interest, from which the pupil is further located using a Gaussian mixture model. Wang et al. [184] trained a deep convolutional neural network (DCNN) that automatically extracts the iris and pupil pixels of each eye from input images. This work combines the power of U-Net and SqueezeNet to obtain a compact CNN suitable for real-time mobile applications. Finally, Wang et al. [176] parameterize both the iris mask and the inner/outer iris boundaries jointly, by actively modeling such information into a unified multi-task network.
A nal word is given to segmentation-less techniques. Assuming that the accurate segmentation of
the iris boundaries is one of the hardest phases of the whole recognition chain and the main source
for recognition errors, some recent works have been proposing to perform biometrics recognition
in non-segmented or roughly segmented data [
132
][
135
]. Here, the idea is to use the remarkable
discriminating power of DL-frameworks to perceive the agreeing patterns between pairs of images,
even on such segmentation-less representations.
Table 1. Cohesive comparison of the most relevant DL-based iris segmentation methods (NIR: near-infrared; VW: visible wavelength). Methods are listed in chronological (and then alphabetical) order.

| Method | Year | NIR | VW | Datasets | Features |
|---|---|---|---|---|---|
| Schlett et al. [144] | 2018 | | | MobBIO | Preprocessing (combines different possibilities of the input RGB channels) |
| Trokielewicz and Czajka [166] | 2018 | | | Warsaw-Post-Mortem v1.0 | Fine-tuned CNN (SegNet) |
| Chen et al. [22] | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | Dense CNN |
| Hofbauer et al. [72] | 2019 | ✓ | ✗ | IITD, CASIA-Irisv4-Interval, ND-Iris-0405 | Cascaded architecture of four RefineNets, each connecting to one residual net |
| Huynh et al. [76] | 2019 | ✓ | ✗ | OpenEDS | MobileNetV2 + heuristic filtering postprocessing |
| Li et al. [7] | 2019 | ✓ | ✗ | CASIA-Iris-Thousand | Faster R-CNN (ROI detection) |
| Kerrigan et al. [85] | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, BioSec, ND-Iris-0405, UBIRIS.v2, Warsaw-Post-Mortem v2.0, ND-TWINS-2009-2010 | ResNet + SegNet (with dilated convolutions) |
| Wu and Zhao [186] | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, UBIRIS.v2 | Dense-U-Net (dense layers + U-Net) |
| Wei et al. [205] | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, ND-Iris-0405, UBIRIS.v2 | U-Net with dilated convolutions |
| Fang and Czajka [50] | 2020 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v1.0 & v2.0 | Fine-tuned CC-Net [106] |
| Ganeva and Myasnikov [55] | 2020 | ✓ | ✗ | MMU | U-Net, LinkNet, and FC-DenseNet (performance comparison) |
| Jalilian et al. [79] | 2020 | ✓ | ✗ | | RefineNet + morphological postprocessing |
| Sardar et al. [142] | 2020 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | Squeeze-Expand module + active learning (interactive segmentation) |
| Trokielewicz et al. [172] | 2020 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v1.0 & v2.0 | Fine-tuned SegNet [9] |
| Wang et al. [178] | 2020 | ✓ | ✓ | CASIA-Iris-M1-S1/S2/S3, MICHE-I | Hourglass network |
| Wang et al. [176] | 2020 | ✓ | ✓ | CASIA-v4-Distance, UBIRIS.v2, MICHE-I | U-Net + multi-task attention net + postprocessing (probabilistic mask priors & thresholding) |
| Li et al. [94] | 2021 | ✓ | ✗ | CASIA-Iris-Thousand | IRU-Net network |
| Wang et al. [184] | 2021 | | | Online video streams and Internet videos | U-Net and SqueezeNet for iris segmentation and eye-closure detection |
| Kuehlkamp et al. [91] | 2022 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v2.0 | Fine-tuning of Mask R-CNN architecture |
3 DEEP LEARNING-BASED IRIS RECOGNITION
3.1 Deep Learning Models as a Feature Extractor
As illustrated in Fig. 2, the idea here is to analyze a dimensionless representation of the iris data and
produce a feature vector that lies in a hyperspace (embedding) where recognition is carried out.
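Once such embeddings are extracted, verification typically reduces to a distance or similarity threshold in that hyperspace. The sketch below uses cosine similarity with toy 4-D vectors standing in for CNN feature vectors; the helper names and the threshold value are illustrative assumptions, not taken from any surveyed system:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def verify(probe, gallery, threshold=0.8):
    """Accept the probe if its embedding is close enough to the enrolled one."""
    return cosine_similarity(probe, gallery) >= threshold

# Toy 4-D embeddings standing in for deep feature vectors.
enrolled = [0.9, 0.1, 0.3, 0.2]
same_eye = [0.8, 0.2, 0.35, 0.15]
other_eye = [0.1, 0.9, 0.1, 0.8]

print(verify(same_eye, enrolled))   # True
print(verify(other_eye, enrolled))  # False
```

In practice the embeddings are hundreds or thousands of dimensions, and the operating threshold is tuned on a validation set to trade off false match and false non-match rates.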
In this context, Boyd et al. [15] explored five different sets of weights for the popular ResNet50 architecture to test whether iris-specific feature extractors perform better than models trained for general tasks. Minaee et al. [105] studied the application of deep features extracted from VGG-Net for iris recognition, with the authors observing that the resulting features can be well transferred to biometric recognition. Luo et al. [102] described a DL model with spatial attention and channel attention mechanisms that are directly inserted into the feature extraction module. Also, a co-attention mechanism adaptively fuses features to obtain representative iris-periocular features. Hafner et al. [65] adapted the classical Daugman pipeline, using convolutional neural networks