4 K. Nguyen, H. Proença, F. Alonso-Fernandez
distinct eye regions with a DL model, and removes incorrect areas with heuristic filters. The proposed architecture is based on the encoder-decoder model, with depth-wise convolutions used to reduce the computational cost. Roughly at the same time, Li et al. [94] described the Interleaved Residual U-Net model for semantic segmentation and iris mask synthesis. In this work, unsupervised techniques (K-means clustering) were used to create intermediary pictorial representations of the ocular region, from which saliency points deemed to belong to the iris boundaries were found.
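The computational saving that motivates depth-wise convolutions in the encoder-decoder architecture above can be seen by counting parameters. The sketch below assumes the common depth-wise separable formulation (a per-channel k x k filter followed by a 1 x 1 point-wise convolution that mixes channels); the channel sizes are illustrative, not taken from the cited work:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution mixing all channels."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise step: one k x k filter per input channel.
    Point-wise step: a 1 x 1 convolution that mixes the channels."""
    return c_in * k * k + c_in * c_out

# Illustrative layer sizes (not from the cited paper):
standard = conv_params(64, 64, 3)                  # 36864 parameters
separable = depthwise_separable_params(64, 64, 3)  # 4672 parameters
```

The same ratio (roughly 8x here) also governs multiply-accumulate counts, which is why this factorization is common in mobile-oriented segmentation networks.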
Kerrigan et al. [85] assessed the performance of four different convolutional architectures designed for semantic segmentation. Two of these models were based on dilated convolutions, as proposed by Yu and Koltun [188]. Wu and Zhao [186] described the Dense U-Net model, which combines dense layers with the U-Net network. The idea is to take advantage of the reduced set of parameters of the dense layers, while keeping the semantic segmentation capabilities of U-Net. The proposed model integrates dense connectivity into the U-Net contraction and expansion paths. Compared with traditional CNNs, this model is claimed to reduce learning redundancy and enhance information flow, while keeping the number of parameters of the model under control. Wei et al. [205] suggested performing dilated convolutions, which is claimed to yield more consistent global features. In this setting, the convolutional kernels are not contiguous: zero values are artificially inserted between the non-zero positions, increasing the receptive field without augmenting the number of parameters of the model.
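The zero-insertion just described can be sketched directly: dilating a kernel spreads its taps apart, enlarging the spatial footprint while the number of learned weights stays fixed. This is a minimal NumPy illustration, not the implementation of any cited model:

```python
import numpy as np

def dilate_kernel(k, rate):
    """Insert (rate - 1) zeros between kernel taps; no new parameters are added."""
    kh, kw = k.shape
    out = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1), dtype=k.dtype)
    out[::rate, ::rate] = k
    return out

k = np.arange(1, 10, dtype=float).reshape(3, 3)  # 9 learned weights
k2 = dilate_kernel(k, 2)                         # 5 x 5 footprint, still 9 non-zero taps
```

Stacking such layers with increasing dilation rates grows the receptive field rapidly, which is the effect exploited by Yu and Koltun [188] for context aggregation.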
More recently, Ganeva and Myasnikov [55] compared the effectiveness of three convolutional neural network architectures (U-Net, LinkNet, and FC-DenseNet), determining the optimal parameterization for each one. Jalilian et al. [79] introduced a scheme to compensate for texture deformations caused by off-angle distortions, re-projecting the off-angle images back to a frontal view. The architecture used is a variant of RefineNet [96], which provides high-resolution predictions while preserving boundary information (required for parameterization purposes).
The idea of interactive learning for iris segmentation was suggested by Sardar et al. [142], who described an interactive variant of U-Net that includes Squeeze-Expand modules. Trokielewicz et al. [172] used DL-based iris segmentation models to extract highly irregular iris texture areas in post-mortem iris images. They used a pre-trained SegNet model, fine-tuned with a database of cadaver iris images. Wang et al. [178] (further extended in [179]) described a lightweight deep convolutional neural network specifically designed for iris segmentation of degraded images acquired by handheld devices. The proposed approach jointly obtains the segmentation mask and the parameterized pupillary/limbic boundaries of the iris.
Observing that edge-based information is extremely difficult to obtain reliably in degraded data, Li et al. [7] presented a hybrid method that combines edge-based information with deep learning frameworks. A compact Faster R-CNN-like architecture was used to roughly detect the eye and define the initial region of interest, from which the pupil is further located using a Gaussian mixture model. Wang et al. [184] trained a deep convolutional neural network (DCNN) that automatically extracts the iris and pupil pixels of each eye from input images. This work combines the strengths of U-Net and SqueezeNet to obtain a compact CNN suitable for real-time mobile applications. Finally, Wang et al. [176] parameterize both the iris mask and the inner/outer iris boundaries jointly, by actively modeling such information in a unified multi-task network.
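Such a joint formulation can be pictured as two lightweight heads on a shared feature map: one producing per-pixel mask logits, the other regressing boundary parameters. The sketch below is purely illustrative; the feature sizes, the 1 x 1-convolution-style mask head, and the six-parameter circle encoding (centre and radius for the pupillary and limbic boundaries) are assumptions, not the actual design of [176]:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a shared backbone output: an H x W x C feature map (hypothetical sizes).
feat = rng.standard_normal((16, 16, 8))

# Head 1: per-pixel mask logits, sketched as a 1 x 1 convolution (dot product per pixel).
w_mask = rng.standard_normal(8)
mask_logits = feat @ w_mask                    # shape (16, 16)

# Head 2: boundary parameters regressed from globally pooled features.
# Assumed encoding: (cx, cy, r) for each of the two iris boundaries -> 6 values.
pooled = feat.mean(axis=(0, 1))                # shape (8,)
w_bnd = rng.standard_normal((6, 8))
boundary_params = w_bnd @ pooled               # shape (6,)
```

Training both heads against a shared backbone is what makes the formulation multi-task: the mask and boundary losses are back-propagated through the same features.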
A final word is given to segmentation-less techniques. Assuming that the accurate segmentation of the iris boundaries is one of the hardest phases of the whole recognition chain and the main source of recognition errors, some recent works have proposed performing biometric recognition on non-segmented or roughly segmented data [132], [135]. Here, the idea is to use the remarkable discriminating power of DL frameworks to perceive the agreeing patterns between pairs of images, even on such segmentation-less representations.
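A minimal sketch of this idea: a shared (Siamese-style) embedding network maps roughly cropped, non-segmented ocular images to feature vectors, and recognition reduces to comparing those vectors. The linear "embedding" and all sizes below are stand-ins for a trained deep network, not any cited architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained deep feature extractor: one projection shared by both branches.
W = rng.standard_normal((64, 32 * 32))

def embed(img):
    """Map a roughly cropped 32x32 ocular image (no segmentation) to a 64-d vector."""
    return W @ img.ravel()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

probe = rng.standard_normal((32, 32))
genuine = probe + 0.05 * rng.standard_normal((32, 32))   # same eye, mild perturbation
impostor = rng.standard_normal((32, 32))                 # different eye

score_genuine = cosine(embed(probe), embed(genuine))
score_impostor = cosine(embed(probe), embed(impostor))
```

Because both branches share weights, genuine pairs land close in the embedding space while impostor pairs do not, without any explicit iris-boundary parameterization.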
, Vol. 1, No. 1, Article . Publication date: October 2022.