
InterFace: Adjustable Angular Margin Inter-class Loss for Deep Face
Recognition
Meng Sang1,2, Jiaxuan Chen2,3, Mengzhen Li1,2, Pan Tan1,2, Anning Pan1,2, Shan Zhao1,2, Yang Yang1,2
1Yunnan Normal University, 650500, Kunming, China
2Laboratory of Pattern Recognition and Artificial Intelligence, 650500, Kunming, China
3Zhejiang University, 310058, Hangzhou, China
Emails: sangmeng.one@gmail.com, yyang ynu@163.com
Abstract— In the field of face recognition, improving the loss function so that the face features extracted by the network become more discriminative remains an active research topic. Recent works have improved the discriminative power of face models by step-by-step normalizing softmax into the cosine space and then adding a fixed penalty margin to reduce the intra-class distance and increase the inter-class distance. Although a great deal of previous work has optimized the boundary penalty to improve the discriminative power of the model, adding a fixed margin penalty only between a deep feature and its corresponding weight is not consistent with how data are distributed in real scenarios.
To address this issue, in this paper we propose a novel loss function, InterFace, which releases the constraint of adding a margin penalty only between the deep feature and its corresponding weight and instead pushes class separability by adding corresponding margin penalties between the deep feature and all class weights. To illustrate the advantages of InterFace over a fixed penalty margin, we provide a geometric explanation and comparisons on a set of mainstream benchmarks. From a wider perspective, InterFace advances state-of-the-art face recognition performance on five out of thirteen mainstream benchmarks. All training code, pre-trained models, and training logs are publicly released at https://github.com/iamsangmeng/InterFace.
I. INTRODUCTION
With the development of face recognition technology, it has been applied to many fields of daily life, such as finance, security, and enterprise applications. A face recognition system consists of image acquisition, face detection, face alignment, feature extraction, and feature matching. During face matching, the vectors produced by feature extraction are used to measure the similarity with all enrolled faces. Intuitively, the model should yield small intra-class distances for samples of the same identity and large inter-class distances for samples of different identities. Hence, the face system must draw the right decision boundary: it should accept a face even if its appearance changes dramatically under the same identity, and it should reject impostors of different identities.
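For concreteness, the matching step usually reduces to a cosine-similarity comparison between the probe embedding and every enrolled (gallery) embedding, with a threshold acting as the decision boundary. The following is a minimal PyTorch-style sketch; the function name, gallery layout, and threshold value are illustrative assumptions, not part of the system described in this paper.

import torch
import torch.nn.functional as F

def match_probe(probe, gallery, threshold=0.3):
    # probe: (d,) embedding of the query face; gallery: (N, d) enrolled embeddings
    sims = F.normalize(gallery, dim=1) @ F.normalize(probe, dim=0)  # (N,) cosine similarities
    best = int(sims.argmax())
    # accept the closest identity only if it clears the decision boundary,
    # otherwise reject the probe as an impostor
    return (best, float(sims[best])) if float(sims[best]) >= threshold else None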
Although the accuracy of face recognition has improved
greatly, it has not yet achieved the expected results [13].
Most of the recent related studies [20], [22], [4], [16], [15],
[24], [5], [10], [11], [6], [18], [2] have focused on improving
the loss function. Our work focuses on the problem of the fixed margin penalty in the classification-task direction (paragraph b below), and we propose our solution based on ArcFace [5]. Next, we review the related work and problems in previous loss studies and summarize them along two directions, metric learning and classification tasks, as follows.
a) Metric Learning: In the direction of metric learning, the design of losses [20], [22], [4] is based on triplets. FaceNet
[20] directly learns a mapping from face images to a compact
Euclidean space where distances directly correspond to a
measure of face similarity. The N-pair loss [22] proposes an objective that, first, generalizes the triplet loss by allowing joint comparison among more than one negative example, specifically N−1 negative examples, and, second, reduces the computational burden of evaluating deep embedding vectors via an efficient batch construction strategy that uses only N pairs of examples instead of (N+1)N. Contrastive [4]
losses learn a function that maps input patterns into a target
space such that the L1 norm in the target space approximates the "semantic" distance in the input space. However, the
number of triplets explodes during the training period as the
number of samples in the training dataset increases.
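As a concrete illustration of the triplet idea, the sketch below implements the standard triplet loss on a batch of anchor, positive, and negative embeddings. It is a minimal PyTorch-style sketch of our own, not the cited implementations; the margin value 0.2 follows the setting reported for FaceNet, and the normalization step reflects that the embeddings lie on a hypersphere.

import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # L2-normalize the embeddings so distances are comparable across samples
    anchor, positive, negative = (F.normalize(x, dim=1) for x in (anchor, positive, negative))
    pos_dist = (anchor - positive).pow(2).sum(dim=1)  # squared distance to the same identity
    neg_dist = (anchor - negative).pow(2).sum(dim=1)  # squared distance to a different identity
    # hinge: the positive must be closer than the negative by at least `margin`
    return F.relu(pos_dist - neg_dist + margin).mean()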
b) Classification Task: In the direction of the classifi-
cation task, subsequent losses [16], [15], [24], [5], [10], [11],
[6], [18], [2] are designed on the basis of the softmax loss. Liu et al. [16] proposed the large-margin softmax (L-Softmax), which introduces a margin penalty into softmax to encourage intra-class compactness and inter-class separability between
learned features. SphereFace [15] extends previous work
on L-Softmax by further constraining the weights of fully
connected layers to impose discriminative constraints on
a hypersphere manifold, which intrinsically matches the
prior that faces also lie on a manifold. SphereFace deploys
a multiplicative angular penalty margin between the deep
features and their corresponding weights. CosFace [24] proposes to add an additive cosine margin between the deep features and the weights; it fixes the norm of the deep feature and of the corresponding weight and rescales the deep feature norm to a constant s. ArcFace
[5] proposed an additive angular margin, deploying the penalty margin directly on the angle between the deep features and
their corresponding weights.
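To make these margins concrete, the sketch below is a PyTorch-style illustration of our own, not the authors' released code; s=64 and m=0.5 are commonly used settings. It shows how CosFace subtracts the margin from the target-class cosine while ArcFace adds it to the target-class angle, before scaling the logits by s and feeding them to a standard cross-entropy loss.

import torch
import torch.nn.functional as F

def margin_logits(features, weight, labels, s=64.0, m=0.5, kind="arcface"):
    # cos(theta_j): cosine between L2-normalized deep features and class weights
    cosine = F.linear(F.normalize(features), F.normalize(weight))  # (batch, classes)
    one_hot = torch.zeros_like(cosine).scatter_(1, labels.view(-1, 1), 1.0)
    if kind == "cosface":
        cosine = cosine - one_hot * m                    # cos(theta_y) - m
    else:  # "arcface"
        theta = torch.acos(cosine.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        cosine = torch.cos(theta + one_hot * m)          # cos(theta_y + m)
    return s * cosine  # logits for F.cross_entropy(logits, labels)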
The great success of softmax