
CIKM ’22, October 17–21, 2022, Atlanta, GA, USA Xiangyang Li et al.
Model for Pre-Ranking System. In Proceedings of the 31st ACM International
Conference on Information and Knowledge Management (CIKM ’22), October
17–21, 2022, Atlanta, GA, USA. ACM, New York, NY, USA, 11 pages.
https://doi.org/10.1145/3511808.3557072
1 INTRODUCTION
Existing industrial information services, such as recommender systems,
search engines, and advertisement systems, adopt a multi-stage cascade
ranking architecture, which balances efficiency and effectiveness better
than a single-stage architecture [22]. A typical cascade ranking system
consists of Recall, Pre-Ranking, Ranking, and Re-Ranking stages
(Figure 1(a)). The early stages face a massive number of candidates and
thus use simple models (e.g., LR and DSSM [9]) to guarantee low inference
latency. In contrast, the later stages select more carefully the items
that meet the user’s preferences, and hence complex models (e.g.,
DeepFM [6] and AutoInt [26]) are used to improve prediction accuracy.
[Figure 1: The multi-stage cascade ranking architecture and the
comparison of model prediction accuracy and inference efficiency.
(a) Cascade ranking system: Recall, Pre-Ranking, Ranking, Re-Ranking,
and Display stages, with candidate counts shrinking from millions to
thousands, hundreds, and tens. (b) Comparison between prediction accuracy
and inference efficiency for pre-ranking and ranking models: FTRL’11,
DSSM’13, DCN’17, DeepFM’17, xDeepFM’18, AutoInt’19, COLD’20, DAT’21,
FSCD’21, IntTower’22.]
The pre-ranking stage sits in the middle of the cascade. It preliminarily
filters the items (on the scale of thousands) retrieved by the preceding
recall stage and generates candidates (on the scale of hundreds) for the
subsequent ranking stage. Therefore, both effectiveness and efficiency
need to be carefully considered. Figure 1(b) depicts some representative
pre-ranking and ranking models from the perspective of prediction
accuracy and inference efficiency. Compared with ranking models,
pre-ranking models need to score more candidate items for each user
request. Therefore, pre-ranking models have higher inference efficiency
but weaker prediction performance due to their simpler structure.
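The funnel described above can be sketched as follows. This is only a toy illustration: the candidate counts, the noise levels, and the helper names (`keep_top_k`, `cheap_score`, `costly_score`) are assumptions made for the example and do not come from any deployed system.

```python
import numpy as np

rng = np.random.default_rng(0)

def keep_top_k(candidate_ids, scores, k):
    """Return the k candidate ids with the highest scores, best first."""
    order = np.argsort(-scores)[:k]
    return candidate_ids[order]

# Hypothetical ground-truth relevance for 5,000 items returned by recall.
true_relevance = rng.normal(size=5000)

def cheap_score(ids):
    """Stand-in for a pre-ranking model: fast but noisy (assumed)."""
    return true_relevance[ids] + rng.normal(scale=1.0, size=len(ids))

def costly_score(ids):
    """Stand-in for a ranking model: slower but more accurate (assumed)."""
    return true_relevance[ids] + rng.normal(scale=0.1, size=len(ids))

recalled = np.arange(5000)                                        # thousands
pre_ranked = keep_top_k(recalled, cheap_score(recalled), k=300)   # hundreds
ranked = keep_top_k(pre_ranked, costly_score(pre_ranked), k=20)   # tens
```

The costly scorer only ever sees the 300 survivors of the cheap scorer, which is why the pre-ranking model must be efficient yet accurate enough not to discard good items early.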
In the evolution of pre-ranking systems, LR (Logistic Regression) [19] is
the most basic personalized pre-ranking model and was widely used in the
shallow machine learning era [20, 28, 31]. With the rise of deep
learning, many companies have deployed various deep models in their
commercial systems. The dominant pre-ranking model in industry is the
two-tower model [9] (i.e., DSSM), which uses neural networks to capture
the interactive signals within the user tower and within the item tower.
Moreover, the item representations can be pre-calculated offline and
stored in a fast retrieval container. During online serving, only the
user representations need to be calculated in real time, while the
representations of candidate items can be retrieved directly. This
“user-item decoupling architecture” paradigm provides excellent
efficiency. Besides, COLD [32] and FSCD [18] propose a single-tower
structure to fully model feature interactions and further improve
prediction accuracy.
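A minimal sketch of the user-item decoupling idea, assuming plain ReLU MLP towers and a dot-product score. The layer sizes, the random weights, and the names `mlp_tower`, `item_store`, and `score_request` are hypothetical and serve only to show where the offline and online work happens.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_tower(x, weights):
    """Apply dense layers with ReLU and return the L2-normalised output."""
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)
    norm = np.linalg.norm(h, axis=-1, keepdims=True)
    return h / np.maximum(norm, 1e-12)

# Hypothetical sizes: 16 raw features per side, 8-dim final representation.
user_weights = [rng.normal(size=(16, 32)), rng.normal(size=(32, 8))]
item_weights = [rng.normal(size=(16, 32)), rng.normal(size=(32, 8))]

# Offline: run the item tower once for every item and store the vectors.
# A NumPy array stands in for the fast retrieval container.
item_features = rng.normal(size=(1000, 16))
item_store = mlp_tower(item_features, item_weights)        # shape (1000, 8)

def score_request(user_feature, candidate_ids):
    """Online: one user-tower pass, then one dot product per candidate."""
    user_vec = mlp_tower(user_feature[None, :], user_weights)[0]
    # User and item vectors meet only here, at the output layer.
    return item_store[candidate_ids] @ user_vec

candidate_ids = np.arange(200)
scores = score_request(rng.normal(size=16), candidate_ids)
top_k = candidate_ids[np.argsort(-scores)[:10]]
```

Because the item tower never sees user features, its outputs can be cached; the per-request cost is one user-tower pass plus a lookup and a dot product per candidate.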
Despite their promise, existing pre-ranking models struggle to balance
model effectiveness and inference efficiency. For the two-tower model,
the cost of high efficiency is the neglect of information interaction
between the user and item towers. The two towers perform intra-tower
information extraction in parallel and independently, and the learned
latent representations do not interact until the output layer. This is
referred to as “Late Interaction” [36] and critically hinders model
performance, because the interactive signals between user features and
item features are vital for prediction [30]. Though DAT [35] attempts to
alleviate this issue by implicitly modeling the information interaction
between the two towers, the performance gain is still limited. As for the
single-tower pre-ranking models (i.e., COLD and FSCD), although several
optimization tricks are introduced for acceleration, the efficiency
degradation is still severe (×10).
To solve the efficiency-accuracy dilemma, we propose a next-generation
two-tower model for pre-ranking systems, named Interaction enhanced
Two-Tower (IntTower), as illustrated in Figure 3. The core idea is to
enhance the information interaction between the user and item towers
while keeping the “user-item decoupling architecture” paradigm. By
introducing fine-grained feature interaction modeling, the model capacity
of the two-tower structure can be improved significantly while its high
inference efficiency is maintained. Specifically, IntTower first
leverages a lightweight Light-SE module to identify the importance of
different features and obtain refined feature representations. Based on
the refined representations, the user and item towers apply multi-layer
nonlinear transformations to extract latent representations. To capture
the interactive signals between user and item representations, IntTower
designs the FE-Block module and the CIR module from explicit and implicit
perspectives, respectively. The FE-Block module performs fine-grained and
early feature interactions between the multi-layer user representations
and the last-layer item representation. This multi-level feature
interaction modeling improves prediction accuracy, while the user-item
decoupling architecture preserves high inference efficiency. Moreover,
the CIR module introduces a contrastive interaction regularization to
further enhance the interactions between user and item representations.
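The three components can be sketched roughly as below. The introduction does not give their formulas, so everything here is an assumption: the SE-style softmax gate in `light_se`, the max-then-sum matching over sub-vectors in `early_interaction_score`, and the InfoNCE-style loss in `contrastive_reg` are illustrative stand-ins, not the definitions of Light-SE, FE-Block, or CIR.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def light_se(field_emb, w):
    """Assumed SE-style gate: pool each field embedding to a scalar, pass the
    pooled vector through one dense layer, and rescale every field."""
    pooled = field_emb.mean(axis=-1)                  # (batch, fields)
    gate = softmax(pooled @ w, axis=-1)               # (batch, fields)
    return field_emb * gate[..., None]                # (batch, fields, dim)

def tower(x, weights):
    """ReLU MLP that returns the hidden representation after every layer."""
    outs, h = [], x
    for w in weights:
        h = np.maximum(h @ w, 0.0)
        outs.append(h)
    return outs

def early_interaction_score(user_layers, item_last, proj_u, proj_i, heads):
    """Assumed explicit interaction: split each projected user layer and the
    projected last item layer into `heads` sub-vectors, take all pairwise
    inner products, keep the best item match per user sub-vector, and sum."""
    b = item_last.shape[0]
    item_sub = (item_last @ proj_i).reshape(b, heads, -1)   # cacheable offline
    score = np.zeros(b)
    for h_u, p in zip(user_layers, proj_u):
        user_sub = (h_u @ p).reshape(b, heads, -1)
        sims = np.einsum("bhd,bkd->bhk", user_sub, item_sub)
        score += sims.max(axis=-1).sum(axis=-1)
    return score

def contrastive_reg(user_vec, item_vec, temperature=0.1):
    """Assumed InfoNCE-style regulariser: row i's item is the positive for
    user i, and the other items in the batch act as negatives."""
    u = user_vec / np.maximum(np.linalg.norm(user_vec, axis=1, keepdims=True), 1e-12)
    v = item_vec / np.maximum(np.linalg.norm(item_vec, axis=1, keepdims=True), 1e-12)
    probs = softmax((u @ v.T) / temperature, axis=1)
    return -np.log(np.diag(probs) + 1e-12).mean()

# Hypothetical sizes: 4 samples, 5 feature fields, 8-dim embeddings, 2 heads.
batch, fields, dim, heads = 4, 5, 8, 2
user_emb = light_se(rng.normal(size=(batch, fields, dim)), rng.normal(size=(fields, fields)))
item_emb = light_se(rng.normal(size=(batch, fields, dim)), rng.normal(size=(fields, fields)))

sizes = [fields * dim, 32, 16]
user_w = [rng.normal(size=(a, b)) * 0.1 for a, b in zip(sizes[:-1], sizes[1:])]
item_w = [rng.normal(size=(a, b)) * 0.1 for a, b in zip(sizes[:-1], sizes[1:])]
user_layers = tower(user_emb.reshape(batch, -1), user_w)
item_last = tower(item_emb.reshape(batch, -1), item_w)[-1]

proj_u = [rng.normal(size=(h.shape[1], heads * 4)) for h in user_layers]
proj_i = rng.normal(size=(item_last.shape[1], heads * 4))
logit = early_interaction_score(user_layers, item_last, proj_u, proj_i, heads)
reg = contrastive_reg(user_layers[-1], item_last)
```

In this sketch the item side depends only on item features, so `item_sub` could still be computed offline; only the user layers and the small matching step run per request, which is how the decoupled serving pattern is kept.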
Our main contributions are summarized as follows: (1) We propose
IntTower, a next-generation two-tower model for the pre-ranking system,
which emphasizes both high prediction accuracy and inference efficiency.
(2) IntTower leverages a lightweight Light-SE module to obtain refined
feature representations. Based on this, the FE-Block module and the CIR
module are proposed to capture the interactive signals between user and
item representations from explicit and implicit perspectives.
(3) Comprehensive experiments are conducted on three public datasets to
demonstrate the superiority of IntTower in both prediction accuracy and
inference efficiency. Moreover, we further verify the effectiveness of
IntTower on a large-scale advertisement pre-ranking system.