A Study on the Efficiency and Generalization of Light Hybrid Retrievers
Man Luo1, Shashank Jain2, Anchit Gupta2, Arash Einolghozati2,
Barlas Oguz2, Debojeet Chatterjee2, Xilun Chen2, Chitta Baral1, Peyman Heidari2
1Arizona State University
2Meta Reality Lab
1{mluo26, chitta}@asu.edu
2{shajain, anchit, arashe, barlaso, debo, xilun, peymanheidari}@fb.com
Abstract
Hybrid retrievers can take advantage of both sparse and dense retrievers. Previous hybrid retrievers leverage indexing-heavy dense retrievers. In this work, we study the question "Is it possible to reduce the indexing memory of hybrid retrievers without sacrificing performance?" Driven by this question, we leverage an indexing-efficient dense retriever (i.e., DrBoost) and introduce a LITE retriever that further reduces the memory of DrBoost. LITE is jointly trained on contrastive learning and knowledge distillation from DrBoost. We then integrate BM25, a sparse retriever, with either LITE or DrBoost to form light hybrid retrievers. Our Hybrid-LITE retriever saves 13x memory while maintaining 98.0% of the performance of the hybrid retriever of BM25 and DPR. In addition, we study the generalization capacity of our light hybrid retrievers on an out-of-domain dataset and a set of adversarial attack datasets. Experiments show that light hybrid retrievers achieve better generalization performance than individual sparse and dense retrievers. Nevertheless, our analysis shows that there is large room to improve the robustness of retrievers, suggesting a new research direction.
1 Introduction
Classical IR methods, such as BM25 (Robertson et al., 2009), produce sparse vectors for questions and documents based on bag-of-words approaches. Recent research has focused on building neural retrievers, which learn dense embeddings of the query and document in a semantic space (Karpukhin et al., 2020; Khattab and Zaharia, 2020). Sparse and dense retrievers have their own pros and cons, and a hybrid of the two can take advantage of both worlds and achieve better performance than either retriever alone. Therefore, hybrid retrievers are widely used in practice (Ma et al., 2021b; Chen et al., 2021).
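For concreteness, the sparse side of such a hybrid can be as simple as the following minimal sketch using the third-party rank_bm25 package (an illustrative implementation choice, not necessarily the one used in this paper; the corpus strings are made up):

```python
# Minimal BM25 sparse-retrieval sketch using the rank_bm25 package
# (illustrative only; not necessarily the implementation used in the paper).
from rank_bm25 import BM25Okapi

corpus = [
    "hybrid retrievers combine sparse and dense scores",
    "BM25 is a classical bag-of-words ranking function",
    "dense retrievers embed queries and documents in a semantic space",
]
tokenized = [doc.split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

query = "sparse bag-of-words retrieval".split()
print(bm25.get_scores(query))  # one lexical-overlap score per document
```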
Figure 1: The teacher model (DrBoost) consists of N weak learners and produces embeddings of dimension N*D. The student model (LITE) has one weak learner and produces two embeddings: one of dimension D and one of dimension N*D. The smaller embedding learns to maximize the similarity between question and positive context embeddings, and the larger embedding learns to match the embeddings of the teacher model.
Previous hybrid retrievers are composed of indexing-heavy dense retrievers (DRs). In this work, we study the question "Is it possible to reduce the indexing memory of hybrid retrievers without sacrificing performance?" To answer this question, we reduce the memory by using the state-of-the-art indexing-efficient retriever, DrBoost (Lewis et al., 2021), a boosting retriever with multiple "weak" learners. Compared to DPR (Karpukhin et al., 2020), a representative DR, DrBoost reduces the indexing memory by 6 times while maintaining the performance. We introduce a LITE model that further reduces the memory of DrBoost; LITE is jointly trained on the retrieval task via contrastive learning and on knowledge distillation from DrBoost (see Figure 1). We then integrate BM25 with either LITE or DrBoost to form light hybrid retrievers (Hybrid-LITE and Hybrid-DrBoost) and assess whether light hybrid retrievers can achieve memory efficiency and sufficient performance.
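The paper does not spell out the exact training objective here, but a minimal sketch of such a joint objective, assuming an in-batch-negatives contrastive loss on the small (D-dim) student embedding and an MSE distillation loss pulling the large (N*D-dim) student embedding toward a frozen DrBoost teacher embedding, might look as follows (all function and argument names, and the loss weighting, are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def lite_joint_loss(q_small, c_small, q_large, c_large,
                    q_teacher, c_teacher, alpha=0.5, temperature=1.0):
    """Sketch of a LITE-style joint objective (illustrative, not the paper's exact loss).

    q_small, c_small:     (B, D)   student question/context embeddings (retrieval head)
    q_large, c_large:     (B, N*D) student embeddings distilled toward the teacher
    q_teacher, c_teacher: (B, N*D) frozen DrBoost teacher embeddings
    """
    # Contrastive loss with in-batch negatives: the i-th question should
    # score highest against the i-th (positive) context.
    scores = q_small @ c_small.t() / temperature  # (B, B) similarity matrix
    labels = torch.arange(scores.size(0), device=scores.device)
    contrastive = F.cross_entropy(scores, labels)

    # Distillation loss: match the teacher's embedding space.
    distill = F.mse_loss(q_large, q_teacher) + F.mse_loss(c_large, c_teacher)

    return contrastive + alpha * distill
```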
We conduct experiments on the NaturalQuestions dataset (Kwiatkowski et al., 2019) and draw interesting conclusions. First of all, the LITE retriever maintains 98.7% of the teacher model's performance while reducing its memory by 2 times. Second, our Hybrid-LITE saves more than 13x memory compared to Hybrid-DPR while maintaining more than 98.0% of its performance, and Hybrid-DrBoost reduces the indexing memory (8x) compared to Hybrid-DPR while maintaining at least 98.5% of the performance. This shows that a light hybrid model can achieve sufficient performance while significantly reducing the indexing memory, which suggests the practical usage of light retrievers for memory-limited applications, such as on-device retrieval.
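To make the scale of these savings concrete, some back-of-the-envelope arithmetic (the corpus size and dimensions below are illustrative assumptions, not the paper's reported configurations; quantization and other tricks change the exact ratios):

```python
# Back-of-the-envelope index sizes (illustrative; actual configs may differ).
NUM_PASSAGES = 21_000_000  # rough order of magnitude of a Wikipedia passage index

def index_gb(dim, bytes_per_value=4):  # float32 by default
    return NUM_PASSAGES * dim * bytes_per_value / 1e9

print(f"768-dim float32 (DPR-like): {index_gb(768):.1f} GB")  # ~64.5 GB
print(f"32-dim float32 vectors:     {index_gb(32):.1f} GB")   # ~2.7 GB
```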
One important reason for using hybrid retrievers in real-world applications is generalization. Thus, we further study whether reducing the indexing memory hampers the generalization of light hybrid retrievers. Two prominent ideas have emerged for testing generalization: out-of-domain (OOD) generalization and adversarial robustness (Gokhale et al., 2022). We study the OOD generalization of retrievers on EntityQuestions (Sciavolino et al., 2021). To study robustness, we leverage six techniques (Morris et al., 2020) to create adversarial attack test sets based on the NQ dataset. Our experiments demonstrate that Hybrid-LITE and Hybrid-DrBoost achieve better generalization performance than their individual components. The study of robustness shows that hybrid retrievers are consistently better than sparse and dense retrievers. Nevertheless, all retrievers are vulnerable, suggesting room for improving the robustness of retrievers, and our datasets can aid future research.
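Morris et al. (2020) is the TextAttack library, whose off-the-shelf augmenters can perturb queries in the way described above. A minimal sketch of this kind of test-set construction follows; the three augmenters shown are examples of what the library provides, not necessarily the exact six techniques used in this paper, and the sample question is made up:

```python
# Sketch: perturbing NQ-style questions with off-the-shelf TextAttack augmenters
# (Morris et al., 2020). Illustrative; not necessarily the paper's exact six techniques.
from textattack.augmentation import (
    CharSwapAugmenter,    # character-level swaps/typos
    WordNetAugmenter,     # WordNet synonym substitution
    EmbeddingAugmenter,   # nearest-neighbor word swaps in embedding space
)

question = "who got the first nobel prize in physics"
for augmenter in (CharSwapAugmenter(), WordNetAugmenter(), EmbeddingAugmenter()):
    perturbed = augmenter.augment(question)  # returns a list of perturbed strings
    print(type(augmenter).__name__, "->", perturbed[0])
```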
2 Related Work
Hybrid Retriever integrates a sparse and a dense retriever and ranks documents by interpolating the relevance scores from each retriever. The most popular way to obtain the hybrid ranking is to apply a linear combination of the sparse and dense retriever scores (Karpukhin et al., 2020; Ma et al., 2020; Luan et al., 2021; Ma et al., 2021a; Luo et al., 2022). Instead of using the scores, Chen et al. (2022) adopt Reciprocal Rank Fusion (Cormack et al., 2009), which obtains the final ranking from the rank positions of each candidate retrieved by the individual retrievers (both fusion strategies are sketched below). Arabzadeh et al. (2021) train a classification model to select one of the retrieval strategies: sparse, dense, or hybrid. Most hybrid models rely on heavy dense retrievers; one exception is Ma et al. (2021a), who use linear projection, PCA, and product quantization (Jegou et al., 2010) to compress the dense retriever component. Our hybrid retrievers use either DrBoost or our proposed LITE as the dense retriever, which are more memory-efficient and achieve better performance than the methods used in Ma et al. (2021a).
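A minimal sketch of the two fusion strategies just described (function names and the interpolation weight are illustrative assumptions, not the paper's exact setup):

```python
# Sketch of the two hybrid-fusion strategies described above (illustrative).

def linear_fusion(sparse_scores, dense_scores, lam=0.5):
    """Linearly interpolate per-document relevance scores from two retrievers.
    sparse_scores / dense_scores: dict mapping doc_id -> score."""
    docs = set(sparse_scores) | set(dense_scores)
    return {d: lam * sparse_scores.get(d, 0.0)
               + (1 - lam) * dense_scores.get(d, 0.0) for d in docs}

def reciprocal_rank_fusion(rankings, k=60):
    """RRF (Cormack et al., 2009): fuse ranked lists using only rank positions.
    rankings: list of ranked doc_id lists, one per retriever."""
    fused = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)
```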
Indexing-Efficient Dense Retriever. Efficiency has two dimensions: latency (Seo et al., 2019; Lee et al., 2021; Varshney et al., 2022) and memory. In this work, our primary focus is memory, specifically the memory used for indexing. Most existing DRs are indexing-heavy (Karpukhin et al., 2020; Khattab and Zaharia, 2020; Luo, 2022). There are mainly three types of techniques for improving indexing efficiency. The first is product quantization of the vectors (Jegou et al., 2010). The second is to compress a high-dimensional dense vector into a low-dimensional one, e.g., from 768 to 32 dimensions (Lewis et al., 2021; Ma et al., 2021a). The third is to use binary vectors (Yamada et al., 2021; Zhan et al., 2021). Our proposed method LITE (§3.2) reduces the indexing memory by jointly training on the retrieval task and on knowledge distillation from a teacher model.
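As a concrete instance of the first route, product quantization is readily available in FAISS; a minimal sketch follows (the dimensions and sub-quantizer settings are illustrative assumptions, and the random vectors stand in for real passage embeddings):

```python
# Sketch: product quantization (Jegou et al., 2010) with FAISS, the first of
# the three compression routes above. Compresses 768-dim float32 vectors
# (3072 bytes each) to M=96 one-byte codes (96 bytes each), a 32x reduction.
import faiss
import numpy as np

d, M, nbits = 768, 96, 8           # 96 sub-quantizers, 8 bits (1 byte) each
index = faiss.IndexPQ(d, M, nbits)

xb = np.random.rand(10_000, d).astype("float32")  # stand-in for passage embeddings
index.train(xb)                    # learn the sub-quantizer codebooks
index.add(xb)                      # store 96-byte codes instead of raw vectors

xq = np.random.rand(1, d).astype("float32")
distances, ids = index.search(xq, 10)             # approximate nearest neighbors
```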
Generalization of IR. Two main benchmarks have been proposed to study the OOD generalization of retrievers: BEIR (Thakur et al., 2021b) and EntityQuestions (Sciavolino et al., 2021). As shown by previous work (Thakur et al., 2021b; Chen et al., 2022), generalization is a major concern for DRs. To address this limitation, Wang et al. (2021) proposed GPL, a domain adaptation technique that generates synthetic question-answer pairs in specific domains. A follow-up work (Thakur et al., 2022) trains BPR and JPQ on the GPL synthetic data to achieve both efficiency and generalization. Chen et al. (2022) investigate a hybrid model in the OOD setting; however, different from us, they use a heavy DR and are not concerned with the indexing memory. Most existing work studies OOD generalization, and much less attention has been paid to the robustness of retrievers (Penha et al., 2022; Zhuang and Zuccon, 2022; Chen et al.). To study robustness, Penha et al. (2022) identify four ways to change the syntax of queries without changing their semantics. Our work is complementary to Penha et al. (2022): we leverage adversarial attack techniques (Morris et al., 2020) to create six different test sets for the NQ dataset (Kwiatkowski et al., 2019).