98.7% of the teacher model performance and reduces its memory by 2 times. Second, our Hybrid-LITE saves more than 13× memory compared to Hybrid-DPR while maintaining more than 98.0% of its performance, and Hybrid-DrBoost reduces the indexing memory (8×) compared to Hybrid-DPR while maintaining at least 98.5% of the performance. This shows that a light hybrid model can achieve sufficient performance while significantly reducing the indexing memory, which suggests the practical value of light retrievers for memory-limited applications, such as on-device retrieval.
One important reason for using hybrid retrievers in real-world applications is generalization. Thus, we further study whether reducing the indexing memory hampers the generalization of light hybrid retrievers. Two prominent ideas have emerged to test generalization: out-of-domain (OOD) generalization and adversarial robustness (Gokhale et al., 2022). We study the OOD generalization of retrievers on EntityQuestion (Sciavolino et al., 2021). To study robustness, we leverage six techniques (Morris et al., 2020) to create adversarial attack test sets based on the NQ dataset (see the sketch below). Our experiments demonstrate that Hybrid-LITE and Hybrid-DrBoost achieve better generalization performance than their individual components. The study of robustness shows that hybrid retrievers are consistently better than sparse and dense retrievers. Nevertheless, all retrievers are vulnerable, which suggests room for improving the robustness of retrievers; our datasets can aid future research.
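To illustrate how such attack sets can be constructed, below is a minimal sketch using the TextAttack library (Morris et al., 2020); the two augmenters shown are our own illustrative choices and are not necessarily among the six techniques used in our study.

```python
# pip install textattack
from textattack.augmentation import CharSwapAugmenter, WordNetAugmenter

# Illustrative perturbations: character swaps and WordNet synonym substitution.
# (Our own example choices; not necessarily the paper's six attack types.)
augmenters = [CharSwapAugmenter(), WordNetAugmenter()]

query = "who wrote the declaration of independence"
for augmenter in augmenters:
    # augment() returns a list of perturbed variants of the input query.
    print(type(augmenter).__name__, augmenter.augment(query))
```

Applying such perturbations to every question in a test collection, while keeping the gold answers fixed, yields an adversarial variant of the original evaluation set.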
2 Related Work
Hybrid Retriever integrates a sparse and a dense retriever and ranks documents by interpolating the relevance scores from each retriever. The most popular way to obtain the hybrid ranking is to apply a linear combination of the sparse/dense retriever scores (Karpukhin et al., 2020; Ma et al., 2020; Luan et al., 2021; Ma et al., 2021a; Luo et al., 2022). Instead of using the scores, Chen et al. (2022) adopt Reciprocal Rank Fusion (Cormack et al., 2009), which obtains the final ranking from the rank positions of each candidate retrieved by the individual retrievers. Arabzadeh et al. (2021) train a classification model to select one of three retrieval strategies: sparse, dense, or hybrid. Most hybrid models rely on heavy dense retrievers; one exception is Ma et al. (2021a), who use linear projection, PCA, and product quantization (Jegou et al., 2010) to compress the dense retriever component. Our hybrid retrievers use either DrBoost or our proposed LITE as the dense retriever, both of which are more memory-efficient and achieve better performance than the methods used by Ma et al. (2021a).
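To make the two fusion strategies concrete, the following is a minimal sketch (with toy scores and our own function names, not the implementation of any cited work) of linear score interpolation and Reciprocal Rank Fusion.

```python
def linear_fusion(sparse_scores, dense_scores, alpha=0.5):
    """Hybrid score: alpha * sparse + (1 - alpha) * dense, per document.

    BM25 and dense-encoder scores live on different scales, so in
    practice they are often min-max normalized before interpolation.
    """
    docs = set(sparse_scores) | set(dense_scores)
    return {
        d: alpha * sparse_scores.get(d, 0.0)
           + (1 - alpha) * dense_scores.get(d, 0.0)
        for d in docs
    }

def reciprocal_rank_fusion(rankings, k=60):
    """RRF (Cormack et al., 2009): score(d) = sum_r 1 / (k + rank_r(d))."""
    scores = {}
    for ranking in rankings:  # each ranking is a list of doc ids, best first
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example with two retrievers and three documents.
sparse = {"d1": 12.3, "d2": 8.1}  # e.g., BM25 scores
dense = {"d1": 0.52, "d3": 0.71}  # e.g., inner-product scores
print(linear_fusion(sparse, dense, alpha=0.3))
print(reciprocal_rank_fusion([["d1", "d2"], ["d3", "d1"]]))
```

RRF only needs rank positions, which sidesteps the score-scale mismatch that linear interpolation must handle via normalization or a tuned interpolation weight.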
Indexing-Efficient Dense Retriever. Efficiency includes two dimensions: latency (Seo et al., 2019; Lee et al., 2021; Varshney et al., 2022) and memory. In this work, our primary focus is on memory, specifically the memory used for indexing. Most existing DRs are indexing-heavy (Karpukhin et al., 2020; Khattab and Zaharia, 2020; Luo, 2022). To improve indexing efficiency, there are mainly three types of techniques. The first is vector product quantization (Jegou et al., 2010). The second is to compress a high-dimensional dense vector into a low-dimensional one, e.g., from 768 to 32 dimensions (Lewis et al., 2021; Ma et al., 2021a). The third is to use a binary vector (Yamada et al., 2021; Zhan et al., 2021). Our proposed method LITE (§3.2) reduces the indexing memory by jointly training the retrieval task and knowledge distillation from a teacher model.
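As a back-of-the-envelope illustration of why these techniques matter, the sketch below computes flat-index sizes for a corpus of roughly 21M passages (approximately the size of the Wikipedia split used by DPR; the round number is an assumption for illustration).

```python
# Flat-index footprint for ~21M passages (roughly the DPR Wikipedia corpus;
# the round number below is an assumption for illustration).
NUM_PASSAGES = 21_000_000

def index_gb(num_vectors: int, dim: int, bytes_per_component: float) -> float:
    """Uncompressed flat-index size in gigabytes."""
    return num_vectors * dim * bytes_per_component / 1e9

print(f"768-d float32: {index_gb(NUM_PASSAGES, 768, 4):.1f} GB")      # ~64.5 GB
print(f" 32-d float32: {index_gb(NUM_PASSAGES, 32, 4):.1f} GB")       # ~2.7 GB
print(f"768-d binary : {index_gb(NUM_PASSAGES, 768, 1 / 8):.1f} GB")  # ~2.0 GB
```

Dimension reduction and binarization thus each shrink the index by more than an order of magnitude before any further compression such as product quantization is applied.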
Generalization of IR. Two main benchmarks have been proposed to study the OOD generalization of retrievers: BEIR (Thakur et al., 2021b) and EntityQuestion (Sciavolino et al., 2021). As shown by previous work (Thakur et al., 2021b; Chen et al., 2022), generalization is a major concern for DR. To address this limitation, Wang et al. (2021) proposed GPL, a domain adaptation technique that generates synthetic question-answer pairs in specific domains. A follow-up work (Thakur et al., 2022) trains BPR and JPQ on the GPL synthetic data to achieve both efficiency and generalization. Chen et al. (2022) investigate a hybrid model in the OOD setting, but unlike us, they use a heavy DR and do not consider the indexing memory. Most existing work studies OOD generalization, and much less attention has been paid to the robustness of retrievers (Penha et al., 2022; Zhuang and Zuccon, 2022; Chen et al.). To study robustness, Penha et al. (2022) identify four ways to change the syntax of queries but not their semantics. Our work is complementary to Penha et al. (2022): we leverage adversarial attack techniques (Morris et al., 2020) to create six different test sets for the NQ dataset (Kwiatkowski et al., 2019).