Speeding Up Question Answering Task of
Language Models via Inverted Index
Xiang Ji, Yesim Sungu-Eryilmaz, Elaheh Momeni, and Reza Rawassizadeh
jjixiang@bu.edu, yesims@bu.edu, momeni.elaheh@gmail.com, rezar@bu.edu
1 Boston University, Boston MA 02215, USA
2 eMentalist, Vienna, Austria
Abstract. Natural language processing applications, such as conversational agents and their question-answering capabilities, are widely used in the real world. Despite the wide popularity of large language models (LLMs), few real-world conversational agents take advantage of them, because the extensive resources LLMs consume prevent developers from integrating them into end-user applications. In this study, we combine an inverted indexing mechanism with LLMs to improve the efficiency of question-answering models for closed-domain questions. Our experiments show that using the index improves the average response time by 97.44%. In addition, due to the reduced search scope, the average BLEU score improves by 0.23 when the inverted index is used.
Keywords: Inverted Index · Question Answering · Large Language Model.
1 Introduction and background
Advances in large language models (LLMs), especially those employing the transformer architecture [5], have revolutionized the quality of natural language processing applications. However, training an LLM with a state-of-the-art architecture, including several layers of encoders/decoders and attention, is computationally very expensive and out of reach for small and medium enterprises with limited budgets. There are promising approaches to reducing LLM model sizes, such as compression [9,19], quantization [20], and knowledge distillation [11], but larger models are still favored for their accuracy.
This issue has led to the introduction of services such as Hugging Face [7] and spaCy [10], which share trained (pre-trained and fine-tuned) LLMs. These services let developers benefit from trained LLMs without dealing with the expensive model training process. However, executing a query on a trained LLM still takes a significant amount of time: as the number of words in the input text grows, the response time of these models grows rapidly, since the cost of self-attention scales quadratically with sequence length. For example, if an LLM is used to answer questions over an entire book, it takes an extremely long time to produce an answer, which is impractical in real-world applications. Response time correlates directly with the usability of an application [18,15]; therefore, slow response times hinder the adoption of large language models in end-user applications.
arXiv:2210.13578v1 [cs.CL] 24 Oct 2022
In this work, we employ five popular transformer models for question answering (Q&A), i.e., BERT-base [5], BERT-large [5], DistilBERT [17], RoBERTa [12], and Tiny-RoBERTa [12]. We then develop an inverted index layer [21] that reduces execution time while maintaining or improving the accuracy of the Q&A models. When extracting answers from a large body of text (such as a book), our results show that our approach significantly improves both response time and accuracy over the baseline models (which do not use an inverted index).
2 Method
Our proposed architecture (see Figure 1) includes two phases and five steps.
Phase 1 includes two steps. Step 1 analyzes the large text corpus and extracts keywords from each paragraph. For example, the keyword "bench press" was extracted from several paragraphs of a fitness book, and "SVM" was extracted from several paragraphs of a machine learning book.
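Step 1 can be sketched as follows. This is a minimal illustration using a simple frequency-based extractor as a stand-in for the Rake and keyBERT extractors the paper actually uses; the function name and the stopword list are illustrative, not part of the original method.

```python
import re
from collections import Counter

# A small illustrative stopword list; a real extractor would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for",
             "on", "with", "that", "as", "are", "was", "from"}

def extract_keywords(paragraph: str, top_k: int = 2) -> list:
    """Return the top_k most frequent non-stopword terms of a paragraph.
    A frequency-based stand-in for Rake/keyBERT keyword extraction."""
    tokens = re.findall(r"[a-z]+", paragraph.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_k)]
```

For a fitness-book paragraph repeatedly mentioning the bench press, this extractor would surface "bench" and "press" as the paragraph's keywords, mirroring the "bench press" example above.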
Fig. 1. Our proposed architecture. First, it identifies the keywords in the user's question and then searches the relevant paragraphs based on the extracted keywords. Dotted lines denote Phase 1 and solid lines denote Phase 2.
Step 2 builds an inverted index over the extracted keywords at the paragraph level for the given book. Phase 1 (Steps 1 and 2) is performed offline and only once per book. To build the inverted index, we do not use an existing library such as Lucene [2] or Solr [3]. Instead, we perform the indexing locally so that we can control the accuracy of the extracted keywords, which is not possible with off-the-shelf inverted indexing tools.
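A paragraph-level inverted index of this kind can be sketched as a plain mapping from keyword to posting list. The representation below (triples of page, paragraph id, and text) is an assumption for illustration; the paper does not specify its internal data layout.

```python
from collections import defaultdict

def build_inverted_index(book, extract_keywords):
    """Step 2 (sketch): build a paragraph-level inverted index.

    book: iterable of (page, paragraph_id, text) triples.
    extract_keywords: any per-paragraph keyword extractor (Step 1).
    Returns a dict mapping keyword -> list of (page, paragraph_id) postings.
    """
    index = defaultdict(list)
    for page, para_id, text in book:
        for kw in extract_keywords(text):
            index[kw].append((page, para_id))
    return dict(index)
```

Because this runs once per book, even a naive single pass over all paragraphs is acceptable; the payoff comes at query time, when lookups are dictionary accesses rather than full-text scans.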
In Phase 2, Step 3 parses the user's question and extracts its keywords. Step 4 matches the keywords from the question against the keywords in the inverted index. If there is a match, Step 5 finds the corresponding page and paragraph via the index and feeds that text into the LLM module (e.g., BERT) to find the answer. Instead of searching the entire text corpus, the model only searches the locations specified by the inverted index.
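Steps 3 to 5 can be sketched as below. The retrieval logic is a faithful outline of the described phase; the `qa_pipeline` call in the final comment is a hypothetical placeholder for the transformer Q&A module (e.g., a Hugging Face question-answering pipeline), not code from the paper.

```python
def retrieve_candidates(question, index, paragraphs, extract_keywords):
    """Steps 3-4 (sketch): match question keywords against the inverted
    index and return only the candidate paragraph texts, instead of the
    whole corpus.

    index: keyword -> list of (page, paragraph_id) postings (Step 2).
    paragraphs: dict mapping (page, paragraph_id) -> paragraph text.
    """
    locations = []
    for kw in extract_keywords(question):          # Step 3
        locations.extend(index.get(kw, []))        # Step 4
    # Deduplicate postings while preserving order, then fetch the texts.
    seen, candidates = set(), []
    for loc in locations:
        if loc not in seen:
            seen.add(loc)
            candidates.append(paragraphs[loc])
    return candidates

# Step 5 (sketch): feed only the candidates to the Q&A model, e.g.
#   answer = qa_pipeline(question=question, context=" ".join(candidates))
# where qa_pipeline is a question-answering model such as BERT.
```

The speedup comes from this narrowing: the LLM reads a handful of matched paragraphs rather than the whole book.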
To implement Step 1 of Phase 1, we experiment with two keyword extraction algorithms (Rake [16] and keyBERT [8]) to extract one or two keywords for each paragraph.