Retrieval-Augmented and Knowledge-Grounded Language
Models for Faithful Clinical Medicine
Fenglin Liu1∗, Bang Yang2∗, Chenyu You3, Xian Wu4, Shen Ge4, Zhangdaihong Liu1,5,
Xu Sun6†, Yang Yang7†, David A. Clifton1,5
1Department of Engineering Science, University of Oxford  2School of ECE, Peking University
3Department of Electrical Engineering, Yale University  4Tencent JARVIS Lab, China
5Oxford-Suzhou Centre for Advanced Research, China
6MOE Key Lab of Computational Linguistics, School of Computer Science, Peking University
7School of Public Health, Shanghai Jiao Tong University School of Medicine, China
{fenglin.liu, david.clifton}@eng.ox.ac.uk, {bangyang, xusun}@pku.edu.cn
chenyu.you@yale.edu, jessie.liu@oxford-oscar.cn
{kevinxwu, shenge}@tencent.com, emma002@sjtu.edu.cn
Abstract
Language models (LMs), including large language models (such as ChatGPT), have the potential to assist clinicians in generating various clinical notes. However, LMs are prone to producing "hallucinations", i.e., generated content that is not aligned with facts and knowledge. In this paper, we propose the Re3Writer method, which uses retrieval-augmented generation and knowledge-grounded reasoning to enable LMs to generate faithful clinical texts. We demonstrate the effectiveness of our method in generating patient discharge instructions. This task requires the LMs not only to understand the patients' long clinical documents, i.e., the health records during hospitalization, but also to generate critical instructional information provided to both carers and the patient at the time of discharge. The proposed Re3Writer imitates the working patterns of physicians: it first retrieves related working experience from historical instructions written by physicians, then reasons over related medical knowledge. Finally, it refines the retrieved working experience and reasoned medical knowledge to extract useful information, which is used to generate discharge instructions for previously-unseen patients. Our experiments show that, using our method, the performance of five representative LMs can be substantially boosted across all metrics. We also report human evaluations measuring effectiveness in terms of fluency, faithfulness, and comprehensiveness.³
1 Introduction
At the time of discharge to home, the Patient Instruction (PI), which is a rich paragraph of text containing multiple instructions, is provided by the attending clinician to the patient or guardian. The PI is used for the purpose of facilitating safe and appropriate continuity of care [7, 24, 28]. As a result, the PI has significant implications for patient management and good medical care, lowering the risk of hospital readmission and improving the doctor-patient relationship. As shown in Figure 1, a PI typically contains the following three main components from the patient's perspective [13, 31]:
∗Equal contribution. †Corresponding authors.
³The code is available at https://github.com/AI-in-Health/Patient-Instructions
arXiv:2210.12777v4 [cs.CL] 21 Jul 2024
Figure 1: Two examples of Patient Instructions written by physicians, which guide patients in how to manage their conditions after discharge based on their health records during hospitalization. The input is the patient's health records (e.g., admission notes, nursing notes, radiology notes, physician notes, and medications); the output is the Patient Instruction.
Patient 1: Please shower daily including washing incisions gently with mild soap, no baths or swimming, [...] please no lotions, cream, powder, or ointments to incisions [...] females: please wear bra to reduce pulling on incision, avoid rubbing on lower edge.
Patient 2: You were admitted for bleeding from an ulcer in your stomach. This ulcer is at least partially caused by naproxen. You should stop taking naproxen and take only tylenol for pain. [...] you are scheduled to get a repeat endoscopy next week. Prior to the procedure do not have anything to drink or eat after midnight.
(1) What is my main health condition? (i.e., why was I in the hospital?) (2) What do I need to do? (i.e., how do I manage at home, and how should I best care for myself?) (3) Why is it important for me to do this? Of these, the second component is often considered the most important. For example, when a patient has had surgery while in the hospital, the PI might tell the patient to keep the wound away from water to avoid infection.
Currently, the following skills are needed for physicians to write a PI [24, 28, 7]: (1) thorough medical knowledge for interpreting the patient's long clinical records, including diagnosis, medication, and procedure records; (2) skills for carefully analyzing the patient's extensive and complex data acquired during hospitalization, e.g., admission notes, nursing notes, radiology notes, physician notes, medications, and laboratory results; (3) the ability to extract key information and write instructions appropriate for the lay reader. Writing PIs is therefore a necessary but time-consuming task for physicians, exacerbating the workload of clinicians who would otherwise focus on patient care [34, 26]. In addition, physicians need to read many patients' long health records in their daily work, leaving substantial room for incompleteness or inappropriate wording [30, 35]. Even in countries with abundant healthcare resources, such as the United States, up to 54% of physicians experienced some sign of burnout in a one-year study [27], a problem that is further exacerbated in countries with more tightly constrained healthcare resources.
The overloading of physicians is a well-documented phenomenon [30, 35], and AI-related support systems that can partly automate routine tasks, such as the generation of PIs for modification and approval by clinicians, would be an important contribution to healthcare practice. To this end, we propose the novel task of automatic PI generation, which aims to generate an accurate and fluent textual PI given the health records of a previously-unseen patient during hospitalization. In this way, physicians, given the health records of a new patient, need only review and modify the generated PI rather than write a new PI from scratch, significantly relieving them of a heavy workload and increasing the time and energy they can spend in meaningful clinical interactions with patients. Such endeavors would be particularly useful in resource-limited countries [32].
In this paper, we build a PI dataset and propose a deep-learning approach named Re3Writer, which imitates physicians' working patterns to automatically generate a PI at the point of discharge from the hospital. Specifically, when a patient is discharged from the hospital, physicians carefully analyze the patient's health records in terms of diagnosis, medication, and procedure, and then write a corresponding PI based on their working experience and medical knowledge [6, 7]. To model this text-production process, Re3Writer introduces three components, Retrieve, Reason, and Refine: (1) it first encodes working experience by mining historical PIs, i.e., retrieving the instructions of previous patients according to the similarity of diagnoses, medications, and procedures; (2) it then reasons medical knowledge relevant to the current input patient data by learning a knowledge graph, which is constructed to model the domain-specific knowledge structure; (3) finally, it refines relevant information from the retrieved working experience and reasoned medical knowledge to generate the final patient discharge instruction.
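As a rough illustration of the Retrieve step just described, the sketch below ranks historical PIs by the overlap of their diagnosis, medication, and procedure codes with those of a new patient. The set-of-codes representation, the Jaccard similarity, and every name in the snippet are our own assumptions for exposition, not the paper's actual retrieval module.

```python
# Illustrative sketch only: retrieving historical Patient Instructions by the
# overlap of diagnosis/medication/procedure codes. The Jaccard similarity and
# all names below are assumptions, not the authors' exact implementation.
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity between two sets of clinical codes."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar_instructions(
    query_codes: Set[str],       # codes of the new patient
    corpus: List[Dict],          # [{"codes": set, "instruction": str}, ...]
    top_k: int = 3,
) -> List[str]:
    """Return the instructions of the top-k most similar historical patients."""
    ranked = sorted(corpus, key=lambda p: jaccard(query_codes, p["codes"]), reverse=True)
    return [p["instruction"] for p in ranked[:top_k]]


# Toy usage with made-up codes and instructions.
corpus = [
    {"codes": {"diag:K25", "med:naproxen"}, "instruction": "Stop taking naproxen ..."},
    {"codes": {"proc:CABG", "med:aspirin"}, "instruction": "Shower daily, wash incisions gently ..."},
]
print(retrieve_similar_instructions({"diag:K25", "med:omeprazole"}, corpus, top_k=1))
```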
To demonstrate the effectiveness of our Re3Writer, we incorporate it into five different language models (LMs) [36]: 1) a recurrent neural network (RNN)-based LM, 2) an attention-based LM, 3) a hierarchical RNN-based LM, 4) a copy-mechanism-based LM, and 5) a fully-attentive LM, i.e., the Transformer [33]. Extensive experiments show that our approach can substantially boost the performance of these baselines across all metrics.
Overall, the main contributions of this paper are:
• We make the first attempt to automatically generate the Patient Instruction (PI), which can reduce the workload of physicians. As a result, it can increase the time and energy they spend in meaningful interactions with patients, providing high-quality care and improving doctor-patient relationships.
• To address the task, we build a PI dataset and propose the Re3Writer approach, which imitates physicians' working patterns to retrieve working experience and reason medical knowledge, and finally refines them to generate accurate and faithful patient instructions.
• We demonstrate the effectiveness and generalization capabilities of our approach on the built PI dataset. After incorporating our approach, the performance of the baseline models improves significantly on all metrics, with up to 20%, 11%, and 19% relative improvements in BLEU-4, ROUGE-L, and METEOR, respectively. Moreover, we conduct human evaluations of the generated PIs to assess their quality and usefulness in clinical practice.
2 Approach
We first define the PI generation problem; then, we describe the proposed Re3Writer in detail.
2.1 PI Generation Problem Definition
When a patient is discharged from the hospital, the PI generation system should generate a fluent and faithful instruction to help the patient or carer to manage their conditions at home. Therefore, the goal of the PI generation task is to generate a target instruction $I = \{y_1, y_2, \ldots, y_{N_I}\}$ given the patient's health records $R = \{r_1, r_2, \ldots, r_{N_R}\}$ in terms of the diagnoses, medications, and procedures performed during hospitalization.
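To make this formulation concrete, the toy pair below shows one way a record sequence R and a target instruction I might look; the bracketed section markers and the flattening of different note types into a single token sequence are illustrative assumptions on our part, not the dataset's actual format.

```python
# Toy training pair for PI generation (all contents invented for illustration).
# R: the patient's health records flattened into one token sequence; I: the target PI.
record_tokens = (
    "[ADMISSION] admitted with upper GI bleeding ... "
    "[RADIOLOGY] no acute findings ... "
    "[MEDICATIONS] naproxen held , omeprazole started ..."
).split()                                   # r_1, ..., r_{N_R}

instruction_tokens = (
    "You were admitted for bleeding from an ulcer in your stomach . "
    "You should stop taking naproxen ..."
).split()                                   # y_1, ..., y_{N_I}
```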
Since the input $R$, containing $N_R$ words, and the output $I$, containing $N_I$ words, are both textual sequences, we adopt the encoder-decoder framework, which is widely used in natural language generation tasks, to perform the PI generation task. In particular, the encoder-decoder framework consists of a health record encoder and a PI decoder, which can be formulated as:
$$\text{Record Encoder}: R \to \mathbf{R}; \quad \text{PI Decoder}: \mathbf{R} \to I, \tag{1}$$
where $\mathbf{R} \in \mathbb{R}^{N_R \times d}$ denotes the record embeddings encoded by the record encoder, e.g., an LSTM [11] or Transformer [33]. Then, $\mathbf{R}$ is fed into the PI decoder (which again could be an LSTM or Transformer) to generate the target PI $I$. During training, given the ground-truth PI for the input patient's health records, we train the model by minimizing the widely-used cross-entropy loss.
2.2 The Proposed Re3Writer
Our Re3Writer consists of three core components: Retrieve, Reason, and Refine.
Formulation of the Re3Writer As stated above, given the health records $R$ encoded as $\mathbf{R} \in \mathbb{R}^{N_R \times d}$, we aim to generate a desirable PI $I$. Figure 2 shows the details of our method, which is designed to retrieve related working experience $W^{Pr}$ and reason related medical knowledge $G^{Pr}$ from the training corpus for the current input patient. Finally, Re3Writer refines the retrieved working experience $W^{Pr}$ and reasoned medical knowledge $G^{Pr}$ to extract useful information for generating a proper PI:
$$\text{Record Encoder}: R \to \mathbf{R}; \quad \text{Retrieve}: \mathbf{R} \to W^{Pr}; \quad \text{Reason}: \mathbf{R} \to G^{Pr};$$
$$\text{Refine + PI Decoder}: \{\mathbf{R}, W^{Pr}, G^{Pr}\} \to I. \tag{2}$$
Our method can be incorporated into existing encoder-decoder based models to boost their performance on PI generation, as we will later demonstrate. We now describe how to retrieve related working experience and reason related medical knowledge from the training corpus for PI generation.
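The snippet below is a schematic, not the paper's implementation: it shows one way the mappings in Eq. (2) could be wired together, with simple linear layers standing in for the Retrieve and Reason modules and cross-attention standing in for Refine. All module choices, shapes, and names are our assumptions; the actual components are detailed in the subsections that follow.

```python
# Schematic wiring of Eq. (2): Retrieve, Reason, then Refine + PI decoding.
# The placeholder sub-modules, shapes, and fusion scheme are illustrative only.
import torch
import torch.nn as nn

VOCAB, D = 10000, 512

class Re3WriterSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        self.record_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D, nhead=8, batch_first=True), num_layers=2)
        # Stand-ins for Retrieve (-> W^Pr) and Reason (-> G^Pr); the real modules
        # query historical PIs and a medical knowledge graph, respectively.
        self.retrieve = nn.Linear(D, D)
        self.reason = nn.Linear(D, D)
        # Refine: let the record features attend over experience and knowledge.
        self.refine = nn.MultiheadAttention(D, num_heads=8, batch_first=True)
        self.pi_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D, nhead=8, batch_first=True), num_layers=2)
        self.out = nn.Linear(D, VOCAB)

    def forward(self, records, instr_in):
        R = self.record_encoder(self.embed(records))          # Record Encoder: R -> R
        W_pr = self.retrieve(R)                               # Retrieve: R -> W^Pr
        G_pr = self.reason(R)                                 # Reason:   R -> G^Pr
        memory = torch.cat([R, W_pr, G_pr], dim=1)            # {R, W^Pr, G^Pr}
        refined, _ = self.refine(R, memory, memory)           # Refine
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(instr_in.size(1))
        h = self.pi_decoder(self.embed(instr_in), refined, tgt_mask=tgt_mask)
        return self.out(h)                                    # PI Decoder -> logits over I

logits = Re3WriterSketch()(torch.randint(1, VOCAB, (2, 128)),
                           torch.randint(1, VOCAB, (2, 64)))
```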