Retrieval-Augmented and Knowledge-Grounded Language
Models for Faithful Clinical Medicine
Fenglin Liu1∗, Bang Yang2∗, Chenyu You3, Xian Wu4, Shen Ge4, Zhangdaihong Liu1,5,
Xu Sun6†, Yang Yang7†, David A. Clifton1,5
1Department of Engineering Science, University of Oxford  2School of ECE, Peking University
3Department of Electrical Engineering, Yale University  4Tencent JARVIS Lab, China
5Oxford-Suzhou Centre for Advanced Research, China
6MOE Key Lab of Computational Linguistics, School of Computer Science, Peking University
7School of Public Health, Shanghai Jiao Tong University School of Medicine, China
{fenglin.liu, david.clifton}@eng.ox.ac.uk, {bangyang, xusun}@pku.edu.cn
chenyu.you@yale.edu, jessie.liu@oxford-oscar.cn
{kevinxwu, shenge}@tencent.com, emma002@sjtu.edu.cn
Abstract
Language models (LMs), including large language models (such as ChatGPT), have the potential to assist clinicians in generating various clinical notes. However, LMs are prone to producing "hallucinations", i.e., generated content that is not aligned with facts and knowledge. In this paper, we propose the Re3Writer method, which uses retrieval-augmented generation and knowledge-grounded reasoning to enable LMs to generate faithful clinical texts. We demonstrate the effectiveness of our method in generating patient discharge instructions. This task requires the LMs not only to understand the patients' long clinical documents, i.e., the health records during hospitalization, but also to generate critical instructional information provided to both carers and the patient at the time of discharge. The proposed Re3Writer imitates the working patterns of physicians: it first retrieves related working experience from historical instructions written by physicians, then reasons over related medical knowledge. Finally, it refines the retrieved working experience and reasoned medical knowledge to extract useful information, which is used to generate discharge instructions for previously-unseen patients. Our experiments show that, using our method, the performance of five representative LMs can be substantially boosted across all metrics. We also report human evaluations measuring effectiveness in terms of fluency, faithfulness, and comprehensiveness.³
1 Introduction
At the time of discharge to home, the Patient Instruction (PI), which is a rich paragraph of text containing multiple instructions, is provided by the attending clinician to the patient or guardian. The PI is used for the purpose of facilitating safe and appropriate continuity of care [7, 24, 28]. As a result, the PI has significant implications for patient management and good medical care, lowering the risk of hospital readmission and improving the doctor-patient relationship. As shown in Figure 1, a PI typically contains the following three main components from the patient's perspective [13, 31]:
∗Equal contribution. †Corresponding authors.
³The code is available at https://github.com/AI-in-Health/Patient-Instructions
arXiv:2210.12777v4 [cs.CL] 21 Jul 2024
Figure 1: Two examples of Patient Instructions written by physicians, which guide patients in how to manage their conditions after discharge based on their health records during hospitalization. The input is the patient's health records (e.g., admission notes, nursing notes, radiology notes, physician notes, and medications); the output is the Patient Instruction.
Patient 1: Please shower daily including washing incisions gently with mild soap, no baths or swimming, [...] please no lotions, cream, powder, or ointments to incisions [...] females: please wear bra to reduce pulling on incision, avoid rubbing on lower edge.
Patient 2: You were admitted for bleeding from an ulcer in your stomach. This ulcer is at least partially caused by naproxen. You should stop taking naproxen and take only tylenol for pain. [...] you are scheduled to get a repeat endoscopy next week. Prior to the procedure do not have anything to drink or eat after midnight.
(1) What is my main health condition? (i.e., why was I in the hospital?) (2) What do I need to do? (i.e., how do I manage at home, and how should I best care for myself?) (3) Why is it important for me to do this? Of these, the second component is often considered the most important. For example, when a patient has had surgery while in the hospital, the PI might tell the patient to keep the wound away from water to avoid infection.
Currently, the following skills are needed for physicians to write a PI [24, 28, 7]: (1) thorough medical knowledge for interpreting the patient's long clinical records, including diagnosis, medication, and procedure records; (2) skills for carefully analyzing the patient's extensive and complex data acquired during hospitalization, e.g., admission notes, nursing notes, radiology notes, physician notes, medications, and laboratory results; (3) the ability to extract key information and write instructions appropriate for the lay reader. Writing PIs is therefore a necessary but time-consuming task for physicians, exacerbating the workload of clinicians who would otherwise focus on patient care [34, 26]. In addition, physicians need to read many patients' long health records in their daily work, leaving substantial room for incompleteness or inappropriate wording [30, 35]. Even in countries with abundant healthcare resources, such as the United States, up to 54% of physicians experienced some sign of burnout in a one-year study [27], a problem that is further exacerbated in countries with more tightly constrained healthcare resources.
The overloading of physicians is a well-documented phenomenon [30, 35], and AI-related support systems that can partly automate routine tasks, such as the generation of PIs for modification and approval by clinicians, would be an important contribution to healthcare practice. To this end, we propose the novel task of automatic PI generation, which aims to generate an accurate and fluent textual PI given the health records of a previously-unseen patient during hospitalization. In this way, physicians, given the health records of a new patient, need only review and modify the generated PI rather than write a new PI from scratch, significantly relieving them of a heavy workload and increasing the time and energy they can spend in meaningful clinical interactions with patients. Such endeavors would be particularly useful in resource-limited countries [32].
In this paper, we build a PI dataset and propose a deep-learning approach named Re3Writer, which imitates physicians' working patterns to automatically generate a PI at the point of discharge from the hospital. Specifically, when a patient is discharged from the hospital, physicians carefully analyze the patient's health records in terms of diagnosis, medication, and procedure, and then write a corresponding PI based on their working experience and medical knowledge [6, 7]. To model this text-production process, Re3Writer introduces three components, Retrieve, Reason, and Refine: (1) it first encodes working experience by mining historical PIs, i.e., retrieving the instructions of previous patients according to the similarity of diagnoses, medications, and procedures; (2) it then reasons medical knowledge relevant to the current input patient data by learning a knowledge graph, which is constructed to model the domain-specific knowledge structure; (3) finally, it refines relevant information from the retrieved working experience and reasoned medical knowledge to generate the final patient discharge instruction.
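As a rough illustration of the Retrieve step just described, the sketch below ranks historical PIs by the overlap of their diagnosis, medication, and procedure codes with those of a new patient. The set-of-codes representation, the Jaccard similarity, and every name in the snippet are our own assumptions for exposition, not the paper's actual retrieval module.

```python
# Illustrative sketch only: retrieving historical Patient Instructions by the
# overlap of diagnosis/medication/procedure codes. The Jaccard similarity and
# all names below are assumptions, not the authors' exact implementation.
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity between two sets of clinical codes."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar_instructions(
    query_codes: Set[str],       # codes of the new patient
    corpus: List[Dict],          # [{"codes": set, "instruction": str}, ...]
    top_k: int = 3,
) -> List[str]:
    """Return the instructions of the top-k most similar historical patients."""
    ranked = sorted(corpus, key=lambda p: jaccard(query_codes, p["codes"]), reverse=True)
    return [p["instruction"] for p in ranked[:top_k]]


# Toy usage with made-up codes and instructions.
corpus = [
    {"codes": {"diag:K25", "med:naproxen"}, "instruction": "Stop taking naproxen ..."},
    {"codes": {"proc:CABG", "med:aspirin"}, "instruction": "Shower daily, wash incisions gently ..."},
]
print(retrieve_similar_instructions({"diag:K25", "med:omeprazole"}, corpus, top_k=1))
```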
To demonstrate the effectiveness of our Re3Writer, we incorporate it into five different language models (LMs) [36]: 1) a recurrent neural network (RNN)-based LM, 2) an attention-based LM, 3) a hierarchical RNN-based LM, 4) a copy-mechanism-based LM, and 5) a fully-attentive LM, i.e., the Transformer [33]. Extensive experiments show that our approach can substantially boost the performance of these baselines across all metrics.
Overall, the main contributions of this paper are:
• We make the first attempt to automatically generate the Patient Instruction (PI), which can reduce the workload of physicians. As a result, it can increase the time and energy they spend in meaningful interactions with patients, providing high-quality care and improving doctor-patient relationships.
• To address the task, we build a PI dataset and propose the Re3Writer approach, which imitates physicians' working patterns to retrieve working experience and reason medical knowledge, and finally refines them to generate accurate and faithful patient instructions.
• We demonstrate the effectiveness and generalization capabilities of our approach on the built PI dataset. After incorporating our approach, the performance of the baseline models improves significantly on all metrics, with up to 20%, 11%, and 19% relative improvements in BLEU-4, ROUGE-L, and METEOR, respectively. Moreover, we conduct human evaluations of the generated PIs to assess their quality and usefulness in clinical practice.
2 Approach
We first define the PI generation problem; then, we describe the proposed Re3Writer in detail.
2.1 PI Generation Problem Definition
When a patient is discharged from the hospital, the PI generation system should generate a fluent and faithful instruction to help the patient or carer to manage their conditions at home. Therefore, the goal of the PI generation task is to generate a target instruction $I = \{y_1, y_2, \ldots, y_{N_I}\}$ given the patient's health records $R = \{r_1, r_2, \ldots, r_{N_R}\}$ in terms of the diagnoses, medications, and procedures performed during hospitalization.
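To make this formulation concrete, the toy pair below shows one way a record sequence R and a target instruction I might look; the bracketed section markers and the flattening of different note types into a single token sequence are illustrative assumptions on our part, not the dataset's actual format.

```python
# Toy training pair for PI generation (all contents invented for illustration).
# R: the patient's health records flattened into one token sequence; I: the target PI.
record_tokens = (
    "[ADMISSION] admitted with upper GI bleeding ... "
    "[RADIOLOGY] no acute findings ... "
    "[MEDICATIONS] naproxen held , omeprazole started ..."
).split()                                   # r_1, ..., r_{N_R}

instruction_tokens = (
    "You were admitted for bleeding from an ulcer in your stomach . "
    "You should stop taking naproxen ..."
).split()                                   # y_1, ..., y_{N_I}
```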
Since the input $R$, containing $N_R$ words, and the output $I$, containing $N_I$ words, are both textual sequences, we adopt the encoder-decoder framework, which is widely used in natural language generation tasks, to perform the PI generation task. In particular, the encoder-decoder framework consists of a health record encoder and a PI decoder, which can be formulated as:
$$\text{Record Encoder}: R \to \mathbf{R}; \quad \text{PI Decoder}: \mathbf{R} \to I, \tag{1}$$
where $\mathbf{R} \in \mathbb{R}^{N_R \times d}$ denotes the record embeddings encoded by the record encoder, e.g., an LSTM [11] or Transformer [33]. Then, $\mathbf{R}$ is fed into the PI decoder (which again could be an LSTM or Transformer) to generate the target PI $I$. During training, given the ground-truth PI for the input patient's health records, we train the model by minimizing the widely-used cross-entropy loss.
2.2 The Proposed Re3Writer
Our Re3Writer consists of three core components: Retrieve, Reason, and Refine.
Formulation of the Re3Writer As stated above, given the health records $R$ encoded as $\mathbf{R} \in \mathbb{R}^{N_R \times d}$, we aim to generate a desirable PI $I$. Figure 2 shows the details of our method, which is designed to retrieve related working experience $W^{Pr}$ and reason related medical knowledge $G^{Pr}$ from the training corpus for the current input patient. Finally, Re3Writer refines the retrieved working experience $W^{Pr}$ and reasoned medical knowledge $G^{Pr}$ to extract useful information for generating a proper PI:
$$\text{Record Encoder}: R \to \mathbf{R}; \quad \text{Retrieve}: \mathbf{R} \to W^{Pr}; \quad \text{Reason}: \mathbf{R} \to G^{Pr};$$
$$\text{Refine + PI Decoder}: \{\mathbf{R}, W^{Pr}, G^{Pr}\} \to I. \tag{2}$$
Our method can be incorporated into existing encoder-decoder based models to boost their performance on PI generation, as we will later demonstrate. We now describe how to retrieve related working experience and reason related medical knowledge from the training corpus for PI generation.
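The snippet below is a schematic, not the paper's implementation: it shows one way the mappings in Eq. (2) could be wired together, with simple linear layers standing in for the Retrieve and Reason modules and cross-attention standing in for Refine. All module choices, shapes, and names are our assumptions; the actual components are detailed in the subsections that follow.

```python
# Schematic wiring of Eq. (2): Retrieve, Reason, then Refine + PI decoding.
# The placeholder sub-modules, shapes, and fusion scheme are illustrative only.
import torch
import torch.nn as nn

VOCAB, D = 10000, 512

class Re3WriterSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        self.record_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(D, nhead=8, batch_first=True), num_layers=2)
        # Stand-ins for Retrieve (-> W^Pr) and Reason (-> G^Pr); the real modules
        # query historical PIs and a medical knowledge graph, respectively.
        self.retrieve = nn.Linear(D, D)
        self.reason = nn.Linear(D, D)
        # Refine: let the record features attend over experience and knowledge.
        self.refine = nn.MultiheadAttention(D, num_heads=8, batch_first=True)
        self.pi_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(D, nhead=8, batch_first=True), num_layers=2)
        self.out = nn.Linear(D, VOCAB)

    def forward(self, records, instr_in):
        R = self.record_encoder(self.embed(records))          # Record Encoder: R -> R
        W_pr = self.retrieve(R)                               # Retrieve: R -> W^Pr
        G_pr = self.reason(R)                                 # Reason:   R -> G^Pr
        memory = torch.cat([R, W_pr, G_pr], dim=1)            # {R, W^Pr, G^Pr}
        refined, _ = self.refine(R, memory, memory)           # Refine
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(instr_in.size(1))
        h = self.pi_decoder(self.embed(instr_in), refined, tgt_mask=tgt_mask)
        return self.out(h)                                    # PI Decoder -> logits over I

logits = Re3WriterSketch()(torch.randint(1, VOCAB, (2, 128)),
                           torch.randint(1, VOCAB, (2, 64)))
```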