
Preprint
Meta-learning approaches have been widely used in such problems by learning how to quickly adapt
the classifier to a new label unseen during training, given only a few support examples (Finn et al.,
2017; Snell et al., 2017). However, most prior meta-learning work focuses on vision or language-
related tasks. In new drug recommendation, applying existing meta-learning algorithms faces the
following challenges. (1) Complex relations among diseases and drugs: diseases and medicines
can have inherent and higher order relations. Deciding whether to prescribe a drug to a specific
patient depends on many factors, such as disease progression, comorbidities, ongoing treatments,
individual drug response, and drug side effects. General meta-learning algorithms do not explicitly
capture such dependencies. (2) Numerous false-negative patients: many drugs can treat the same
disease, but usually only one of them is prescribed. For any given drug, there exist many false-
negative patients who were eligible for the drug but did not receive it (e.g., due to drug availability,
doctor’s preference, or insurance coverage). The large number of false-negative supervision signals
can substantially confuse model learning, especially in the few-shot learning setting.
To address these challenges, we introduce EDGE, a drug-dependent multi-phenotype few-shot
learner that quickly adapts to recommendation for a new drug given limited support patients.
Specifically, since drugs within the same category often have similar treatment effects, EDGE utilizes the
drug ontology for drug representation learning to link new drugs with existing drugs. Further, EDGE
learns multi-phenotype patient representations to capture the complex patient health status from
different aspects such as chronic diseases, current symptoms, and ongoing treatments. Given a new
drug with a few support patients, EDGE makes recommendations by performing a drug-dependent
phenotype-level comparison between representations of query patients and corresponding support
prototypes. Lastly, to reduce the false-negative supervision signal, EDGE leverages the MEDI (Wei
et al., 2013) drug-disease knowledge base to guide the negative sampling process.
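The knowledge-guided negative sampling idea can be illustrated with a minimal sketch. It is not the EDGE implementation: the function name `sample_negatives`, the dictionary-based inputs, and the simple filtering rule (skip any patient whose record contains a disease the drug is known to treat) are assumptions made for illustration, and the `indications` mapping is only a stand-in for a drug-disease knowledge base such as MEDI.

```python
import random

def sample_negatives(records, drug, prescribed, indications, k, seed=0):
    """Sample up to k negative patients for `drug` (illustrative sketch only).

    records:     dict patient_id -> set of disease codes in the record
    prescribed:  dict patient_id -> set of drugs actually prescribed
    indications: dict drug -> set of disease codes the drug is known to treat
                 (hypothetical stand-in for a drug-disease knowledge base)
    """
    treatable = indications.get(drug, set())
    candidates = sorted(
        pid for pid, codes in records.items()
        if drug not in prescribed.get(pid, set())  # patient is not a positive
        and not (codes & treatable)                # patient is not a likely false negative
    )
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))
```

Under this rule, a patient who has an indicated disease but was not prescribed the drug is never used as a negative, which is one simple way to keep likely false negatives out of the training signal.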
The main contributions of this work include:
• To the best of our knowledge, this is the first work to formulate the task of new drug recommendation.
• We propose a meta-learning framework, EDGE, to solve this problem by modeling the complex
relations among diseases and drugs and mitigating the effect of numerous false-negative patients.
• We conduct extensive experiments on the public EHR data MIMIC-IV (Johnson et al., 2020)
and private industrial claims data. Results show that our approach achieves improvements of 5.6% in
ROC-AUC, 6.3% in Precision@100, and 5.5% in Recall@100 when providing recommended patient lists
for new drugs. We also include detailed analyses and ablation studies to show the effectiveness of
multi-phenotype patient representation, drug-dependent patient distance, and knowledge-guided
negative sampling.
2 PROBLEM FORMULATION AND PRELIMINARIES
Denote the set of all drugs as $\mathcal{M}$; the goal of drug recommendation is to prescribe drugs in $\mathcal{M}$
that are suitable for a patient with a record $v = [c_1, \dots, c_V]$, which consists of a list of diseases
(and procedures), and $V$ is the total number of diseases and procedures in the record $v$. Prior
works (Zhang et al., 2017; Shang et al., 2019b; Yang et al., 2021; Tan et al., 2022b) formulate drug
recommendation as a multi-label classification problem by generating a multi-hot output of size
$|\mathcal{M}|$. However, this formulation assumes that the drug label space $\mathcal{M}$ remains unchanged after
training and is not applicable when new drugs appear. Thus, we propose an alternative formulation
for the new drug recommendation as follows.
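To see why a fixed multi-hot output cannot accommodate a new drug, consider the following minimal sketch. The helper `multi_hot` and the toy drug names are hypothetical and are not taken from any of the cited works.

```python
def multi_hot(prescribed, drug_vocab):
    """Encode a set of prescribed drugs as a multi-hot vector over drug_vocab."""
    index = {drug: i for i, drug in enumerate(drug_vocab)}
    vec = [0] * len(drug_vocab)
    for drug in prescribed:
        # Raises KeyError for any drug that was not in the training vocabulary.
        vec[index[drug]] = 1
    return vec

train_vocab = ["drug_a", "drug_b", "drug_c"]    # label space fixed at training time
encoded = multi_hot({"drug_b"}, train_vocab)    # one slot per known drug
```

A drug outside `train_vocab` has no slot in the output vector, so a model trained on this encoding has no output unit with which to recommend it.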
Assume the entire drug set $\mathcal{M}$ is partitioned into a set of existing drugs $\mathcal{M}_{\mathrm{old}}$ and a set of new
drugs $\mathcal{M}_{\mathrm{new}}$, where $\mathcal{M}_{\mathrm{old}} \cap \mathcal{M}_{\mathrm{new}} = \emptyset$. Each existing drug $m_i \in \mathcal{M}_{\mathrm{old}}$ has sufficient patients
using the drug $m_i$ (e.g., from EHR data). Each new drug $m_t \in \mathcal{M}_{\mathrm{new}}$ is associated with a small
support set $\mathcal{S}_t = \{v_j\}_{j=1}^{N_s}$ consisting of patients using the drug $m_t$ (e.g., from clinical trials), and
an unlabeled query patient set $\mathcal{Q}_t = \{v_j\}_{j=1}^{N_q}$, where $N_s$ and $N_q$ are the numbers of patients in the
support and query sets, respectively. The goal of new drug recommendation is to train a model $f_\phi(\cdot)$
parameterized by $\phi$ on existing drugs $\mathcal{M}_{\mathrm{old}}$, such that it can adapt to a new drug $m_t \in \mathcal{M}_{\mathrm{new}}$ given
the small support set $\mathcal{S}_t$, and make correct recommendations on the query set $\mathcal{Q}_t$.
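The episodic setup described above can be sketched as follows. This is an illustration under stated assumptions, not the method proposed in this paper: the `Episode` container, the `make_episode` helper, and the Jaccard-overlap `score` function (a toy stand-in for the learned model) are all hypothetical. The labels are kept only for evaluation, since the query set is unlabeled at recommendation time.

```python
import random
from dataclasses import dataclass

@dataclass
class Episode:
    drug: str
    support: list  # support set: records (lists of codes) of patients using the drug
    query: list    # query set: records to be scored for the drug
    labels: list   # 1 if the query patient used the drug; for evaluation only

def make_episode(drug, users, non_users, n_support, seed=0):
    """Split a drug's users into a small support set and a held-out query set."""
    rng = random.Random(seed)
    users = list(users)
    rng.shuffle(users)
    support, held_out = users[:n_support], users[n_support:]
    query = held_out + list(non_users)
    labels = [1] * len(held_out) + [0] * len(non_users)
    return Episode(drug, support, query, labels)

def score(record, support):
    """Toy stand-in for the model: mean Jaccard overlap with the support records.

    Assumes a non-empty support set.
    """
    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a | b else 0.0
    return sum(jaccard(record, s) for s in support) / len(support)
```

Episodes of this shape could be simulated from existing drugs during training, so that the model is always evaluated on the same few-shot task it will face for a new drug.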
To reduce clutter, we use a unified notation for both diseases and procedures. Since we focus on record-
level prediction, “patient” and “record” are used interchangeably.