Containing the spread of a contagion on a tree Michela Meister meistercs.cornell.eduJon Kleinberg

2025-05-02 0 0 1.1MB 27 页 10玖币
侵权投诉
Containing the spread of a contagion on a tree
Michela Meister
meister@cs.cornell.edu
Jon Kleinberg
kleinberg@cornell.edu
October 25, 2022
Abstract
Contact tracing can be thought of as a race between two processes: an infection process and a tracing
process. In this paper, we study a simple model of infection spreading on a tree, and a tracer who
stabilizes one node at a time. We focus on the question, how should the tracer choose nodes to stabilize
so as to prevent the infection from spreading further? We study simple policies, which prioritize nodes
based on time, infectiousness, or probability of generating new contacts.
1 Introduction
Mathematical models have played an important role in epidemiology, providing tools and frameworks
complementing empirical and public health research. One key example are branching processes, which
lead to the development of the
R0
metric for measuring the spread of disease [18]. While there are many
mathematical models of the spread of disease, far fewer models exist for contact tracing. Here we present an
initial mathematical model of contact tracing which we use to explore algorithmic questions in designing
contact tracing interventions.
Contact tracing is the iterative process of identifying individuals (the contacts) exposed to an infected
case [3,24,28]. These contacts may then be tested for infection, treated, or quarantined, depending on the
nature of the disease, to limit the spread of further infections.
This can be thought of as a race between two processes: an infection process and a tracing process. The
goal of the tracing process is to identify infected cases faster than the disease spreads, so that eventually
no new infections occur, ie the infection is contained. Contact tracing is often implemented by teams of
human tracers, and as a result, the tracing process is limited by the number of human tracers available. A
key strategic decision is how to maximize the effectiveness of this limited tracing capacity [16,22,26]. To
simplify things, we model the tracing process as a single tracer who is given a list of contacts exposed to
infection. We focus on the question, given a list of contacts exposed to infection, which contact should the
tracer investigate or query next? In particular, how does the tracer’s policy for querying contacts affect the
probability that the infection is contained?
One of the challenges in studying these questions is the lack of simple models that manage to articulate
trade-offs between the infection and tracing processes. In the contact tracing literature, there are few models
which simultaneously capture the dynamics of an infection process and a resource-constrained tracing process.
Recent surveys on the contact tracing literature specifically note that “few models take the limited capacity of
the public health system into account” [22] or “consider...the practical constraints that resources for contact
tracing and follow-up control measures might not be available at full throttle” [16]. Meanwhile, the literature
on probabilistic models provides many epidemic models on trees, but these models do not consider the effect
of a tracing process. Thus it seems as if a model describing the interaction between an infection process and
a tracing process has been absent from these two fields.
Related work.
A few other papers analyze contact tracing under resource constraints, however in somewhat
different settings. In [19], Meister and Kleinberg develop a model of contact tracing in which the infection
and tracing processes operate in two disjoint phases. In the first phase the infection spreads throughout
the population; in the second phase the population is in “lockdown” and no new infections occur. Tracing
1
arXiv:2210.13247v1 [math.PR] 24 Oct 2022
proceeds in the second phase, and the tracer’s objective is to identify infected nodes efficiently so as to
maximize a total “benefit”. In contrast, this paper studies concurrent infection and tracing processes, where
the tracer’s objective is to contain the spread of the infection.
Armbruster and Brandeau also study contact tracing under resource constraints in [1,2]. In their model there
are fixed resources to allocate across the two interventions, contact tracing and surveillance testing. The
primary goal is to find the optimal allocation of resources so as to provide the best health outcomes for the
population. They evaluate a few simple policies for prioritizing contacts in [2], and choose the policy that
results in the lowest prevalence of infection for their main analysis in [1]. However, their main focus is on
determining the best allocation of resources across these two systems. In comparison, our work focuses on
analyzing and measuring the performance of different prioritization policies across a wide range of infection
parameters. Finally, in a somewhat different context, Ben-Eliezer, Mossel, and Sudan study a mathematical
model of information spreading on a network with errors in communication and investigate approaches for
error correction [4].
The model.
To address our questions, we need to be able to define trade-offs between the tracer’s policy
for querying individuals and factors such as an individual’s rate of meeting new contacts and the probability
that they transmit the infection to a contact. To do this, we develop a simple model of contact tracing on a
tree, which involves concurrent infection and tracing processes.
First we describe the infection process uninhibited by any tracing. Each individual is represented as a node
with a binary infection status. A node
v
is governed by two parameters: the probability
qv
that it meets a
new contact and the probability
pv
that it transmits the infection to a contact. These parameters are sampled
independently for each node, with
pvDp
and
qvDq
. We will discuss more about these distributions later
on, but the problem is still interesting even when both distributions take just a single fixed value. Initially
all nodes are uninfected. In round
t
= 0 a node
r
becomes infected with a probability drawn from
Dp
. In
each round
t >
0, each node
v
meets a new contact
u
with probability
qv
. If
v
is infected, it infects
u
with
probability
pv
. This process generates a tree, where
r
is the root, the nodes in the first layer are
r
’s contacts,
the nodes in the second layer are contacts of those contacts, and so on. If a node
v
joins the tree in round
t
,
its time-of-arrival is τv=t.
In order to define the tracing process, we assign each node a second binary status, indicating whether it is
active or stable. A node that is active probabilistically generates new contacts at each round, as defined by
the infection process. A node that is stable no longer generates new contacts and therefore cannot further
spread the infection. Initially, every node is active.
Contact tracing starts once the infection is already underway, at time
t
=
k
, when the tracer identifies a root
r
as an index case. From then on the tracer selects one node to query at each step. Note that, while the
tracer only queries one node at each step, we can change the rate of tracing relative to the infection process
by changing the contact probability
q
. Increasing
q
causes the infection to spread more quickly thereby
decreasing the relative rate of tracing, and decreasing
q
causes the infection to spread more slowly, thereby
increasing the relative rate of tracing. Querying a node reveals its infection status, and if a node is infected,
two events occur: (1) the node is stabilized and (2) the node’s children (ie contacts) are revealed. Thus
querying an infected node has two benefits: it prevents further infections and reveals individuals exposed to
infection.
1
At step
t
=
k
the only node the tracer may query is the root
r
. From then on the tracer may only
query a node if its parent is an infected node queried on an earlier step.
We now describe the concurrent infection and tracing processes. We say that an instance of the contact
tracing problem is defined by the three parameters
Dp
,
Dq
, and
k
. The process begins at step
t
= 0 when the
root
r
becomes infected with a probability drawn from
Dp
. The infection process runs uninhibited for steps
0
t < k
. During each step
tk
, first the tracer queries a node and then a single round of the infection
process runs. During the infection round only active nodes generate new contacts. The infection is contained
if the tracer stabilizes all infected nodes.
1If an individual is found to be uninfected, they do not need to be stabilized, since they cannot infect any contacts.
2
Policies for querying nodes.
We can think of the tracing process as maintaining a subtree where each
node in the subtree has an infected parent. The frontier is the set of all leaves in the subtree which have
not yet been queried. We assume that the tracer observes the triple (
pu, qu, τu
) for each node in the subtree
and that, for the purposes of querying, any two nodes in the frontier with the same triple of parameters
are indistinguishable. A policy is any rule that dictates which node from the frontier is queried next. Note
that if the frontier is empty, then all infected nodes have been stabilized, and therefore the infection is
contained.
We say that a policy is non-trivial if it only queries the children of infected nodes. (Since the children of
uninfected nodes are guaranteed to be uninfected, there is no reason to query them.) The remainder of the
paper considers only non-trivial policies.
Figure 1: This example illustrates concurrent infection and tracing processes. The infection process begins
at
t
= 0 and runs uninhibited for three steps. Tracing begins at
t
= 3. At the start of step
t
= 3,
a
is
the only node in the frontier. The tracer queries
a
, and since
a
is infected, its children
b
and
c
join the
frontier. Then another round of the infection process runs, in which every node that has not yet been queried
probabilistically generates a new contact. In this case,
c
generates child
e
. At
t
= 4, the tracer queries
c
,
which is infected, so
e
joins the frontier. Another round of the infection process runs, where
b
generates child
f
and
e
generates child
h
. At
t
= 6, the tracer queries
b
, the sole node in the frontier. Node
b
is uninfected,
so the frontier is empty, and thus the infection is contained.
1.1 Summary and overview of results
We analyze the effectiveness of different tracing policies, with a primary focus on the following question.
Question 1.
How does the tracer’s policy for querying nodes affect the probability that the infection is
contained?
We study this question via both theoretical analysis and computational experiments. To begin, we establish
basic theoretical bounds in section 2, which characterize the performance of any non-trivial policy under
certain conditions. In particular, we show that if either the infection probability
pv
or the contact probability
qv
is sufficiently small for all nodes, then any non-trivial policy contains the infection with high probability.
On the other hand, we show that if both contagion parameters are sufficiently large, then every policy fails
with high probability.
Thus the results in this first section focus on settings in which policy choice is inconsequential; either any non-
trivial policy is likely to contain the infection, or no non-trivial policy is likely to contain the infection. This
motivates the question of whether there exists an instance in which the choice of policy is significant.
Question 2.
Is there an instance and a pair of policies
A1
and
A2
so that the probability of containment
under A1is greater than the probability of containment under A2?
This question is the primary focus of the remainder of the paper. To begin, we start with the simplest setting
possible, where
Dp
and
Dq
are point mass distributions, that is, where all nodes have the same values of
p
3
and
q
. In such a setting, one might think that the probability of containment ought to be agnostic to the
policy chosen by the tracer. However, two nodes with the same infection and contact parameters
p
and
q
may still differ in their time-of-arrival τ.
There are two natural policies for ordering nodes by time-of-arrival. The ascending-time policy orders nodes
by ascending time-of-arrival, and the descending-time policy orders nodes by descending time-of-arrival.
In section 3we prove that for a specific choice of
p
and
q
, descending-time has a strictly higher probability of
containment than ascending-time. Given this result, one might wonder whether descending-time is the better
strategy in all instances. In section 4we compare the performance of ascending-time and descending-time for
a large range of contagion parameters via computational experiments and find numerous instances in which
we observe a significantly higher probability of containment for ascending-time. From these computational
experiments, we find that each of the two policies commands a region of the parameter space of non-trivial size
where it enjoys signficantly better performance, and we find that descending-time has the larger region.
This result still leaves open the question of how close these simple time-of-arrival policies are to an optimal
policy. We study this question in section 5by training an optimal policy via reinforcement learning, where we
formulate contact tracing as a game that the tracer “wins” if the infection is contained and “loses” otherwise.
We train our policy via q-learning and find that its probability of containment is close to that of our policies
based on time-of-arrival. This suggests that analyzing simple policies, such as prioritizing by time-of-arrival,
gives us at least some insight into the performance of an optimal policy.
Since parameters
p
and
q
are fixed in the above settings, the only parameter with which to prioritize nodes is
their time-of-arrival. In our final computational experiment, we explore settings in which each node
v
has
parameters
pv
and
qv
sampled from a distribution. Here we compare two other simple policies, prioritizing in
descending order of
pv
or prioritizing in descending order of
qv
, alongside the descending-time policy. We
evalutate the performance of these policies across a range of distributions for
pv
and
qv
and find that the
dominant policy is described by a phase diagram where the best policy in a specific setting depends on the
distribution generating pvand qv.
Paper organization.
We begin with a summary of related work. Section 2establishes basic theoretical
bounds. Section 3shows a specific instance where policy choice provably affects probability of contain-
ment. Section 4studies this question further, by comparing the performance of the ascending-time and
descending-time policies via computational experiments. Section 5formulates contact tracing as reinforcement
learning problem and conducts a heuristic search for other time-based polices to match the performance of
ascending-time and descending-time. Section 6compares the performance of other policies for prioritizing
nodes beyond time-based methods. Finally, section 7presents future work and open questions.
Further Related Work
Tian et al. study Tuberculosis contact tracing on a simulated network based on
the population of Saskatchewan, Canada, and compare different prioritization policies for tracing individuals,
with a particular focus on prioritizations based on patient demographics [27]. Prior work by Fraser et al. and
Klinkenberg studies when an outbreak of a disease may be contained by tracing and isolation interventions,
with a focus on HIV, smallpox, and influenza, among other diseases [7,14]. Hellewell et al. study this question
for COVID-19 specifically [8]. Kretzschmar et al. study the effect of time delays on contact tracing for
COVID-19 via computational simulations [15]. Kwok et al. review models of contact tracing and call for more
models to account for resource constraints in tracing [16]. Kaplan et al. model a tracing and vaccination
response to a bioterrorism attack in [12,13].
Muller et al. study contact tracing as a branching process [23]. Eames et al. study different contact tracing
strategies for a compartmental model of infection [6,9]. Eames and Keeling study the relationship between
the fraction of contacts which are traced and the rate at which the disease spreads in the context of sexually
transmitted diseases [5].
4
2 Basic Theoretical Bounds
Recall that a non-trivial policy is one which only queries the children of infected nodes. Our basic theoretical
bounds define conditions under which any non-trivial policy succeeds and under which any non-trivial policy
fails. We focus on the following question.
Question 3.
Fix a non-trivial policy
P
. Under what conditions, with high probability, does
P
contain the
infection? Under what conditions, with high probability, does Pfail to contain the infection?
Figure 2outlines our results in this section. First we establish that, if either the infection probability
pv
or
contact probability
qv
is sufficiently small for all nodes, then any non-trivial policy contains the infection with
high probability, which is shown in theorems 2.1 and 2.2. On the other hand, if both the infection probability
and contact probability are sufficiently large for all nodes, theorem 2.3 shows that for any fixed non-trivial
policy P, with high probablity Pdoes not contain the infection.
Figure 2: Our basic theoretical bounds describe parameter regimes in which the choice of policy is inconse-
quential. If the infection or contact probability is very low for all nodes, then any non-trivial policy contains
the infection with high probability, as shown in theorems 2.1 and 2.2. However, if the infection and contact
probabilities are both very high for all nodes, then containment is highly unlikely, regardless of the non-trivial
policy chosen, as shown in theorem 2.3.
2.1 Conditions under which containment is likely
We present two theorems, which together show that if either the infection probability or contact probability is
below a certain threshold for all nodes, then any non-trivial policy contains the infection with high probability.
For the following two theorems it will be helpful to analyze the tracing process through the lens of deferred
decisions. This analysis changes nothing about how the contact tracing process is defined, but simply makes
it easier for us to analyze. First we will generate a transcript
T0
0, T 0
1, . . .
of a tree with the given contagion
parameters growing uninhibited by any tracing. During the tracing process, we construct infection tree
T0, T1, . . .
by “replaying” the transcript. For example, when a new round of infection occurs at time
t
, we
refer to
T0
t
to determine the nodes to add to
Tt
and their infection statuses. The benefit of this framework is
that we can prove claims about the transcript
T0
0, T 0
1, . . .
, which is often much easier to analyze, and show
that these claims hold for the infection tree as well.
To start, we show that if the contact probability
qv
is sufficiently small for all nodes, then any non-trivial
policy contains the infection with high probability.
Theorem 2.1.
Fix a failure probability
δ
(0
,
1) and an arrival time
kN
. Suppose that each node
v
has infection probability
pv
1. There is a
q
(
δ, k
)
(0
,
1] such that, if each node
v
has contact probability
5
摘要:

ContainingthespreadofacontagiononatreeMichelaMeistermeister@cs.cornell.eduJonKleinbergkleinberg@cornell.eduOctober25,2022AbstractContacttracingcanbethoughtofasaracebetweentwoprocesses:aninfectionprocessandatracingprocess.Inthispaper,westudyasimplemodelofinfectionspreadingonatree,andatracerwhostabili...

收起<<
Containing the spread of a contagion on a tree Michela Meister meistercs.cornell.eduJon Kleinberg.pdf

共27页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!

相关推荐

分类:图书资源 价格:10玖币 属性:27 页 大小:1.1MB 格式:PDF 时间:2025-05-02

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 27
客服
关注