Seed bank Cannings graphs: How dormancy smoothes random
genetic drift
Adrián González Casanova¹, Lizbeth Peñaloza², and Arno Siri-Jégousse³
¹ Unidad Cuernavaca del Instituto de Matemáticas de la Universidad Nacional Autónoma de México, Av. Universidad s/n Periferica, 62210 Cuernavaca, Morelos, México. Email: adrian.gonzalez@im.unam.mx
² Instituto de Investigación de Matemáticas y Actuaría, Universidad del Mar, campus Huatulco. Carretera Federal No 200, Km 250, Santa María Huatulco 70989, Oaxaca, México. Email: lizbeth@huatulco.umar.mx
³ Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas, Universidad Nacional Autónoma de México, Circuito Escolar 3000, C.U., 04510 Coyoacán, CDMX, México. Email: arno@sigma.iimas.unam.mx
May 23, 2023
Abstract
In this article, we introduce a random (directed) graph model for the simultaneous forward and
backward description of a rather broad class of Cannings models with a seed bank mechanism. This
provides a simple tool to establish a sampling duality for finite population size, and to obtain a
pathwise embedding of the forward frequency process and the backward ancestral process. Further, it
allows the derivation of limit theorems that generalize celebrated results by Möhle to models with seed
banks, and that show how seed banks affect the genealogies. The explicit
graphical construction is a new tool to understand the subtle interplay of seed banks, reproduction
and genetic drift in population genetics.
Keywords: Seed bank, Moment duality, Weak convergence, Mixing time.
2020 Mathematics Subject Classification: 92D10, 60F05, 60G10, 60J05, 60J90, 92D25.
1 Introduction
Cannings models and their modifications, along with their multiple merger genealogies, are a major topic in
mathematical population genetics [29, 28, 31, 1, 14, 16, 32]. Also, in the last decade, the study of dormancy (also called
the seed bank effect) has received significant attention [22, 4, 5, 6]. One of the unifying themes in both modeling areas is
that they arise from extensions of the Wright-Fisher model, and that classical evolutionary forces such as genetic
drift and selection are affected in important ways. While the theory of Cannings models is now robust, the study
of models with dormancy is still work in progress. The main goal of this paper is to establish a framework in which
seed banks can be combined with Cannings models, and to generalize the known limiting results for models without
dormancy to Cannings models with seed bank.
An important tool in population genetics is the moment duality for Markov processes. This technique establishes
a mathematical relation between forward and backward in time processes. The celebrated duality between the
Wright-Fisher diffusion and the Kingman coalescent was gradually generalized to a wide class of neutral population
genetics models, including some finite size discrete populations such as Cannings-type models [17]. In this situation,
the duality leads to asymptotic results for both forward frequency and genealogical processes. In the context of
dormancy, duality was established for the seed bank diffusion, which arises as a limit in models with geometric
seed bank [5]. However, for discrete seed bank models, the duality relation is the first open gap that this article
aims to close.
There are two main models for dormancy phenomena.
arXiv:2210.05819v2 [math.PR] 19 May 2023
• Kaj et al. The model defined in [22] is based on the Wright-Fisher model with additional multi-generational
jumps of (bounded) size. This model has been extended to geometric jump sizes of bounded expected range
in [25] (which also provides some insight into the forward in time frequency diffusion), to the general finite
expectation case in [4], and even to unbounded (heavy-tailed) jump sizes in [3].
• Blath et al. A second modeling frame is given by an external seed bank in terms of a “second island” (in
the spirit of Wright’s island model), effectively leading to geometric jump sizes on the evolutionary scale.
Here, forward and backward limits have been constructed, giving rise to the seed bank diffusion and the
seed bank coalescent [5] (see more analysis and generalization in [15, 6] and an interesting connection with
metapopulations in [26]).
Both modeling frames (generational jumps and second island) have their advantages and disadvantages. For
the Wright-Fisher model with multi-generational jumps, one typically loses the Markov property. For the island
version, one retains the Markov property but then needs to investigate two-dimensional frequency processes, which
in the limit are harder to analyze than one-dimensional diffusions since, for example, the Feller theory is missing (this can
in part be replaced by recent theory for polynomial diffusions [2]). Interestingly, it turns out that for the limiting
frequency processes, both approaches can be two sides of the same coin.
None of the above approaches has analyzed more general reproductive mechanisms, such as those based on
Cannings models. This paper’s second aim is to close this gap. We present an extended framework for the
simultaneous construction of seed bank models with general multi-generational jump distributions and
Cannings-type reproductive laws satisfying a paintbox construction. We are also able to obtain forward and backward
convergence results (extending [22], [25] and [4]) and to provide an explicit sampling duality, which is valid already
in the finite individual models.
More precisely, we show that if a sequence of Cannings models (with no seed bank effect) is in the universality
class of the Kingman coalescent, meaning that its ancestral process converges in the evolutionary scale to the
Kingman coalescent, then the ancestry of the same sequence with a seed bank effect will converge to the Kingman
coalescent delayed by a constant $\beta^2$, where $\beta < \infty$ is the expected number of generations that separates an
individual from its ancestor. This extends the results of [22] and [4]. Convergence of the frequency process to the
solution of the Wright-Fisher diffusion with the same delay is also proved. We go further and study how sequences
of seed bank models with divergent expectations can make sequences of Cannings models that originally were
not in the Kingman class converge to the Kingman coalescent. This is achieved using the mixing time of some
auxiliary Markov chains introduced in [22]. If, instead of Cannings processes in the Kingman class, we
consider processes whose genealogy converges to a $\Xi$-coalescent, we show that their seed bank modification converges to
a $\Xi^\beta$-coalescent. Heuristically, the transformation $\Xi \mapsto \Xi^\beta$ consists in dividing by $\beta$ all the non-dust boxes in a $\Xi$
paintbox event to obtain a $\Xi^\beta$ paintbox event. Similar asymptotics are shown for the forward process. All those
results are extended to models in the presence of mutations.
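As a small worked illustration of the constant $\beta$ (not taken from the paper), assume a hypothetical delay distribution $\mu$ that sends a lineage one or two generations back with equal probability. The values below only make the definition "expected number of generations separating an individual from its ancestor" concrete; reading "delayed by $\beta^2$" as a slowdown of the coalescent time scale by that factor is our interpretation of the sentence above.

```latex
% Hypothetical example: \mu(1) = \mu(2) = 1/2 (an assumption for illustration)
\beta = \sum_{i \ge 1} i\,\mu(i) = 1 \cdot \tfrac12 + 2 \cdot \tfrac12 = \tfrac32 ,
\qquad
\beta^{2} = \tfrac94 .
```

With no seed bank ($\mu = \delta_1$) the same computation gives $\beta = \beta^2 = 1$, i.e. no delay.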
Note that the interplay of general reproduction and seed banks with other evolutionary forces can be subtle,
and we provide a framework for its analysis (also regarding the real-time embedding of coalescent-based estimates,
see e.g. [6]).
The paper is organized as follows. In section 2 we construct a random graph that allows us to embed the
ancestry and the frequency processes of both Cannings and dormancy models simultaneously and study the duality
relation of the processes forward and backward in time. Furthermore, we analyze the scaling limits of the ancestral
process in the presence of skewed reproduction mechanisms and dormancy. We give conditions for convergence to
the Kingman coalescent and study scenarios beyond this universality class, where we can describe how seed bank
phenomena reduce the typical size of coalescence events when combining seed banks with Cannings models that would,
in the absence of the seed bank component, converge to a Λ- or a Ξ-coalescent. Section 3 uses the moment duality
to formally prove convergence of the frequency process to a Wright-Fisher diffusion. This intuitively clear result
was missing in the literature, probably because the lack of the Markov property for the frequency process makes the usual
techniques fail. In section 4 we study a variant of the seed bank random graph where mutations are added and we
extend the results obtained in sections 2 and 3.
2 A random graph version of the model of Kaj, Krone and Lascoux
Consider a discrete-time haploid population of constant size $N \ge 1$ at each generation. The vertex set $V^N = \mathbb{Z} \times [N]$
represents the whole population. For each individual $v \in V^N$, denote by $g(v)$ its generation and by $\ell(v)$ its label
so that $v = (g(v), \ell(v))$. We denote the $g$-th generation of the population by $V^N_g := \{v \in V^N : g(v) = g\}$. Set
a probability measure $W^N$ on the exchangeable probability measures on $[N]$. Let $\{\bar{W}^N_g\}_{g \in \mathbb{Z}}$ be a sequence of
independent $W^N$-distributed random variables with $\bar{W}^N_g = \{W^N_v\}_{v \in V^N_g}$. Each variable $W^N_v$ gives the reproductive
weight of the individual $v$ in the population graph. This multinomial setting can be extended to some more general
Cannings models (as in [28]) or non-exchangeable reproductive success (as in [32]). Also, consider a sequence
$\{m_N\}_{N \ge 1}$ of integers and set a probability measure $\mu_N$ on $[m_N]$. Let $\{J^N_v\}_{v \in V^N}$ be a collection of independent
$\mu_N$-distributed random variables. The variable $J^N_v$ says how many generations ago an individual $v$’s mother is
living. Finally, set a collection of random variables in $[N]$, $\{U^N_v\}_{v \in V^N}$, such that $U^N_v$ is the label of the mother of
$v$. Its conditional distribution is
\[
P\big(U^N_v = k \,\big|\, J^N_v = j, \{\bar{W}^N_g\}_{g \in \mathbb{Z}}\big) = W^N_{(g(v) - j,\, k)}.
\]
Definition 1 (The seed bank random di-graph). Consider the random set of directed edges
\[
E^N = \big\{ \big(v, (g(v) - J^N_v, U^N_v)\big), \text{ for all } v \in V^N \big\}.
\]
The seed bank random di-graph with parameters $N$, $W^N$ and $\mu_N$ is given by $G^N := (V^N, E^N)$.
Two classical examples are
• the Kaj, Krone and Lascoux (KKL) seed bank graph [22]; in this case $\mu_N$ has finite support $[m]$, i.e. $m_N = m$,
and $W^N = \delta_{(1/N, \dots, 1/N)}$.
• the Cannings model with parameter $W^N$ [8, 9, 28]; in this case $\mu_N = \delta_1$.
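The construction of Definition 1 can be sketched in a few lines of Python. This is only an illustrative simulation on a finite window of generations, not code from the paper: the function name `sample_seed_bank_graph`, its arguments, and the default choice of uniform weights (the KKL / Wright-Fisher case) are all assumptions made here.

```python
import random

def sample_seed_bank_graph(N, mu, generations, weights=None, seed=0):
    """Sample the edges of a seed bank di-graph on a finite window (illustrative sketch).

    N           -- population size per generation
    mu          -- list with mu[j-1] = probability that the mother lives j generations back
    generations -- iterable of child generations g for which edges are drawn
    weights     -- optional function g -> list of N reproductive weights (summing to 1)
                   for generation g; defaults to uniform weights (hypothetical default)
    Returns a dict mapping each child (g, label) to its mother (g - J, U).
    """
    rng = random.Random(seed)
    labels = list(range(1, N + 1))
    jumps = list(range(1, len(mu) + 1))
    weight_cache = {}  # one weight vector per parental generation, drawn once
    edges = {}
    for g in generations:
        for label in labels:
            j = rng.choices(jumps, weights=mu)[0]            # jump size J_v ~ mu
            parent_gen = g - j
            if parent_gen not in weight_cache:
                weight_cache[parent_gen] = (
                    weights(parent_gen) if weights else [1.0 / N] * N
                )
            # mother's label U_v, chosen according to the weights of her generation
            u = rng.choices(labels, weights=weight_cache[parent_gen])[0]
            edges[(g, label)] = (parent_gen, u)
    return edges

# KKL-type graph with N = 8 and jumps of size 1 or 2
kkl_edges = sample_seed_bank_graph(8, [0.5, 0.5], range(-5, 1))
# Cannings (here Wright-Fisher) graph: mu = delta_1, every mother is one generation back
cannings_edges = sample_seed_bank_graph(8, [1.0], range(-5, 1))
```

Caching the weight vector per generation mirrors the fact that all children choosing a mother in generation $g$ see the same realisation of $\bar{W}^N_g$.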
For every $u, v \in V^N$ we denote by $\delta(u, v)$ the distance of $u$ and $v$ in the graph $G^N$, i.e. the number of vertices
in a path from $u$ to $v$ or from $v$ to $u$. Now let us define the ancestral process associated with this graph.
Definition 2 (The ancestral process). Fix a generation $g_0$ and $S_{g_0}$ consisting in a sample of individuals living
between generation $g_0$ and $g_0 - m_N + 1$, i.e. $S_{g_0} \subset \bigcup_{i=1}^{m_N} V^N_{g_0 + 1 - i}$. For every $g \ge 0$, let $A^N_g$ be the set composed by
the most recent ancestors of the individuals of $S_{g_0}$ that live at a generation $g_0 - g'$ for some $g' \ge g$, that is
\[
A^N_g = \Big\{ v \in \bigcup_{g'=g}^{\infty} V^N_{g_0 - g'} : \exists\, u \in S_{g_0} \text{ such that } \delta(u, v) \le \delta(u, v') \text{ for all } v' \in \bigcup_{g'=g}^{\infty} V^N_{g_0 - g'} \Big\}.
\]
Define, for all $i \in [m_N]$,
\[
A^{N,i}_g = \big| A^N_g \cap V^N_{g_0 - g + 1 - i} \big|
\]
and $\bar{A}^N_g = (A^{N,1}_g, \dots, A^{N,m_N}_g)$. We call $\{\bar{A}^N_g\}_{g \ge 0}$ the ancestral process. In the sequel, we consider the initial
configuration $S_{g_0}(\bar{n})$, for $\bar{n} = (n_1, \dots, n_{m_N})$, such that $n_i \ge 0$ individuals are uniformly sampled (with repetition)
from generation $g_0 + 1 - i$. We denote the law of the ancestral process of this sample by $P_{\bar{n}}$. See Figure 1 for an
illustration.
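To make Definition 2 concrete, here is a minimal Python sketch that traces the ancestral process backwards for uniform reproductive weights (the KKL case). It is an illustration under assumptions, not the authors' code: the name `ancestral_process` is hypothetical, the sample is passed as a set of vertices, and the edges are sampled lazily, which is harmless because each vertex is visited at most once.

```python
import random

def ancestral_process(sample, g0, N, mu, steps, seed=0):
    """Trace the count vectors of the most recent ancestors (illustrative sketch).

    sample -- set of vertices (generation, label) living in generations g0, ..., g0 - m + 1
    mu     -- list with mu[j-1] = probability of a jump of j generations (m = len(mu))
    Returns a list of tuples; entry g is (A^{N,1}_g, ..., A^{N,m}_g).
    """
    rng = random.Random(seed)
    m = len(mu)
    jumps = list(range(1, m + 1))
    current = set(sample)
    path = []
    for g in range(steps + 1):
        top = g0 - g  # most recent generation of the current window
        # coordinate i (0-based) counts ancestors living in generation g0 - g - i
        counts = tuple(
            sum(1 for (gen, _) in current if gen == top - i) for i in range(m)
        )
        path.append(counts)
        # individuals in the most recent generation are replaced by their mothers
        movers = {v for v in current if v[0] == top}
        current -= movers
        for (gen, _) in movers:
            j = rng.choices(jumps, weights=mu)[0]   # J_v ~ mu
            u = rng.randint(1, N)                   # uniform mother label (Wright-Fisher weights)
            current.add((gen - j, u))               # the set merges coalesced lineages
    return path

# A sample shaped like the one in Figure 1: four individuals in generation 0, one in generation -1
start = {(0, 1), (0, 2), (0, 3), (0, 4), (-1, 3)}
path = ancestral_process(start, g0=0, N=8, mu=[0.5, 0.5], steps=5)
```

Storing the ancestors in a set means that two lineages choosing the same mother, or a lineage landing on an individual already in the window, are merged automatically; this is the coalescence mechanism of the graph.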
For simplicity, we suppose that $\sup\{i \ge 1 : n_i > 0\}$ does not depend on $N$. This model was introduced, for
reproductions as in the Wright-Fisher model, by Kaj et al. [22] directly, in the sense that they construct a random
graph only implicitly. Our construction provides a transparent relation between the ancestral process
and the forward frequency process defined in section 3. Observe that $\{\bar{A}^N_g\}_{g \ge 0}$ is a Markov chain. We start our
results by formalizing the remark on p. 290 in [22]. This illustrative result is established when the Cannings model
is in the domain of attraction of the Kingman coalescent, although it can be easily generalized to any type of
reproduction law. Here we use the classical notation $c_N$ (resp. $d_N$) for the probability that two (resp.
three) given individuals choose the same parent in a Cannings model. This notation will be used throughout the
paper. Recall, e.g. from [28], that the genealogies of a Cannings model fall into the domain of attraction of the
Kingman coalescent when $c_N \to 0$ and $d_N = o(c_N)$, while multiple merger coalescents arise when $d_N$ and $c_N$ are of
the same order.
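As a quick sanity check of these two conditions, consider the Wright-Fisher case, in which every individual picks each of the $N$ potential parents with probability $1/N$. The computation below is a standard illustration added here, not a display from the paper.

```latex
c_N = \sum_{k=1}^{N} \Big(\frac{1}{N}\Big)^{2} = \frac{1}{N} \to 0,
\qquad
d_N = \sum_{k=1}^{N} \Big(\frac{1}{N}\Big)^{3} = \frac{1}{N^{2}},
\qquad
\frac{d_N}{c_N} = \frac{1}{N} \to 0 .
```

Hence $d_N = o(c_N)$ and the Wright-Fisher model lies in the domain of attraction of the Kingman coalescent.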
[Figure 1: In this case $N = 8$ and $m_N = 2$; generations $0, -1, \dots, -5$ are shown. The gray circles represent the members of $S_0 = \{v_1, v_2, v_3, v_4, v_5\}$ where, for example, $v_2 = (0, 4)$ and $v_5 = (-1, 3)$. The light gray circles represent the ancestors of the sample. $\bar{A}^8_0 = (4, 1)$, $\bar{A}^8_1 = (2, 2)$, $\bar{A}^8_2 = (3, 0)$, $\bar{A}^8_3 = (1, 1)$, $\bar{A}^8_4 = (1, 0)$, $\bar{A}^8_5 = (1, 0)$.]

Proposition 1 (Reformulation of Theorem 1 in [22]). Suppose that $c_N = N E[(W^N_v)^2] \to 0$ and $d_N = N E[(W^N_v)^3] = o(c_N)$.
Let $M(n)$ be a multinomial random variable with parameters $n$ and $\{\mu_N(i)\}_{i=1}^{m_N}$. Also, for any
$\bar{n} = (n_1, \dots, n_{m_N}) \in [N]^{m_N}$, let $Z(\bar{n}) = (n_2, \dots, n_{m_N}, 0) + M(n_1)$. Then, the transitions of $\{\bar{A}^N_g\}_{g \ge 0}$ can be
written in terms of $M$ and $Z$ as follows:
\[
P_{\bar{n}}\big(\bar{A}^N_1 = Z(\bar{n})\big) = 1 - \sum_{i=1}^{\infty} c_N \Big[ \binom{n_1}{2} \mu_N(i)^2 + \mu_N(i)\, n_1 n_{i+1} \Big] + o(c_N),
\]
\[
P_{\bar{n}}\big(\bar{A}^N_1 = Z(\bar{n}) - e_i\big) = c_N \Big[ \binom{n_1}{2} \mu_N(i)^2 + \mu_N(i)\, n_1 n_{i+1} \Big] + o(c_N),
\]
where $e_i$ is the vector with the $i$-th coordinate equal to $1$ and the others equal to $0$, for all $i \ge 1$.
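The two ingredients of Proposition 1 can be evaluated numerically. The sketch below is an illustration under assumptions: the function names are hypothetical, the $o(c_N)$ corrections are dropped, and $n_{i+1}$ is taken to be $0$ when $i + 1$ exceeds the length of $\bar{n}$.

```python
import math
import random

def leading_coalescence_terms(n_bar, mu, c_N):
    """Leading-order probabilities c_N * [C(n1, 2) mu(i)^2 + mu(i) n1 n_{i+1}] for i = 1..m.

    Returns (terms, stay) where stay = 1 - sum(terms); o(c_N) corrections are ignored.
    """
    m = len(mu)
    padded = list(n_bar) + [0]  # convention assumed here: n_{m+1} = 0
    n1 = n_bar[0]
    terms = [
        c_N * (math.comb(n1, 2) * mu[i] ** 2 + mu[i] * n1 * padded[i + 1])
        for i in range(m)
    ]
    return terms, 1.0 - sum(terms)

def sample_Z(n_bar, mu, rng):
    """Draw Z(n_bar) = (n_2, ..., n_m, 0) + M(n_1): shift the window, then scatter the
    n_1 most recent lineages over positions 1..m according to mu."""
    m = len(mu)
    z = list(n_bar[1:]) + [0]
    for position in rng.choices(list(range(m)), weights=mu, k=n_bar[0]):
        z[position] += 1
    return tuple(z)

terms, stay = leading_coalescence_terms((2, 0), [0.5, 0.5], c_N=0.01)
z = sample_Z((3, 2, 1), [0.2, 0.3, 0.5], random.Random(1))
```

In the first call, two lineages in the most recent generation coalesce at position $i$ with leading-order probability $c_N \mu_N(i)^2$, since both must jump $i$ generations and then pick the same mother.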
Proof. We need to make two observations. First note that all the randomness in the transitions of the chain
$\{\bar{A}^N_g\}_{g \ge 0}$ lies in what happens to the first coordinate. If for some $g \ge 0$, $\bar{A}^N_g = (0, n_2, \dots, n_{m_N})$, it is easy to see
that $\bar{A}^N_{g+1} = (n_2, \dots, n_{m_N}, 0)$ almost surely. On the other hand, if $n_1 > 0$, the individuals that are in $A^N_g \cap V^N_{g_0 - g}$
cannot belong to $A^N_{g+1}$. Then, each of these individuals, if denoted by $v$, must be replaced by an individual which
lives $J^N_v$ generations in the past, that is
\[
P_{e_1}\big(\bar{A}^N_1 = e_i\big) = \mu_N(i).
\]
Further, if $n_1 > 1$, one needs to find $n_1$ new ancestors, but some of them could be the same due to some coalescence.
The complete picture is as follows. For $i \ge 2$ and $j, k \ge 0$, and by denoting $e_0$ for the null vector,
\[
P_{2e_1 + e_i}\big(\bar{A}^N_1 = e_{i-1} + e_j + e_k\big) =
\begin{cases}
2\mu_N(j)\mu_N(k) & \text{if } i-1 \ne j \ne k \\
(\mu_N(j))^2 (1 - c_N) & \text{if } i-1 \ne j,\ j = k \\
2\mu_N(i-1)\mu_N(k)(1 - c_N) & \text{if } i-1 = j,\ j \ne k \\
\big(2\mu_N(i-1)\mu_N(j) + (\mu_N(j))^2\big) c_N & \text{if } i-1 \ne j,\ k = 0 \\
(\mu_N(i-1))^2 (c_N - d_N) & \text{if } i-1 = j,\ k = 0 \\
(\mu_N(i-1))^2 d_N & \text{if } j = k = 0
\end{cases}
\tag{2.1}
\]
The proof follows easily after these observations.
We now construct a less natural backward process which will be very useful when establishing its moment
duality with the forward process in section 3. We start by defining it in a graphical and intuitive way. More formal
definitions will follow throughout the section.
Definition 3 (The window process). Fix a generation $g_0$, and $S_{g_0} \subset \bigcup_{i=1}^{m_N} V^N_{g_0 + 1 - i}$. In the genealogical tree of
the sample $S_{g_0}$, define the variable $B^{N,1}_g$ as the number of edges arriving to generation $g_0 - g$ (plus the number
of individuals of $S_{g_0}$ living at this generation). For any $i \in \{2, \dots, m_N\}$, let $B^{N,i}_g$ be the number of edges crossing