
On the number of genealogical ancestors
tracing to the source groups of an admixed population
Jazlyn A. Mooney, Lily Agranat-Tamir, Jonathan K. Pritchard, and Noah A. Rosenberg§
October 25, 2022
Abstract. In genetically admixed populations, admixed individuals possess ancestry from multiple
source groups. Studies of human genetic admixture frequently estimate ancestry components
corresponding to fractions of individual genomes that trace to specific ancestral populations. However,
the same numerical ancestry fraction can represent a wide array of admixture scenarios. Using
a mechanistic model of admixture, we characterize admixture genealogically: how many distinct
ancestors from the source populations does the admixture represent? We consider African Americans,
for whom continent-level estimates produce a 75-85% value for African ancestry on average
and 15-25% for European ancestry. Genetic studies together with key features of African-American
demographic history suggest ranges for model parameters. Using the model, we infer that if genealogical
lineages of a random African American born during 1960-1965 are traced back until
they reach members of source populations, the expected number of genealogical lines terminating
with African individuals is 314, and the expected number terminating in Europeans is 51. Across
discrete generations, the peak number of African genealogical ancestors occurs for birth cohorts
from the early 1700s. The probability exceeds 50% that at least one European ancestor was born
more recently than 1835. Our genealogical perspective can contribute to further understanding the
admixture processes that underlie admixed populations. For African Americans, the results provide
insight both on how many of the ancestors of a typical African American might have been forcibly
displaced in the Transatlantic Slave Trade and on how many separate European admixture events
might exist in a typical African-American genealogy.
Department of Biology, Stanford University, Stanford, CA 94305 USA
Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089 USA
Department of Genetics, Stanford University, Stanford, CA 94305 USA
§Email: noahr@stanford.edu.
arXiv:2210.12306v1 [q-bio.PE] 22 Oct 2022
Introduction
Genetically admixed populations arise when two or more source groups combine to form a new
population. After a period of multiple generations of mating among members of the incipient
admixed population and new contributors from the source groups, typical individuals in the admixed
population possess ancestry from multiple sources (Chakraborty, 1986; Korunes & Goldberg, 2021;
Gopalan et al., 2022).
The genetic history of an admixed population can be represented by a temporal sequence of
admixture contributions, starting with the founding of the new admixed group (Long, 1991; Verdu
& Rosenberg, 2011; Gravel, 2012). Among present-day members of the admixed population, genetic
patterns such as the distribution of admixture levels estimated from individual genomes can then
be used together with a model of the admixture process to uncover features such as the timing and
magnitude of the genetic contributions that characterize the admixture process (Verdu et al., 2014;
Baharian et al., 2016; Zaitlen et al., 2017).
In studies that seek to infer population parameters from genetic patterns among individuals in
the admixed population, each admixed individual is treated as a random outcome of the admixture
process. The accumulation of data on many admixed individuals then provides information about
the population history. In this perspective, for a given model of the admixture history, an individual
possesses a random genealogy conditional on the parameters of the admixture process. What
information can be obtained about a random individual genealogy under the assumptions of an
admixture model? In particular, for individual members of an admixed population, how many
distinct contributors from the source populations does their admixture represent?
In human admixed populations, questions focused on random individual genealogies can provide
information both about the population-level history of admixture and about the relationship of
individuals to that history. Consider the case of the African-American admixed population in
the United States. Living African Americans descend primarily from an admixture of African
and European source populations, much of the admixture having occurred during the period of
enslavement of most African Americans, 1619-1865. Owing to widespread patterns such as forcible
fracturing of enslaved families by enslavers, a practice of using only first names and not surnames for
enslaved persons, lack of documentation of many of the enslaved even by first name in the written
record, and a reticence of many formerly enslaved individuals to record genealogical information
in the period after slavery, for many African Americans, limited data are available about their
individual ancestors prior to the middle or late 1800s (Gates, 2009; Swarns, 2012; Nelson, 2016).
Thus, an admixture model has potential to recover features of African-American genealogies that
are otherwise difficult to obtain.
For an African American chosen at random, how many genealogical lines traced back from
the present to a member of a source population reach an African individual? How many reach a
European or European American? The former quantity approximates the number of ancestors who
traveled from Africa to the Western Hemisphere as forced enslaved migrants in the Transatlantic
Slave Trade. The latter gives the number of occasions at which European admixture events occurred
in a random African-American genealogy. Answers to such questions are informative not only for
understanding the genealogies of individuals, but also for contributing details of the admixture
process that has given rise to the present-day population.
Model
Assumptions
We follow a mechanistic model in which admixture levels are explored in an admixed population
over time (Verdu & Rosenberg, 2011; Goldberg et al., 2014; Goldberg & Rosenberg, 2015; Goldberg
et al., 2020). Three populations are considered: source populations $S_1$ and $S_2$, and admixed
population $H$. In each of a series of generations, indexed discretely with the index increasing
forward in time, an individual in the admixed population $H$ in generation $g$ has a pair of parents
probabilistically drawn from among individuals extant in generation $g-1$ in source populations $S_1$
and $S_2$ and admixed population $H$ (Figure 1).
Suppose that for an individual in generation $g$, the admixture contributions are $s_{1,g-1}$, $s_{2,g-1}$,
and $h_{g-1}$, for populations $S_1$, $S_2$, and $H$, respectively. In other words, for an individual chosen
at random in admixed population $H$, a parent chosen at random has probability $s_{1,g-1}$ of having
originated from population $S_1$, $s_{2,g-1}$ for population $S_2$, and $h_{g-1}$ for population $H$. We then have
\[
s_{1,g-1} + h_{g-1} + s_{2,g-1} = 1. \tag{1}
\]
The sampling probabilities $s_{1,g-1}$, $s_{2,g-1}$, and $h_{g-1}$ can be interpreted as fractional contributions
from source populations $S_1$, $S_2$, and $H$ to autosomal genomes in population $H$ in generation $g$.
Generation $g = 1$ represents the founding of the admixed population from members of the source
population from generation $g = 0$. The admixed population does not exist in generation $g = 0$, so
that $h_0 = 0$, and $s_{1,0} + s_{2,0} = 1$.
Previous studies with these modeling assumptions have tracked properties of random variables
that describe admixture proportions in the source populations $S_1$ and $S_2$ at generation $g$. In particular,
Verdu & Rosenberg (2011) studied recursions for the probability distribution and moments of
a random variable $H_{1,g}$, representing the autosomal fraction of admixture from source population 1
for an individual in the admixed population at generation $g$. We instead study the random variable
$Z_{1,g}$, the number of genealogical ancestors from source population 1 for an individual in the admixed
population at generation $g$, and $Z_{2,g}$, the number of genealogical ancestors from source population
2. In the sense in which we consider genealogical ancestors, once a source population is reached
along a genealogical line in a specific ancestor, that ancestor is tabulated as a genealogical ancestor
from the associated source population, and the line is not traced any farther back (Figure 2).
Recursion for the number of genealogical ancestors
We review expressions that we will need for the mean and variance of autosomal admixture under
the model (Verdu & Rosenberg, 2011). The mean ancestry fraction from population 1 in generation
$g$ is (Verdu & Rosenberg, 2011, eqs. 10 and 11):
\[
E[H_{1,g}] =
\begin{cases}
s_{1,0}, & g = 1, \\
s_{1,g-1} + h_{g-1} E[H_{1,g-1}], & g \geq 2.
\end{cases}
\tag{2}
\]
The variance of the ancestry fraction from population 1 in generation $g$ is (Verdu & Rosenberg,
2011, eqs. 22 and 23)
\[
V[H_{1,g}] =
\begin{cases}
\dfrac{s_{1,0}(1 - s_{1,0})}{2}, & g = 1, \\[2ex]
\dfrac{s_{1,g-1}(1 - s_{1,g-1})}{2} - s_{1,g-1} h_{g-1} E[H_{1,g-1}]
+ \dfrac{h_{g-1}(1 - h_{g-1})}{2} E[H_{1,g-1}]^2
+ \dfrac{h_{g-1}}{2} V[H_{1,g-1}], & g \geq 2.
\end{cases}
\tag{3}
\]
Note that the mean ancestry fraction from population 2 is one minus the mean ancestry fraction
from population 1, and the variances of the two ancestry fractions are equal.
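The recursions in eqs. 2 and 3 can be iterated numerically. The following is a minimal sketch, not the authors' code: the function name `ancestry_mean_variance` and the list layout (entry `k` holding $s_{1,k}$ and $h_k$) are assumptions made for illustration.

```python
def ancestry_mean_variance(s1, h):
    """Return (E[H_{1,g}], V[H_{1,g}]) for g = len(s1), iterating eqs. 2 and 3.

    Assumed (hypothetical) layout: s1[k] is s_{1,k} and h[k] is h_k
    for k = 0, ..., g-1.
    """
    if not s1 or len(s1) != len(h):
        raise ValueError("s1 and h must be non-empty and of equal length")
    if h[0] != 0:
        raise ValueError("h_0 must be 0: population H does not exist at g = 0")

    # Generation g = 1 (first case of eqs. 2 and 3).
    mean = s1[0]
    var = s1[0] * (1 - s1[0]) / 2

    # Generations g = 2, ..., len(s1); index k plays the role of g - 1.
    for k in range(1, len(s1)):
        a, c = s1[k], h[k]
        new_var = (a * (1 - a) / 2
                   - a * c * mean
                   + c * (1 - c) / 2 * mean ** 2
                   + c / 2 * var)
        mean = a + c * mean
        var = new_var
    return mean, var


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    print(ancestry_mean_variance([0.5, 0.1], [0.0, 0.8]))
```

The variance is updated before the mean inside the loop because eq. 3 uses the previous generation's mean, $E[H_{1,g-1}]$.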
A recursion describing the admixture fraction $H_{1,g}$ (Verdu & Rosenberg, 2011) can be modified
to obtain a recursion for $Z_{1,g}$. Whereas the random autosomal admixture fraction $H_{1,g}$ of an
individual is the mean of the corresponding admixture fractions of the parents of the individual,
the random number of ancestors $Z_{1,g}$ is the sum of the numbers of ancestors of the parents (from
population 1).
Let $L$ be a random variable that gives the source populations of the parents of a random
individual from the admixed population. Listing the mother first, $L$ takes a value in the set
$\mathcal{L} = \{S_1S_1, S_1H, S_1S_2, HS_1, HH, HS_2, S_2S_1, S_2H, S_2S_2\}$. Based on eqs. 1 and 2 of Verdu & Rosenberg
(2011), for generation $g = 1$, we have
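\[
Z_{1,1} =
\begin{cases}
2 & \text{if } L = S_1S_1, \text{ with } P[L = S_1S_1] = s_{1,0}\, s_{1,0} \\
1 & \text{if } L = S_1S_2, \text{ with } P[L = S_1S_2] = s_{1,0}\, s_{2,0} \\
1 & \text{if } L = S_2S_1, \text{ with } P[L = S_2S_1] = s_{2,0}\, s_{1,0} \\
0 & \text{if } L = S_2S_2, \text{ with } P[L = S_2S_2] = s_{2,0}\, s_{2,0}.
\end{cases}
\tag{4}
\]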
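For subsequent generations, $g \geq 2$,
\[
Z_{1,g} =
\begin{cases}
2 & \text{if } L = S_1S_1, \text{ with } P[L = S_1S_1] = s_{1,g-1}\, s_{1,g-1} \\
1 + Z_{1,g-1} & \text{if } L = S_1H, \text{ with } P[L = S_1H] = s_{1,g-1}\, h_{g-1} \\
1 & \text{if } L = S_1S_2, \text{ with } P[L = S_1S_2] = s_{1,g-1}\, s_{2,g-1} \\
Z_{1,g-1} + 1 & \text{if } L = HS_1, \text{ with } P[L = HS_1] = h_{g-1}\, s_{1,g-1} \\
Z_{1,g-1} + Z'_{1,g-1} & \text{if } L = HH, \text{ with } P[L = HH] = h_{g-1}\, h_{g-1} \\
Z_{1,g-1} & \text{if } L = HS_2, \text{ with } P[L = HS_2] = h_{g-1}\, s_{2,g-1} \\
1 & \text{if } L = S_2S_1, \text{ with } P[L = S_2S_1] = s_{2,g-1}\, s_{1,g-1} \\
Z_{1,g-1} & \text{if } L = S_2H, \text{ with } P[L = S_2H] = s_{2,g-1}\, h_{g-1} \\
0 & \text{if } L = S_2S_2, \text{ with } P[L = S_2S_2] = s_{2,g-1}\, s_{2,g-1}.
\end{cases}
\tag{5}
\]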
For $L = HH$, $Z_{1,g-1}$ and $Z'_{1,g-1}$ are independent and identically distributed copies of the same
random variable. Eqs. 4 and 5 enable us to compute the probability distribution of $Z_{1,g}$, the number
of population-1 ancestors of an individual in the admixed population in generation $g$. $Z_{1,g}$ and $Z_{2,g}$
range in $Q_g = \{0, 1, \ldots, 2^g\}$. For $q$ in $Q_g$, we compute the probability $P[Z_{1,g} = q]$ that a random
individual from population $H$ at generation $g$ has $q$ genealogical ancestors from population 1.
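Eqs. 4 and 5 can also be read as a sampling procedure: each of the two parents independently contributes 1 if drawn from $S_1$, 0 if drawn from $S_2$, and an independent copy of $Z_{1,g-1}$ if drawn from $H$. The sketch below draws one realization of $Z_{1,g}$ in this way. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_z1` and the list layout (entry `k` holding $s_{1,k}$ and $h_k$) are hypothetical.

```python
import random


def sample_z1(g, s1, h, rng):
    """Draw one realization of Z_{1,g} by applying eqs. 4 and 5 recursively.

    Assumed (hypothetical) layout: s1[k] is s_{1,k} and h[k] is h_k for
    k = 0, ..., g-1; the S2 probability is the remainder 1 - s1[k] - h[k].
    rng is a random.Random instance.
    """
    if g < 1 or len(s1) < g or len(h) < g:
        raise ValueError("need g >= 1 and parameters for generations 0..g-1")
    if h[0] != 0:
        raise ValueError("h_0 must be 0: population H does not exist at g = 0")

    total = 0
    for _ in range(2):  # mother first, then father
        u = rng.random()
        if u < s1[g - 1]:
            total += 1  # parent from S1: the line stops and is counted once
        elif u < s1[g - 1] + h[g - 1]:
            # parent from H: trace further back with an independent copy
            total += sample_z1(g - 1, s1, h, rng)
        # otherwise the parent is from S2 and contributes no S1 ancestors
    return total


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    rng = random.Random(0)
    draws = [sample_z1(3, [0.5, 0.1, 0.1], [0.0, 0.8, 0.8], rng) for _ in range(10000)]
    print(sum(draws) / len(draws))
```

Because two parents from $H$ each trigger their own recursive call, the two copies in the $HH$ case are independent, as eq. 5 requires.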
Analogously to eqs. 3-5 of Verdu & Rosenberg (2011), we have for $g = 1$
\[
P[Z_{1,1} = q] =
\begin{cases}
s_{1,0}^2, & q = 2, \\
2 s_{1,0} s_{2,0}, & q = 1, \\
s_{2,0}^2, & q = 0.
\end{cases}
\tag{6}
\]
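For $g \geq 2$ and $q$ in $Q_g$,
\[
P[Z_{1,g} = q] = h_{g-1}^2 \sum_{r=0}^{2^{g-1}} P[Z_{1,g-1} = r]\, P[Z_{1,g-1} = q - r]
+ (2 s_{1,g-1} h_{g-1})\, P[Z_{1,g-1} = q - 1] + (2 s_{2,g-1} h_{g-1})\, P[Z_{1,g-1} = q] + I_g(q). \tag{7}
\]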
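The recursion in eqs. 6 and 7 can be evaluated directly by storing the distribution of $Z_{1,g-1}$ as a list and building the distribution of $Z_{1,g}$ from it. The following is a minimal sketch rather than the authors' code. The term $I_g(q)$ is not defined in this excerpt; the sketch assumes it collects the four pairings in eq. 5 in which neither parent is from $H$, that is, $s_{1,g-1}^2$ for $q = 2$, $2 s_{1,g-1} s_{2,g-1}$ for $q = 1$, $s_{2,g-1}^2$ for $q = 0$, and 0 otherwise. Probabilities with an index outside $Q_{g-1}$ are taken to be 0. The function name `z1_distribution` and the list layout are hypothetical.

```python
def z1_distribution(s1, h):
    """Return [P[Z_{1,g} = q] for q = 0, ..., 2**g] with g = len(s1).

    Assumed (hypothetical) layout: s1[k] is s_{1,k} and h[k] is h_k for
    k = 0, ..., g-1; s_{2,k} is recovered from eq. 1 as 1 - s1[k] - h[k].
    """
    if not s1 or len(s1) != len(h):
        raise ValueError("s1 and h must be non-empty and of equal length")
    if h[0] != 0:
        raise ValueError("h_0 must be 0: population H does not exist at g = 0")
    s2 = [1 - a - c for a, c in zip(s1, h)]

    # Eq. 6: generation g = 1, indexed by q = 0, 1, 2.
    p = [s2[0] ** 2, 2 * s1[0] * s2[0], s1[0] ** 2]

    for g in range(2, len(s1) + 1):
        a, b, c = s1[g - 1], s2[g - 1], h[g - 1]
        new = [0.0] * (2 ** g + 1)
        # Both parents from H: self-convolution of the previous distribution.
        for r, pr in enumerate(p):
            for t, pt in enumerate(p):
                new[r + t] += c * c * pr * pt
        for r, pr in enumerate(p):
            new[r + 1] += 2 * a * c * pr  # one parent from S1, one from H
            new[r] += 2 * b * c * pr      # one parent from S2, one from H
        # Assumed form of I_g(q): neither parent from H.
        new[2] += a * a
        new[1] += 2 * a * b
        new[0] += b * b
        p = new
    return p


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from the paper.
    dist = z1_distribution([0.5, 0.1], [0.0, 0.8])
    print(sum(dist), sum(q * pq for q, pq in enumerate(dist)))
```

One sanity check under these assumptions: taking expectations in eq. 5 gives $E[Z_{1,g}] = 2 s_{1,g-1} + 2 h_{g-1} E[Z_{1,g-1}]$ with $E[Z_{1,1}] = 2 s_{1,0}$, so the mean of the returned distribution should match that recursion and the probabilities should sum to 1.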