THEORETICAL

POPULATION

BIOLOGY

10,

70-82

The Decay of Genetic Structured THOMAS

(1976)

Variability in Geographically Populations. II* NAGYLAKI+

Mathematics Research Center and Department of Medical Genetics, University of Wisconsin, Madison, Wisconsin 53706 Received

June

25, 1975

The ultimate rate of approach to equilibrium in the infinite stepping-stone model is calculated. The analysis is restricted to a single locus in the absence of selection, and every mutant is assumed to be new to the population. Let f (t, x) be the probability that two homologous genes separated by the vector x in generation t are the same allele. It is supposed that f(0, x) = 0(x-*-~), 7 > 0, as x = 1 x 1 h co. In the absence of mutation, f(t, x) tends to unity at the rate t-II2 in one dimension and (In t)-’ in two dimensions. Thus, the loss of genetic variability in two dimensions is so slow that evolutionary forces not considered in this model would supervene long before a two-dimensional natural population became completely homogeneous. If the mutation rate, u, is not zero f(t, x) asymptotically approaches equilibrium at the rate (1 - u)z9-3/2 in one dimension and (1 - u)2tt-1(lnt)-2 in two dimensions. Integral formulas are presented for the spatial dependence of the deviation off (t, x) from its stationary value as t --+ CX), and for large separations this dependence is shown to be (const + x) in one dimension and (const + In X) in two dimensions. All the results are the same for the Malecot model of a continuously distributed population provided the number of individuals per colony is replaced by the population density. The relatively slow algebraic and logarithmic rates of convergence for the infinite habitat contrast sharply with the exponential one for a finite habitat.

INTRODUCTION The amount and pattern of genetic variability in geographically structured populations are of considerableinterest in population geneticsand evolutionary

* Paper No, 1863 from the Genetics Laboratory, University of Wisconsin. Sponsored by the United States Army (Contract No. DA-31-124-ARO-D-462), the National and the National Institutes of Health Science Foundation (Grant MPS72-00381 AOl), (Grant GM-15422). ‘Present address: Department of Biophysics and Theoretical Biology, University of Chicago, Cummings Life Science Center, 920 E. 58th Street, Chicago, Illinois 60637.

70 Copyright All rights

0 1976 by Academic Press, Inc. of reproduction in any form reserved.

GEOGRAPHICAL

STRUCTURE

71

theory. In the absence of selection, one studies the evolution of a population under the joint action of mutation, migration, and random drift. We shall use the stepping-stone model of population structure (Malecot, 1950; Kimura, 1953), but, if the number of individuals per colony is replaced by the population density, all our results apply equally to the Malecot model of continuously distributed populations (Malecot, 1967; 1969, pp. 64-69). Much work has been done on populations occupying finite habitats (Maruyama, 1971, 1972; Fleming and Su, 1974; Nagylaki, 1974a, 1974b; Malecot, 1975a). Exact solutions have been found for some simple geometries, qualitative understanding of the degree and pattern of genetic diversity at equilibrium has been attained, and the rate and nature of the approach to equilibrium have been deduced in many cases. If the size of the habitat is very large compared to the typical amount of migration per generation, the habitat may accurately be treated as infinite. As explained previously (Nagylaki, 1974b), such a formulation is required to obtain a precise estimate of the ultimate rate of approach to the steady state in very large habitats. The calculation of the rate of convergence to equilibrium is essential for understanding when the equilibrium structure might actually be observed. Since, without mutation, in one and two dimensions even an infinite geographically structured population tends to complete genetic uniformity (Weiss and Kimura, 1965; Malecot, 1967) this problem has great evolutionary significance. The unidimensional problem was solved exactly in a continuous spacecontinuous time model in (Nagylaki, 1974b). In more than one dimension, however, this model has an unacceptable singularity (Fleming and Su, 1974; Nagylaki, 1974b). For identity by descent, one may assume that the initial probability of identity is zero, and thereby very much simplify the analysis. In this case, the ultimate rates of convergence for the stepping-stone model were obtained by methods different from the one employed in this paper in one dimension by Malecot (1975a, b) and in both one and two dimensions by Sawyer (Stanley Sawyer, private communication). For an evolving natural population, however, it is identity in allelic state which is directly measurable and pertinent to the amount of genetic diversity. In this situation, our aim is to compute the asymptotic probability of identity in state from its arbitrary (and, in principle, known) initial value. We suppose that there are N diploid monoecious individuals at the points of the infinite unit lattice in n dimensions. The panmictic colonies exchange migrants so that the probability that the separation between two individuals of any genotypes changes by x is r(x), the components of the vector x being integers. We posit that there is no selection, and all alleles at the locus under consideration mutate to types not preexisting in the population at rate u. Letf(t, x) represent the probability that two homologous genes separated by x in generation r are the same allele. For two genes in generation t + 1 to be identical in state, muta-

72

THOMAS

NAGYLAKI

tion must not occur, and they must be descended from the same gene or identical genes. Thus, the basic equation of the model reads

f(t + 1,x)= (1- u)”IcYr(y)f(

t, x - Y) + WV111

- f(4 011r(x)/. (1)

The secondterm in (1) corrects the term with y = x in the sum for the possibility that if the two genesin generation t + 1 are descendedfrom the samecolony in generation t, then they may originate from the samegene, in which casethe appropriate probability of identity in generation t is unity, not f(t, 0). The probability of descentfrom the samegene, given descentfrom the samecolony, is manifestly just (2N)-l. If0 0, and, rewriting (l), we seedirectly that

f(t + 1,x) = (1 - 4Bj c

4Y)f(

t, x - y) + {(my

Y#X

+ [l - (2W11f (4 0))T(X) 1

< (1 - 24)” c r(y) + [(2N)-1 + 1 - (2N)-9 Y(X) = (1 - u)” < 1. 1Y#X I Therefore, 0 0 by induction. Note that this is not alwaystrue in continuous space. It is easy to see that the expected symmetry relation r(-x) = r(x) holds (Nagylaki, 1974a,b). Assuming f (t, -x) = f (t, x), we find at once from (1) that f(t + 1, -x) = f (t + 1, x). Since the two genesare not distinguished, we may choosethe symmetric initial condition f (0, -x) = f(0, x), and conclude by induction that f(t, -x) = f (t, x) for t > 0. We decomposef into its transient and equilibrium componentsaccording to f(t, 4 = w, x) + f&J, x>,

(2)

where we shall see that F(t, x) -+ 0 as t -+ CO,~1s (1 - u)a, and we have exhibited the dependenceof fm on the mutation rate for future convenience. Substituting (2) into (l), we obtain f&9

4 = 57 /c r(Y)fa’(w,

Y

x - Y) + WHl

-f&4

O)] r(x)/ 7

F(t + 1, x) = et 1 Y(y)F(t, x - y) - (2N)-IF@, 0) T(X) . 1Y I

(3)

(4)

GEOGRAPHICAL

73

STRUCTURE

In the next section, we shall analyze the steady state equation (3). The results, more refined than those given previously, will be required for the subsequent investigation of the rate of approach to equilibrium based on (4).

THE

As suggested by the convolution 1969, pp. 77-84)

STEADY

STATE

structure

of (3) we define (Malecot, 1967;

jm(w, 0) = Cfm(w, x

x) e-+x,

(5)

where the dot signifies the scalar product, and derive directly from (3) j,(w,

6) = 7x(w) R(8)[1 - wR(B)]-1.

(6)

In (6), R(e) = C r(x) eeie.x, x s(w) = (2N)-l[l

-f,(w,

O)].

Now (5) and (6) allow us to calculate the Fourier coefficientsf,(w,

f&J, x) = s(w)+, xl, eie’X

x) offm(a, 0): (9)

sR(e)

where

(8)

dn13

ol(w,x) = w Q 1 - wR(8) (2n)S ’

(10)

and the region Q is the n-dimensionalcube 1f?,1 < n, j = l,..., n. We wish to approximate f* for small mutation rates, i.e., as u + 0 with x fixed. We assumethe covariance matrix of Y is finite, and rotate coordinatesso asto diagonalizeit. Next, we scalethe rotated coordinatesin terms of the standard deviations oi, j = l,..., n, of r. The sumsin (5) and (7) will be over the rotated and scaledcoordinatesof the lattice points, and, sinceY(-x) = r(x), it follows from (7) that R(0) = 1 - 4 ea+ o(P)

as

e+ 0,

(11)

where 0 = ( 8 /. After rotation and scaling, (10) becomes (12) with T(e) = [R(e)]-’ and c = &

Us. The region Q isjust Q after rotation and

74

THOMAS

NAGYLAKI

scaling. We shall require only the obvious fact that the origin is in the interior of a. From (11) we have T(B) = 1 + ; 82 + @g(B),

(13)

whereg(B) = o(1) as B-+0. In view of (13), as u -+ 0 (V -+ I), we expect the main contribution to (12) to come from the neighborhood of 8 = 0. We write, for fixed 6 > 0, a: = a1 + or + aa + 01~with

(14b)

4’3x)= : S,,,[ye;- v-

(l/2)

6f2 + 2J eie*x & 7

(14c) (14d)

The notation in (14a) indicates integration over all space. For sufficiently small 6 > 0, the region 8 < 8 is contained in D. Sz, is simply 51 with 0 < 6 excluded. We complete the argument in one dimension. Then with x = 1x I, al(w, x) = (~czP)-~

we-2u1”x,

and, clearly, 01~(v,X) = O(1) and Q(V, X) = 0( 1) uniformly It remains to estimate

4wp ‘) = C(

(15) in x as u + 0.

[22 - ezg(e)] cos ex de ((i/2) 82 + 24[(1/2) 82 + 2~ + e2g(e) - d] .

(16)

For any c > 0, no matter how small, there exists 6 > 0 such that 1g(B)1 < E for 0 < 0 < 6. For sufficiently small u, (16) permits us to conclude (17) Replacing the upper limit in (17) by co, we find 1a&, x)1 < -$ (ZN + $)

= o(l/zP)

(18)

GEOGRAPHICAL

uniformly

75

STRUCTURE

in x as u + 0. Combining (15) and (18), we may express (Yin the form a(w, x) = (2cu1/a)-1[e-au1’e* + h(w, x)],

where h(v, x) = o( 1) uniformly

fm(v,4 =

(19)

in x as u - 0. Then (8) (9) and (19) yield

e-@. + h(fJ,x) 1 + ~Ncu~/~ + h(w, 0) ’

@-v

From (20) we get the heterozygosity

1 - f,(o, 0) = 4NcrA2[1 + ~Ncu~/~ + h(w, 0)1-l (21)

= ~Ncu’/~ + o(zP)

as u-+0. The result (20) was first given for Gaussian migration with the approximation h = 0 by MalCcot (1955). Since h(w, x) = o(1) uniformly in x as u --+ 0, (20) also holds in the limit II -+ 0, x + co with rA2x fixed, and h(w, x) = o(1) in this limit. We may always neglect h(w, x) in the numerator of (20). If g(6) = O(P+n), 7 > 0, as ~9-+ 0, as will be the case if the absolute moment of order 3 + 7 of r(x) is finite, then %(w, x) = O(1) uniformly in x as u --+ 0, so h(w, x) = O(d2) uniformly in x as u -+ 0, and we may disregard h(v, 0) in the denominator of (20) for large NC. In (21), o(u)~/~ is then replaced by O(u). In two dimensions, Malecot (1967) h as derived a formula for isotropic Gaussian migration and x = ( x / > 1. The work of Weiss and Kimura (1965) is discussed in (Nagylaki, 1974b). For arbitrary migration, a slight modification of the unidimensional method gives (Nagylaki, 1974b) fm(v x) = &l(2~1’24 - K)(2”‘“4 + &4 x) , 27rNc - (l/2) In 2u + h(w, 0) ’

(22)

in which K,, is the modified Bessel function of the second kind of order zero and h(w, x) = o(ln u) uniformly in x as u --f 0. The heterozygosity now reads 1 - f,(w, 0) = 2nNc[2?rNC - 4 In 2u + h(w, 0)1-i = --LhrNc/ln

2u + o(l/ln u)

(23)

as u--f 0. With the very weak assumption g(8) = O(@), 7 > 0, as 0 -+ 0 in (13), h(w, x) = O(1) uniformly in x as u + 0, and h may be neglected in (22) for large NC. In this case, o(l/ln u) in (23) becomes O[(ln u)-~], and it is also possible to prove that

fdw,x) N -2K,(2u1/2x)/ln

2u

as u ---f 0 and x --f co with zA2x fixed (Stanley Sawyer, private communication).

76

THOMAS

NAGYLAKI

We observe from (20) and (22) that in both one and two dimensions fm(z), x) -+ 1 as u -+ 0 with x fixed (Weiss and Kimura, 1965; Malecot, 1967). Since the population is infinite, this complete loss of genetic variability is a consequence of geographical structure. In the next section, we shall require the function (Y(w, x) f a(o, 0) - c+, x)

(24)

1 - cos 8 * x d”8 z.zc I Q T(B) - v (27F)n * V

(25)

Note that 1 - cos 0 . x d”B (26) is finite. asG(l,x)

To deduce its asymptotic =ol,+&s+~?,with ‘Gx)

behavior as x -+ co, we rewrite

1 - cos 8 . x d”B (i/2)82 m’

41ec8

1

‘2(l~x) = f Se,,L& i -

(i/2) 82

1(1 -

1 - cos 8 * x ‘3Uy

x)

=

f

(26)

J1,

qfj)

_

0

Qs was defined in (14d). Evidently, simplifies to

1

cos e * x) 6, d”B

(2a)”

*

(27b) (27~)

E3 = O(1) as x -+ 00. Recalling (13, (27b)

which yields, for any E > 0 and suitably small 6 > 0,

I ol,(l,x)l 9.

t-1

(42)

(43)

Observe that substituting (39) into (41) yields (lo), and substituting (40) into (42) produces the secondexpressionfor /?. We now derive from (38) Y(% 4 = B(% x) - (my

+., x)CdO,0) + Y@,O)l,

GEOGRAPHICAL

79

STRUCTURE

solve for y(z, 0), and substitute: Y(% x) = rw, x) -

4z, x)hJ(O,0) + B(z,ON 2N + 4% 0)

= B(&x) - f&, x)[P)(O,0) + PC?O>l.

(4)

The final form of (44) the main tool for calculating the rate of convergence, follows from (8) and (9). In the special case f(0, x) = 0, y may be expressed entirely in terms of fm , as first shown by a different method by Malecot (1975a). From (2), we now have ~(0, x) = f,(w, x). Substituting this into (42) and using successively (lo), (5), (6) again (lo), and (9), we deduce that

13(x,x) = @J- +~&@,

xl - ~$4[wl-1fm(~, 41,

and inserting this into (44) and utilizing (8) we find Y@, x) = (v - +l[&x(%

x) - q&,

x)1.

(45)

We shall treat the cases u = 0 and u > 0 in (44) separately. No Mutation (u = 0) Our aim is to derive the behavior of y(z, x) as x + 1-, and then apply a standard Tauberian theorem (Feller, 1966, p. 423) to calculate that of ~(t, x) as t + co. Using the bar notation as in (24), (44) gives, with the aid of (8) and (9), Y(& x) = Y(% 0) - +, = W

x)

+ “(2, x)144 ,+A 0) - 8(x, x) - ~(0,0)f&, x).

(46)

Since f&l, x) = 1, the last term in (46) tends to -~(0, 0) as z -+ 1-, and (2) implies ~(0, x) = 1 -f(O, x). Then (42), (5), (6), and (24) inform us that B(& x) = 4

- 4-l

- @4x,0) C f(O, Y) - Cf(O, Y Y

Y) qz, x - y).

(47)

We shall suppose f(0, x) = O(X-~-Q), 7 > 0, as x -+ co. Then, recalling (28), we see that the sums are finite in the limit z ---t I -, and hence fl(x, x) - ~(1 - z)-1 and /?(z, x) = O(l), so y(z, x) -

[2N + G( 1, x)] s(z)( 1 - z)-’

(48)

as 2 -+ 1-. From (8), (21), and (23) we know that as x -+ ls(2) - c[2(1 - X)11/2, N -2rc/ln(l - z), 653/10/r-6

?Z=l 12= 2.

(49)

80

THOMAS

NAGYLAKI

Therefore, (43), (48), (49), and, assuming ~(t, x) is ultimately decreasing, the Tauberian theorem (Feller, 1966, p. 423) give

monotone

$1 & 01,

dt, x) - P + mww,

(504

n=l

y(t, 0) - 2Nc(2/(7rt))l/2, N 4xNc/ln t,

n = 2.

(5Ob)

This is the desired rate of convergence. Observe that for large separations (28) provides the explicit spatial dependence. Note also that (50) is independent of the initial conditionf(0, x). Iff(0, x) = 0, the decay rate is easily calculated from (45), the method employed in essence by Malecot (1975a) to derive the unidimensional formula. The result (50) has also been obtained by Sawyer (private communication) in another way. The one-dimensional expression may also be deduced directly from the exact solution of the continuous spacecontinuous time model in (Nagylaki, 1974b). Nonzero Mutation

(u > 0)

In the particular casef(0, x) = 0 (45) shows that with u > 0 (w < 1) ~(1, x) is finite. Therefore, we shall examine y’(z, x) as z + l--, and deduce, on the basis of (43), the behavior of tcp(t,x) as t--t CCLRecalling (9), (24), and (42), we rewrite (46) as

+ %z,@Ic ~(0,~)fm(z,

14, x) = W

- c PO Y%i

x-

Y

-Y)

Y> - @4 -Y)l - do, O)f&,

x)

and differentiate with respect to x: Y’(z, x) = ‘;;‘(z, 4 C do, Y)~,(z,

+ G(z,@IC ~(0,Y).(,(z,

-Y> + PN

-Y> (51)

- c $40: Y)[qz, x - Y) - S’(z, -Y>l - do, h:b, Y

XI.

As z--t 1-, (34) shows that the third term ;cancels the first. Thus, we require only f;(Z, x> = f&

0) - b(z) G> XII’

= [l + (2N)-1 qz, x)]fk(z,

0) - s(z) E’(z, x),

(52)

where we used (9) and (24). Inserting (52) into (51) yields as z + lY’(z> 4 -

41 + WY

41, x)1 f;(z, O),

(53)

GEOGRAPHICAL STRUCTURE

81

with the constant

A = -do, 0)+ cY do,Y)PN + w, -Y)l.

(54)

Then (21) (23), and the Tauberian theorem (Feller, 1966,p. 423) give 9+, x) - u + w-w,

cp(t,0) - (2/77)‘/” ANCt-3/2, - 4rrANct-l(ln t)-2,

$1 cpk0)

VW

n=l n = 2.

Wb)

For (55) to hold, it sufficesto supposethat tp)(t, x) is ultimately monotone decreasing and (Stanley Sawyer, private communication) ~(0, x) = 0(x-2-*), 7 > 0, as x + co. But ~(0, x) = fm(w, x) - f(0, x) from (2), and we shall see below that the contribution of fm(q x) to A is ~(1 - u)-l. Therefore, as for u = 0, we may supposethatf(0, x) = 0(+-n), 7 > 0, asx + GO. The spatial dependenceof (55) is the sameas that of (50), but the rate of convergencewith u > 0 is much faster than with ZJ= 0. Also, in contrast to the simpler casewithout mutation, the asymptotic value of ~(t, x) depends on the initial probability of identity, f(0, x), through the overall constant A. If f(0, X) = 0 it is possible to evaluate A explicitly. Substituting ~(0, x) = f&v, x) into (54), and utilizing successively(26), (5), (6), (lo), and (S), we find A = ~(1 - v)-l m (224-l, the approximation being valid for u < 1. The decay rate for f(0, x) = 0 was obtained in one dimension by Malecot (1975b) and in both one and two dimensionsby Sawyer (private communication) by different methods. The general one-dimensional result agreeswith the solution of the continuous space-continuoustime model in (Nagylaki, 1974b).In the continuous model, the linear spatial dependenceis exact, rather than asymptotic for large separations.

DISCUSSION

In the Malecot model of continuously distributed populations (Malecot, 1967; 1969, pp. 64-69), the number of individuals in a colony, N, is replaced by the population density, and the sumsover the lattice becomeintegrals over all space. Then all the integrals over 6 are extended to all of B-space.But this hasno effect because,aswe showed,all the leading contributions comefrom the neighborhood of 8 = 0. Therefore, all the decay rates are unaltered. The main results of the paper are (50) and (55), which give the rate of decay of the transient part of the probability of identity, in allelic state, F(t, x) = -(l - u)2t g)(t, x). The rate of approach of q(t, x) to zero is much faster for u > 0 than for u = 0. If u = 0, the population has no genetic diversity in

82

THOMAS

NAGYLAKI

equilibrium, but in two dimensions the rate of convergence is so slow that this state would never be reached in practice. With u = 0, the initial probability of identity,f(O, x), does not influence the asymptotic value of ~(t, x), but if u > 0, f(0, x) is required to calculate an overall constant in ~(t, x). The spatial dependence is the same for u = 0 and u > 0, and is specified explicitly for large separations by (28). If the size of the habitat is not much larger than the standard deviation of the migration distance per generation, it is more accurate to treat the habitat as finite. The exponential rate of convergence in that situation contrasts sharply with the considerably slower algebraic and logarithmic ones derived above for the infinite habitat.

ACKNOWLEDGMENT I am very

grateful

to Professor

Stanley

Sawyer

for many

extremely

helpful

discussions.

REFERENCES FELLER, W. 1966. “An Introduction to Probability Theory and Its Applications,” Vol. II. Wiley, New York. FLEMING, W. AND Su, C.-H. 1974. One dimensional migration models in population genetics theory, Theor. Popul. Biol. 5, 431-449. KIMUFZA, M. 1953. ‘Stepping stone’ model of population, Ann. Rept. Nat. Inst. Genet. Japan 3, 62-63. MAL~OT, G. 1950. Quelques schemas probabilistes sur la variabilite des populations naturelles, Arm. Univ. Lyon Sci. Sec. A 13, 37-60. MAL~OT, G. 1955. Decrease of relationship with distance, Cold Spring Harbor Symp. Quant. Biol. 20, 52-53. MAL~COT, G. 1967. Identical loci and relationship, Proc. Fifth Berkeley Symp. Math. Stat. Prob. 4, 317-332. MALBCOT, G. 1969. “The Mathematics of Heredity,” Freeman, San Francisco. MALBCOT, G. 1975a. Heterozygosity and relationship in regularly subdivided populations, Theor. Pop&. Biol. 8, 212-241. MAL~COT, G. 197513. “La Differentiation Geographique,” Masson et Cie, Paris, preprint. MARUYAMA, T. 1971. The rate of decrease of heterozygosity in a population occupying a circular or a linear habitat, Genetics 67, 437-454. MARUYAMA, T. 1972. The rate of decrease of genetic variability in a two-dimensional continuous population of finite size, Genetics 70, 639-651. NAGYLAKI, T. 1974a. Genetic structure of a population occupying a circular habitat, Genetics 78, 777-790. NAGYLAKI, T. 1974b. The decay of genetic variability in geographically structured populations, Proc. Nat. Acad. Sci. USA 71, 2932-2936. WEISS, G. H. AND KIMURA, M. 1965. A mathematical analysis of the stepping stone model of genetic correlations, J. Appl. Prob. 2, 129-149.

The decay of genetic variability in geographically structured populations. II.

THEORETICAL POPULATION BIOLOGY 10, 70-82 The Decay of Genetic Structured THOMAS (1976) Variability in Geographically Populations. II* NAGYLAKI+...
627KB Sizes 0 Downloads 0 Views