THEORETICAL POPULATION BIOLOGY 37, 213-219 (1990)

How Gene Families Evolve*

TOMOKO OHTA

National Institute of Genetics, Mishima 411, Japan

Received June 16, 1989

Theories and facts of gene family evolution are reviewed. Concerted evolution is commonly observed for gene families that originated a long time ago; however, there are many different types of multigene families, from uniform to diverse. The rate of homogenization by unequal crossing-over, gene conversion, etc., has been evolutionarily adjusted for each gene family. When new functions are needed by organisms, gene families may evolve into superfamilies, in which no further concerted evolution takes place, and each member of the family may acquire an indispensable function. The homeobox-containing gene family is a most exciting example of such a superfamily. © 1990 Academic Press, Inc.

* This is Contribution No. 1809 from the National Institute of Genetics, Mishima, Shizuoka-ken, 411 Japan.

0040-5809/90 $3.00. Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.

Higher organisms are complex, and their developmental processes are controlled by the sequential expression of genes. Such genes often belong to multigene families or their superfamilies. Indeed, chromosomes of higher organisms are full of gene families of various kinds, from simple repeating genes to those with complicated organization. Thus, gene duplication must have played an important role in the evolution of higher organisms (for review, see Ohta, 1988c). For some years, I have been studying the evolution of gene families from the standpoint of population genetics, and my results are summarized here.

1. CONCERTED EVOLUTION

Most existing gene families are considered to be in a steady state, established long ago, as exemplified by the genes for ribosomal RNA or histones. However, these gene families are not at a standstill; concerted evolution is commonly observed (Dover, 1982; Arnheim, 1983; Ohta, 1983a), and copy number and gene arrangements change to some extent. These genes have been carrying out similar functions for millions of years, and their evolutionary changes are thought to be mostly selectively neutral or nearly
neutral with respect to natural selection, just like those of other ordinary gene loci (Kimura, 1983).

The selectively neutral model of concerted evolution is based on molecular interaction mechanisms that are responsible for homogenizing genetic information on the chromosome, such as unequal crossing-over and gene conversion. When there is no bias in these interactions, the spreading of genetic information on the chromosome becomes analogous to the random frequency drift of mutant genes in finite populations (Ohta, 1980, 1983a; Nagylaki and Petes, 1982). The population genetic theory is mainly based on identity coefficients, i.e., the probabilities that genes belonging to a gene family are identical by descent. The results of theoretical analyses show that, by increasing the rate of homogenization, more uniform multigene families may be obtained. Indeed, there exist many different types of gene families, from very uniform (ribosomal RNA genes) to quite diverse (immunoglobulin genes), and the homogenization rate is thought to be evolutionarily adjusted (Ohta, 1988c).

The high homogenization rate of uniform gene families has been supported by recent observations by Seperack et al. (1988) and by Matsuo and Yamazaki (1989). These authors examined the diversity of gene copies among individuals of a species. Their results agree with the simple theoretical model of Ohta (1980, 1983a) and Nagylaki (1984a, b) if the rate of homogenization by mechanisms such as unequal crossing-over and gene conversion is high and the interchromosomal crossover rate is very low.

Let $\lambda$ be the rate at which a gene is converted by one of the remaining genes belonging to the family on the same chromosome in one generation. Asymmetric and unbiased gene conversion can be treated by this model (Nagylaki, 1984a, b; Ohta, 1984a). Furthermore, homogenization by unequal crossing-over may be roughly approximated by this model if the copy number is kept stable. When the mean shift of gene number at unequal crossing-over is $m$, and $\gamma$ is the crossover rate per generation, $m\gamma/2$ is equivalent to $\lambda$, since one gene conversion corresponds to a cycle of duplication and deletion of one gene (Ohta, 1983a, b). Let the other parameters be $N$ = the effective population size, $n$ = the number of genes in a family, $v$ = the mutation rate per generation, and $\beta$ = the interchromosomal recombination rate between adjacent genes per generation. The data from Seperack et al. (1988) and Matsuo and Yamazaki (1989) indicate that $N\lambda \gg 1$ and $N\beta \ll 1$. Under such a condition, quite uniform gene members are expected, even when the copy number ($n$) is quite large.

It should be noted that the mutational load becomes much smaller than the Haldane-Muller prediction for uniform gene families (Ohta, 1989). This is because the high homogenization rate with limited interchromosomal recombination, as mentioned above, results in a large variation of mutant frequency among individuals, and hence efficient elimination
of detrimental mutations by natural selection. My simulation study indicates that the mutational load of haploid populations can be as small as 1/20 of $nv$ if $n$ is 80 or more, where $v$ is now the rate of detrimental mutations per gene. Note that the Haldane-Muller value is $nv$ for haploids. It is possible that unequal crossing-over is mainly responsible for the homogenization. Copy number regulation causes another load for maintaining the status quo of gene families (Crow and Kimura, 1970, pp. 294-296). However, the total load is likely to be much less than the mutation rate estimated from the size of the gene family.

Next, let us turn our attention to gene families with variable members, such as those of immunoglobulin or cytochrome P450. The rate of homogenization must be quite low in these genes as compared with the uniform ones. The parameter $\lambda$ may be smaller by two orders of magnitude (Ohta, 1983a, 1984b; Gojobori and Nei, 1984). It is likely that, at the beginning, the incipient gene family accumulated genetic diversity among duplicated gene copies by positive natural selection, similar to the case discussed in the next section. Once gene copies had diverged sufficiently, gene conversion and unequal crossing-over became rarer (Walsh, 1987). Nevertheless, concerted evolution is still observed when present gene families are compared; i.e., gene identity is higher for intra-species comparisons than for inter-species ones (Hood et al., 1975; Ohta, 1980).

Genes belonging to the major histocompatibility complex (MHC) represent a possible example in which natural selection is observable. Hughes and Nei (1988) examined the pattern of nucleotide substitution between polymorphic alleles in the region of the antigen recognition site (ARS) and in other regions of human and mouse class I MHC genes. The results indicate that the rate of amino acid altering substitution is higher than that of synonymous substitution in the ARS. From this finding, these authors argue that overdominant selection is operating at these sites. On the other hand, it has been known for several years that class I or class II genes tend to be related to each other by short stretches of homology, resulting in a patchwork (see Kappes and Strominger, 1988, for review). Thus, it is likely that gene conversion or other illegitimate recombination, involving short DNA segments of 10-200 nucleotides, is occurring at a non-negligible rate among genes, contributing to enhancing polymorphisms.

In my review paper (Ohta, 1983a), I stated that MHC polymorphism is caused by multigenic structure and conversion, and that the primary structure of genes is considered to be evolving mostly by random drift. However, in the following year, I suggested the possibility that both gene conversion and natural selection have contributed to enhancing polymorphism (Ohta, 1984a). Actually, intra-genic recombination that results in new antigen recognition would be selected in the same way as amino acid replacements in the ARS, and may contribute more
to polymorphisms than selectively neutral conversions in regions not responsible for antigen recognition. A population genetic model that incorporates both gene conversion and selection is now under investigation.
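As a rough illustration of the selectively neutral model of concerted evolution discussed in this section, the following Python sketch simulates unbiased intrachromosomal gene conversion in a finite haploid population and reports the within-chromosome gene identity. It is not the analytical model of the cited papers: the function names, the Wright-Fisher resampling scheme, the infinite-alleles mutation scheme, and the absence of interchromosomal recombination are all assumptions made for illustration. The helper for the conversion-rate equivalent of unequal crossing-over simply evaluates $m\gamma/2$ as described in the text.

```python
import random


def equivalent_conversion_rate(m, gamma):
    """Conversion-rate equivalent of unequal crossing-over, m * gamma / 2."""
    return m * gamma / 2.0


def within_chromosome_identity(pop):
    """Fraction of gene pairs on the same chromosome carrying the same allele."""
    same = total = 0
    for chrom in pop:
        for i in range(len(chrom)):
            for j in range(i + 1, len(chrom)):
                total += 1
                same += chrom[i] == chrom[j]
    return same / total


def simulate_identity(N, n, v, lam, generations, seed=0):
    """Hypothetical sketch: N haploid chromosomes, each with n >= 2 gene copies.

    v   : mutation rate per gene per generation (every mutation is a new allele)
    lam : rate at which a gene is converted by another gene on the same chromosome
    Interchromosomal recombination is ignored (assumption).
    """
    if n < 2:
        raise ValueError("a gene family needs at least two members")
    rng = random.Random(seed)
    next_allele = 1
    pop = [[0] * n for _ in range(N)]  # start from a completely uniform family
    for _ in range(generations):
        # Random drift: sample N chromosomes with replacement.
        pop = [list(rng.choice(pop)) for _ in range(N)]
        for chrom in pop:
            for i in range(n):
                if rng.random() < lam:
                    # Unbiased conversion: donor drawn uniformly from the other genes.
                    donor = rng.randrange(n - 1)
                    if donor >= i:
                        donor += 1
                    chrom[i] = chrom[donor]
                if rng.random() < v:
                    chrom[i] = next_allele
                    next_allele += 1
    return within_chromosome_identity(pop)
```

Running `simulate_identity` for the same `N`, `n`, and `v` but different values of `lam` gives a feel for how the homogenization rate affects the uniformity of the family; individual runs are stochastic, so averages over several seeds are more informative than any single value.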

2. INDIVIDUAL GENE MEMBER MAY ACQUIRE INDISPENSABLE FUNCTION

In order to understand the evolution of the complexity of higher organisms, one needs to understand the origins of various gene families. In this section, following my analyses (Ohta, 1987, 1988a, b), let us consider the origins of gene families with functionally diverse members. The simplest model incorporates unequal crossing-over that occurs at a constant rate, $\gamma$, per gene per generation. Gene conversion is ignored, and mutation, selection, and random drift are treated as in the usual population genetics model. Both positive and negative natural selection are incorporated. A specific model of positive selection is used, in which, if genetic diversity (the number of different alleles in a genome) is lower than the population average, the individual is disadvantaged according to the fitness function

$$
w_i =
\begin{cases}
1 & \text{for } k_i \ge \bar{k}, \\
\exp\{-s(\bar{k} - k_i)\} & \text{for } k_i < \bar{k},
\end{cases}
\tag{1}
$$

where the subscript $i$ denotes the $i$th individual; $k_i$ is the number of alleles in the $i$th individual; $\bar{k}$ is the population average; and $s$ is a positive selection coefficient. Note that we are here concerned with the evolution of a new gene with a novel function, and the above selection model takes such a situation into account. Negative selection means simply that a gamete is lethal if all gene copies become nonfunctional, or if the copy number becomes zero by unequal crossing-over.

Starting from a single gene, my simulation results showed how the gene array changes with time. In general, the process of gene accumulation is highly stochastic, and many different gene families may evolve under similar conditions (see Fig. 1). A most crucial quantity is the ratio, $R$, of the rate of spread of beneficial mutations to that of detrimental ones. This is expressed as

$$
R = \frac{u_+ v_+}{u_- v_-},
\tag{2}
$$

where $u_+$ and $u_-$ are the fixation probabilities of a beneficial and a detrimental mutant, respectively, and $v_+$ and $v_-$ are the rates of beneficial and detrimental mutations, respectively. It is likely that beneficial mutation is much rarer than detrimental mutation, and I examined the case where $v_- = 10v_+$ in detail. $R$ was 0.5-1.5 under moderate selection intensity (the product of the selection coefficient and the effective population size is approximately 10) (Ohta, 1988a, b). Also, it is expected that amino acid substitution is accelerated by this type of moderate selection. These expectations fit the observed facts of hemoglobin gene organization (Jeffreys, 1982) and the sequence divergence of gene families (Li, 1985). If this type of selection continues, gene members differentiate from each other, and no further concerted evolution occurs.

FIG. 1. The diagram shows the chromosomal organization of genes at the 50Nth generation of simulation experiments (from Ohta, 1988a). Each line represents one chromosome randomly taken from the population of one run, and results of five runs are shown. Straight lines of chromosomes are blocked to show repeating gene units. Broken lines are those having detrimental mutations. Small vertical marks on solid lines are neutral mutations, and circles, triangles, and rectangles represent beneficial mutations. Parameters are positive selection according to Eq. (1) with $2Ns = 40$; unequal crossing-over with $2N\gamma = 0.25$; detrimental mutation with $2Nv_- = 0.1$; neutral mutation with $2Nv_0 = 0.09$; and beneficial mutation with $2Nv_+ = 0.01$.

A most exciting example of a gene family in which each member has an indispensable function is that of the homeobox-containing genes of Drosophila and mammals. These genes are activated in a hierarchy and regulate embryogenesis (for recent reviews, see Dressler and Gruss, 1988, and Akam, 1989). It is now known that the insect homeotic gene complexes (HOM-C) and the vertebrate Hox clusters are homologous, and that the corresponding genes show the same relative boundaries of expression along the antero-posterior axis of the embryo in the two taxa. Thus, these animals have inherited the same regulatory genetic system from a common ancestor. It is remarkable that similar gene organization is found in such distantly related species. By comparative studies of these gene families, a positive correlation between organismal and genetic complexities may be found, i.e., multiple
sequences homologous to the Antennapedia homeobox have been found in the genomes of echinoderms, annelids, chordates, and arthropods, but not in bacteria, slime moulds, or yeast; and furthermore, homeobox-like sequences may be present in lower copy number in certain brachiopods, nemerteans, and platyhelminthes as compared with insects or mammals (Dressler and Gruss, 1988).

Another topic I would like to mention is the acceleration of the evolution of new genes by gene duplication with subsequent differentiation by sexual recombination. Although this advantage of sexual recombination has been thought to be important by molecular biologists (see a standard textbook of molecular biology, Alberts et al., 1989, pp. 843-844), population biologists appear not to recognize it (e.g., see Michod and Levine, 1988). I attempted to study this problem quantitatively by incorporating recent knowledge of molecular biology. Because the model is highly complicated, it is very difficult to obtain analytical solutions. Results of simulation studies carried out so far show that the time for acquiring a new gene in a diploid population with sexual recombination is roughly [fraction illegible in the source] of that for the haploid population under realistic values of parameters (Ohta, 1988b). This is because, for diploids, when two or more beneficial mutant alleles exist at the same locus in the population, their frequencies may increase by selection, and the two alleles may be combined onto one chromosome by unequal crossing-over at sexual recombination (Spofford, 1969). The advantage of sex here has a slightly different meaning from that of the ordinary model of recombining genes at different loci. In view of the numerous examples of gene families, this role of sex should be seriously considered.

REFERENCES

AKAM, M. 1989. Hox and HOM: Homologous gene clusters in insects and vertebrates, Cell 57, 347-349.

ALBERTS, B., BRAY, D., LEWIS, J., RAFF, M., ROBERTS, K., AND WATSON, J. D. 1989. "Molecular Biology of the Cell," 2nd ed., Garland, New York/London.

ARNHEIM, N. 1983. Concerted evolution of multigene families, in "Evolution of Genes and Proteins" (M. Nei and R. K. Koehn, Eds.), pp. 38-61, Sinauer, Sunderland, MA.

CROW, J. F., AND KIMURA, M. 1970. "An Introduction to Population Genetics Theory," Harper & Row, New York.

DOVER, G. A. 1982. Molecular drive: A cohesive mode of species evolution, Nature 299, 111-117.

DRESSLER, G. R., AND GRUSS, P. 1988. Do multigene families regulate vertebrate development? Trends Genet. 4, 214-219.

GOJOBORI, T., AND NEI, M. 1984. Concerted evolution of the immunoglobulin VH gene family, Mol. Biol. Evol. 1, 195-212.

HOOD, L., CAMPBELL, J. H., AND ELGIN, S. C. R. 1975. The organization, expression, and evolution of antibody genes and other multigene families, Annu. Rev. Genet. 9, 305-353.

HUGHES, A. L., AND NEI, M. 1988. Pattern of nucleotide substitution at major histocompatibility complex loci reveals overdominant selection, Nature 335, 167-170.

JEFFREYS, A. 1982. Evolution of globin genes, in "Genome Evolution" (G. A. Dover and R. B. Flavell, Eds.), pp. 157-176, Academic Press, London/New York.

KAPPES, D., AND STROMINGER, J. L. 1988. Human class II major histocompatibility complex genes and proteins, Annu. Rev. Biochem. 57, 991-1028.

KIMURA, M. 1983. "The Neutral Theory of Molecular Evolution," Cambridge Univ. Press, London/New York.

LI, W.-H. 1985. Accelerated evolution following gene duplication and its implication for the neutralist-selectionist controversy, in "Population Genetics and Molecular Evolution" (T. Ohta and K. Aoki, Eds.), pp. 333-352, Japan Scientific Soc. Press, Tokyo; Springer-Verlag, Berlin/New York.

MATSUO, Y., AND YAMAZAKI, T. 1989. Nucleotide variation and divergence in the histone multigene family in Drosophila melanogaster, Genetics 122, 87-97.

MICHOD, R. E., AND LEVINE, B. R. (Eds.) 1988. "The Evolution of Sex," Sinauer, Sunderland, MA.

NAGYLAKI, T. 1984a. The evolution of multigene families under intrachromosomal gene conversion, Genetics 106, 529-548.

NAGYLAKI, T. 1984b. Evolution of multigene families under interchromosomal gene conversion, Proc. Natl. Acad. Sci. U.S.A. 81, 3796-3800.

NAGYLAKI, T., AND PETES, T. D. 1982. Intrachromosomal gene conversion and the maintenance of sequence homogeneity among repeated genes, Genetics 100, 315-337.

OHTA, T. 1980. "Evolution and Variation of Multigene Families," Lecture Notes in Biomathematics, Vol. 37, Springer-Verlag, Berlin/New York.

OHTA, T. 1983a. On the evolution of multigene families, Theor. Pop. Biol. 23, 216-240.

OHTA, T. 1983b. Time until fixation of a mutant belonging to a multigene family, Genet. Res. 41, 47-55.

OHTA, T. 1984a. Some models of gene conversion for treating the evolution of multigene families, Genetics 106, 517-528.

OHTA, T. 1984b. Population genetics theory of concerted evolution and its application to the immunoglobulin V gene tree, J. Mol. Evol. 20, 274-280.

OHTA, T. 1987. Simulating evolution by gene duplication, Genetics 115, 207-213.

OHTA, T. 1988a. Further simulation studies on evolution by gene duplication, Evolution 42, 375-386.

OHTA, T. 1988b. Time for acquiring a new gene by duplication, Proc. Natl. Acad. Sci. U.S.A. 85, 3509-3512.

OHTA, T. 1988c. Multigene and supergene families, in "Oxford Surveys in Evolutionary Biology, V" (P. H. Harvey and L. Partridge, Eds.), pp. 41-65, Oxford Univ. Press, Oxford.

OHTA, T. 1989. The mutational load of a multigene family with uniform members, Genet. Res. 53, 141-145.

SEPERACK, P., SLATKIN, M., AND ARNHEIM, N. 1988. Linkage disequilibrium in human ribosomal genes: Implications for multigene family evolution, Genetics 119, 943-949.

SPOFFORD, J. B. 1969. Heterosis and the evolution of duplication, Amer. Nat. 103, 407-432.

WALSH, J. B. 1987. Sequence-dependent gene conversion: Can duplicated genes diverge fast enough to escape conversion? Genetics 117, 543-557.
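As a supplementary illustration of Section 2, the diversity-based fitness function of Eq. (1) and the ratio $R$ of Eq. (2) can be written out as a short Python sketch. The function names and the example values are hypothetical and chosen only for illustration; the fixation probabilities are taken here as given inputs rather than derived, since in the text they come from simulation.

```python
import math


def fitness(k_i, k_bar, s):
    """Eq. (1): w_i = 1 if k_i >= k_bar, otherwise exp(-s * (k_bar - k_i)).

    k_i   : number of different alleles carried by individual i
    k_bar : population average of that number
    s     : positive selection coefficient
    """
    if k_i >= k_bar:
        return 1.0
    return math.exp(-s * (k_bar - k_i))


def population_fitnesses(allele_counts, s):
    """Apply Eq. (1) to every individual, using the population mean as k_bar."""
    k_bar = sum(allele_counts) / len(allele_counts)
    return [fitness(k, k_bar, s) for k in allele_counts]


def spread_ratio(u_plus, v_plus, u_minus, v_minus):
    """Eq. (2): R = (u_+ * v_+) / (u_- * v_-).

    u_plus, u_minus : fixation probabilities of beneficial / detrimental mutants
    v_plus, v_minus : rates of beneficial / detrimental mutations
    """
    return (u_plus * v_plus) / (u_minus * v_minus)
```

For example, with $v_- = 10v_+$ as in the case examined in the text, `spread_ratio` returns a value near 1 only when the fixation probability of a beneficial mutant is about ten times that of a detrimental one.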
