Copyright © 1990 by the Genetics Society of America

RFLP Mapping in Soybean: Association Between Marker Loci and Variation in Quantitative Traits

Paul Keim,*,1 Brian W. Diers,* Terry C. Olson* and Randy C. Shoemaker†

*Departments of Agronomy and Genetics, Iowa State University, Ames, Iowa 50011, and †United States Department of Agriculture-Agricultural Research Service

Manuscript received March 26, 1990
Accepted for publication July 21, 1990

ABSTRACT

We have constructed a genetic map for soybean and identified associations between genetic markers and quantitative trait loci. One-hundred-fifty restriction fragment length polymorphisms (RFLPs) were used to identify genetic linkages in an F2 segregating population from an interspecific cross (Glycine max X Glycine soja). Twenty-six genetic linkage groups containing ca. 1200 recombination units are reported. Progeny-testing of F2-derived families allowed quantitative traits to be evaluated in replicated field trials. Genomic regions, which accounted for a portion of the genetic variation (R² = 16 to 24%) in several reproductive and morphological traits, were linked to RFLP markers. Significant associations between RFLP markers and quantitative trait loci were detected for eight of nine traits evaluated. The ability to identify genes within a continuously varying trait has important consequences for plant breeding and for understanding evolutionary processes.

GENETIC linkage of molecular markers to quantitative trait loci (QTLs) has been used to identify individual polygenes and to characterize their genetic action (NIENHUIS et al. 1987; OSBORN, ALEXANDER and WILLIAMS 1987; EDWARDS, STUBER and WENDEL 1987; TANKSLEY and HEWITT 1988; BURR, HELENTJARIS and TANKSLEY 1989). As early as 1923, SAX (1923) used a seed-coat color locus to predict variation in a quantitative trait, seed size. Only with a large number of genetic markers, however, can researchers routinely identify QTLs in populations. Morphological markers are too rare for such studies and may vary with the environment. Restriction fragment length polymorphisms (RFLPs) have the potential to “saturate” the genome, which will increase the probability of QTL detection.

Soybean [Glycine max (L.) Merr.] genetics has developed slowly due to the inherent difficulties in performing crosses, a lack of genetic variation in the germplasm, and a lack of cytogenetic markers. Currently the classical soybean genetic linkage map contains 49 linked markers and covers 530 centimorgans (PALMER and KIANG 1990). In contrast, the maize genetic map contained about 60 linked markers and covered over 700 centimorgans in 1935 (EMERSON, BEADLE and FRAZIER 1935). Early success in maize genetics was largely due to the concentrated effort of a large number of researchers and has been duplicated in only a small number of easily manipulated experimental organisms (e.g., Drosophila). In soybean, and

1 Current address: Department of Biology, Northern Arizona University, Flagstaff, Arizona 86011.

Genetics 126: 735-742 (November, 1990)

many other species, it is doubtful that traditional genetic approaches will generate a detailed genetic map. Molecular markers provide an important genetic tool where traditional genetic studies have been difficult. Identifying a large number of RFLP markers in a single segregating population facilitates establishing genetic linkage among markers and identifying linkage between markers and QTLs.

Previous studies in G. max have encountered low levels of restriction site polymorphism (DOYLE and BEACHY 1985; APUYA et al. 1988; DOYLE 1988), which prevented extensive genetic mapping. A survey of 58 wild and cultivated soybean accessions from the subgenus Soja, however, identified genetically diverse genotypes (KEIM, SHOEMAKER and PALMER 1989). Glycine soja (Sieb. and Zucc.) is a wild relative of the domesticated G. max and is considered the progenitor of the domesticated species (HYMOWITZ and SINGH 1987). G. soja is interfertile with G. max and represents a potential genetic resource for plant breeders. In a crop species with limited genetic variability, such as soybean (DELANNAY, RODGERS and PALMER 1983; SPECHT and WILLIAMS 1984; KEIM, SHOEMAKER and PALMER 1989), wild germplasm becomes valuable as a source of genes not present in domestic germplasm. Unfortunately, G. soja also has many undesirable agronomic traits (e.g., lodging) to be avoided during introgression of desirable genes. In plant breeding, an understanding of undesirable characters is as important as that of desirable traits.

In this study, a segregating population was derived from an interspecific cross between G. max and G.


P. Keim et al.

FIGURE 1.-Soybean RFLP genetic linkage map. One thousand two hundred map units are present on 26 linkage groups. Markers predicting significant variation for quantitative traits are boxed (see Table 2).

soja. This population was selected for its diversity at the molecular level (RFLPs) and at the phenotypic level. Our goals were to construct a molecular genetic linkage map and to identify associations between molecular markers and quantitative trait variation.

MATERIALS AND METHODS

Germplasm: The F2 study population was constructed by W. R. FEHR (Iowa State University) from two parental lines: G. max (A81-356022, an Iowa State University breeding line) and a G. soja accession (PI 468.916). These parents were chosen for their genotypic diversity (KEIM, SHOEMAKER and PALMER 1989), phenotypic differences (CARPENTER and FEHR 1986; CIANZIO and FEHR 1987; GRAEF, FEHR and CIANZIO 1989) and lack of chromosomal translocations (PALMER et al. 1987).

Genetic markers: One-hundred-fifty RFLP differences between the G. max and G. soja parental lines were detected using recombinant DNA clones primarily from a random PstI genomic library (KEIM and SHOEMAKER 1988). Parental DNA digested with EcoRI, EcoRV, DraI, TaqI, HindIII or XbaI was screened by molecular hybridization with each

Soybean RFLP Mapping

FIGURE 2.-The independent segregation of two loci detected with one probe. DNA from F2 individuals, digested with the restriction enzyme HindIII, was probed with pA-199. Allelic combinations of restriction fragments are indicated for each locus (locus A: 4.8 kb and 2.7 kb; locus B: 1.9 kb and 0.3 kb).

probe. The random DNA probes pG-17.3, pG-8.15 and M373 were provided by K. G. LARK (University of Utah), and their isolation has been described (APUYA et al. 1988). The actin gene probe (pSAC-3) was kindly provided by RICHARD MEAGHER (University of Georgia). Probes designated pT, pR and pC were also from the random PstI library (KEIM and SHOEMAKER 1988), though they were developed and kindly provided by K. G. LARK. The hybridization protocol has already been described (APUYA et al. 1988; KEIM, SHOEMAKER and PALMER 1989). Three morphological markers were also evaluated: seed coat color (i), pubescent tip (pb) (PALMER and KILEN 1987), and seed coat luster (scl). An official gene symbol has yet to be assigned for seed coat luster.

Experimental design: The 60 genotypes used in this study were F2 plants derived from the G. max X G. soja cross. DNA from these individuals was extracted according to the procedure of KEIM, OLSON and SHOEMAKER (1988). F3 progeny derived from each of the 60 F2 plants were used to measure quantitative traits. F2-derived lines were grown in plots 1.5 meters long at 33 seeds per meter. Plots were separated by 1.2 meters (within rows) with 1-meter spacing between rows. Entries were arranged in a randomized complete-block design with two replications at each of three unique locations near Ames, Iowa. Because maturity was segregating in this population, planting dates for each location were different: May 1, May 15 and May 29 of 1988.

Quantitative traits: Vegetative and reproductive stages in soybean have been described by FEHR and CAVINESS (1977). Using this study as a guide, we measured the traits on each plot as follows. R1 was the number of days after June 31 when 50% of the individuals in a plot had at least one flower. R8 was the number of days after August 31 when 50% of the individuals in a plot had mature seed-pod color (95% of the pods per plant). Seed-fill length was the number of days between R1 and R8. Six fully expanded leaflets (randomly chosen) per plot were measured at their widest and longest points to determine average leaflet length and width per plot. Stem diameters were measured with calipers midway between the unifoliolate and the first trifoliolate nodes for three mature plants per plot. To determine canopy height, a wooden lath was placed on each plot and used to measure the average height above the ground where the lath was supported. Stem length was defined as the distance from soil surface to the uppermost node with a pod on the main stem. Internode length was defined as the distance between the unifoliolate and the third trifoliolate nodes. Upon maturity, stem and internode lengths were determined on three plants per plot.

Linkage analysis: Genetic relations among markers were determined by maximum likelihood analysis of segregation patterns in the F2 population. The computer program “MapMaker” (LANDER et al. 1987) was used to analyze these data and establish the “best” order of loci. The statistical evidence in favor of the “best” map was summarized as a lod score (log10 of the odds ratio). A minimum lod score of 3.0 (except for markers pA-203 and pT-153b, where lod = 2.8) was used for the pairwise linkage analysis. Lod scores for the combined markers of different linkage groups ranged from 3 to 20.

FIGURE 3.-Duplicate loci on linkage group B. Linked loci in linkage group B do not have linked duplicates in other linkage groups.

Statistical analysis of quantitative data: The statistical program “General Linear Models (PC-SAS)” was used for quantitative data analysis. Initially, quantitative data were analyzed using standard analysis of variance (ANOVA) methods for a randomized-complete-block design experiment. Variance was partitioned according to experimental factors including location, replication within locations, genotypes (means of the F2-derived families), and genotype-by-location interactions. Both locations and replications were considered random effects. All traits had significant genotypic effects and were further analyzed on an individual marker basis. Marker-genotype classes (homozygous G. soja, heterozygous, homozygous G. max) were used to sort the F2-derived lines so that one-way ANOVAs could be performed on the quantitative data. The F statistic was used to determine if the means of different marker-genotype classes were significantly different (P < 0.01). Variation explained by markers was described by using the R² value, which is the proportion of the total variance among the 60 lines' means explained by the marker-genotype classes. The 9 traits were sorted and analyzed for 150 markers. Because 150 one-way ANOVAs were performed for each trait, a great probability for type I errors exists. Therefore, only those associations significant at P < 0.01 are reported here. Even when a significance level of P < 0.01 is used, however, there is still a risk of concluding that a marker is linked to a QTL although it is not. Conversely, when a more stringent significance level is used, important loci may be overlooked. Assuming a genome size of 12 Morgans and 150 markers, we have calculated, using the procedures of LANDER and BOTSTEIN (1989), that for any trait the probability of a false QTL association is near 1.0 at the significance level of P < 0.05. The probability decreases to 0.78, 0.14 and 0.01 for P < 0.01, 0.001 and 0.0001, respectively. This study constitutes only the initial identification of QTLs, and markers significant near the P < 0.01 level will require further confirmation.
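The false-positive probabilities quoted above can be approximated with a short calculation. The sketch below is not the LANDER and BOTSTEIN (1989) procedure itself. It assumes, as a simplification, that the 150 marker tests performed for a trait are statistically independent, so that the chance of at least one spurious association is 1 - (1 - alpha)^n. The function name is hypothetical. Linked markers are in fact correlated, so the result should be read only as a rough guide, although under this assumption the values agree with the reported ones to two decimal places.

```python
def prob_any_false_positive(alpha, n_tests=150):
    """Chance of at least one type I error among n_tests tests.

    Assumption (not from the paper): the tests are independent.
    Linked markers are correlated, so this is an approximation.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Significance thresholds discussed in the text.
    for alpha in (0.05, 0.01, 0.001, 0.0001):
        print(f"P < {alpha}: {prob_any_false_positive(alpha):.2f}")
```

Run as a script, this prints 1.00, 0.78, 0.14 and 0.01 for the four thresholds.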

RESULTS

The frequency of polymorphisms observed in this population was greater than that observed in previous studies of soybean RFLP markers. Over 500 random genomic clones were used as probes to screen restriction-digested parental DNAs (five different restriction enzymes). Some 400 of these proved to be low-copy DNA sequences (KEIM and SHOEMAKER 1988), of which ca. 40% detected polymorphisms between the G. soja accession (PI 468.916) and the G. max line (A81-356022). This is a twofold increase over the frequency observed between two G. max Plant Introductions used previously (APUYA et al. 1988). Seemingly, a large number of RFLP markers identified in this study resulted from DNA rearrangements, since about half of the variable probes detected polymorphisms with two or more restriction enzymes. DNA rearrangements have been found to be the most plausible mechanism causing a polymorphism to be detected by multiple enzymes, and this has been substantiated by restriction mapping of polymorphic regions (APUYA et al. 1988). RFLP markers where the heterozygote and one of the homozygotes cannot be distinguished (similar to dominance in classical markers) account for 10% of the polymorphisms between these two lines. These “dominant” RFLP markers could have been caused by sequence deletions in one parent, and thus, also by DNA rearrangements.

A genetic map was constructed from the 150 segregating markers. One hundred thirty markers (20 markers remain unlinked) were genetically linked to form 26 linkage groups (Figure 1). Redundant linkage groups must exist in our map because the haploid chromosome number of G. max is known to be 20 (HYMOWITZ and SINGH 1987). Because of this, the RFLP linkage group letter designations are tentative and will change with further study. Currently, the classical genetic map consists of 17 linkage groups (PALMER and KIANG 1990), the majority of which are two-point linkages. Many of these two-point linkages may be present on the same chromosome, but have not been connected either because the appropriate populations have not been analyzed, or because large


distances separate them. The RFLP map reported here consists of ca. 1200 centimorgans, in contrast to the ca. 530 centimorgans present in the previous map (PALMER and KILEN 1987). In conjunction with RFLP markers, several segregating morphological markers were evaluated in the F2 population. In two instances, these markers were present in both classical and RFLP genetic maps. The pubescence marker pb occurred in the RFLP linkage group “B,” and in the classical linkage group 14. The seed coat marker i was in RFLP linkage group “A” and the classical linkage group 7. Integration of the remaining genetic linkage groups will require segregation analysis in other populations, the use of aneuploid lines (HELENTJARIS, WEBER and WRIGHT 1986), or the use of near-isogenic lines (MUEHLBAUER et al. 1988).

Some DNA probes detected two RFLP loci segregating independently in the F2 population. Segregation of two loci detected with probe pA-199 is illustrated in Figure 2. Duplicate loci detected with probe pG-17.3 (previously described) were attributed to an ancestral tetraploidy state for soybean (APUYA et al. 1988). Almost all probes used in this study detected multiple fragments, but multiple polymorphic fragments existed in only 23 of the 104 probes (127 linked RFLP markers). An example of nonpolymorphic fragments can also be observed in Figure 2. Duplicate markers are indicated in Figure 1 by a suffix letter designation (e.g., pT-153a). In most instances (with the exception of pA-256a and b, and pK-474e and d), the duplicate RFLP markers occur in independent linkage groups. In maize studies, similar observations have been attributed to the existence of ancient homoeologous chromosomes (HELENTJARIS, WEBER and WRIGHT 1988). In soybean, however, we have found that duplicate loci linked in one group do not occur again in another single linkage group. Figure 3 illustrates an instance in which none of the duplicate loci from linkage group “B” formed another homoeologous group. Six other examples of linked duplicate loci exist in the RFLP genetic map, and none of these seem to define a second homoeologous group (Figure 1).

NIENHUIS et al. (1987) reported a skewed distribution of segregating alleles that favored the wild parent in an interspecific cross of tomato (Lycopersicon hirsutum X Lycopersicon esculentum). This result has been attributed to gametic selection which favored the wild accession over the cultivated line (ZAMIR, TANKSLEY and JONES 1982). In this study, only 20 out of 150 markers deviated significantly (chi-square test, P < 0.05, data not shown) from the expected ratios, and these deviations were in both the G. max and the G. soja directions. Thus, the genetic distortions noted in other studies (NIENHUIS et al. 1987) do not exist in this soybean F2 population (Figure 4).

Significant genotypic variation was present in all traits measured. Twenty-five to 83% of the variation (R²) in our study was explained by genotypic effects (Table 1). Location and genotype-by-location interactions were significant for most traits but accounted for a lesser amount of the variation than did the genotypic effect. Segregation within F2-derived lines could hinder the accurate evaluation of a trait. The lack of significant replication effects (except for leaf length and width, where less than the full plot was used for evaluation, see MATERIALS AND METHODS) suggests that each plot contained enough individuals to avoid errors caused by within-family segregation.

Genomic regions important in determining quantitative variation were identified in 8 of 9 traits. Markers in these regions were significantly (P < 0.01, Table 2) associated with a portion of the phenotypic variation. No significant associations were detected for internode length. Table 2 lists the 20 different RFLP marker associations with traits that were significant. The variation explained by each marker was greater than 16% (R²). In several instances the markers asso-

FIGURE 4.-Genotypic composition of F2 individuals. The genotype composition of an individual was determined by averaging all the markers for each individual. (Horizontal axis: Percent G. max. Mean = 51.4, Std. Dev. = 5.3.)

TABLE 1
Trait variation due to experimental factors

R² (%)
Traits    Location    Genotype    G*L
7* R1 56** 30**
4** 8** R8 83** 53**
Seed-fill (R1 to R8) 12* 20**
54** 2** Stem diameter 21*
NS 25** 3** Stem length
64** NS Canopy height 2**
5* Leaf width 79** 7**
Leaf length 58** 10** 22**

NS = not significant, (*) P < 0.05, (**) P < 0.01. Replications within location were significant only for leaf width and leaf length.
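The segregation check described in the text (a chi-square test of each marker against the expected F2 ratios) might be sketched as follows for a codominant marker, where the expected genotype ratio is 1:2:1. The function name and the example counts are hypothetical, since the paper does not show these data. With two degrees of freedom the chi-square tail probability reduces to exp(-x/2), so no statistics library is needed for this case.

```python
import math


def f2_segregation_test(n_soja, n_het, n_max):
    """Chi-square goodness of fit to the 1:2:1 ratio of a codominant F2 marker.

    Returns (chi_square, p_value). Counts are the numbers of homozygous
    G. soja, heterozygous and homozygous G. max individuals.
    """
    observed = (n_soja, n_het, n_max)
    total = sum(observed)
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three classes give 2 degrees of freedom, for which the
    # chi-square survival function is exactly exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for 60 F2 individuals (not data from the paper).
    for counts in ((15, 30, 15), (25, 25, 10)):
        chi2, p = f2_segregation_test(*counts)
        print(counts, f"chi2 = {chi2:.2f}", f"P = {p:.4f}")
```

A dominant marker, where the heterozygote cannot be distinguished from one homozygote, would instead be tested against a 3:1 ratio with one degree of freedom.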

TABLE 2
Markers that detect significant phenotypic variation

Trait    R²    Marker    P
Leaf width    0.24    pA-111    pK-390    pK-411
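The R² values in Table 2 come from one-way ANOVAs of line means grouped by marker-genotype class, as described in MATERIALS AND METHODS. A minimal sketch of that calculation is given below, assuming hypothetical trait values and a hypothetical function name; the authors used PC-SAS, not this code. R² is taken as the between-class sum of squares divided by the total sum of squares.

```python
def marker_anova(values_by_class):
    """One-way ANOVA of trait means grouped by marker-genotype class.

    values_by_class maps a class label (e.g. "soja", "het", "max") to a
    list of line means. Returns (F, R2), where R2 is the proportion of
    the total sum of squares explained by the classes.
    """
    groups = [g for g in values_by_class.values() if g]
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n - len(groups)
    # Assumes at least two classes and some within-class variation.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    r2 = ss_between / ss_total
    return f_stat, r2


if __name__ == "__main__":
    # Hypothetical line means for one trait (not data from the paper).
    example = {"soja": [1, 2, 3], "het": [2, 3, 4], "max": [3, 4, 5]}
    f_stat, r2 = marker_anova(example)
    print(f"F = {f_stat:.2f}, R2 = {r2:.2f}")
```

The F statistic would then be compared with the F distribution on the stated degrees of freedom to obtain the P value used for the P < 0.01 threshold.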
