SHORT COMMUNICATION

Identification of Polymorphic Simple Sequence Repeats in the Genome of the Zebrafish

DEBORAH J. GOFF,* KATHERINE GALVIN,† HILLARY KATZ,‡ MONTE WESTERFIELD,§ ERIC S. LANDER,‡,‖ AND CLIFFORD J. TABIN*

*Department of Genetics, Shattuck Street, and †Department of Microbiology and Molecular Genetics, Longwood Avenue, Harvard Medical School, Boston, Massachusetts 02115; ‡Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02142; §Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403; and ‖Center for Genome Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

The zebrafish has drawn a great deal of attention as a developmental system because it offers the ability to combine excellent embryology and genetics. Here, we report that simple sequence repeats are abundant in the zebrafish genome and are highly polymorphic between two outbred lines, making them useful markers for the construction of a genetic map of this organism. © 1992 Academic Press, Inc.

GENOMICS 14, 200-202 (1992). 0888-7543/92 $5.00. Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.

The zebrafish, Brachydanio rerio, is a vertebrate that provides the possibility of combining classical genetic analysis with unparalleled embryology (5, 6). A major obstacle to the use of the zebrafish for genetic studies of development is the current lack of a genetic map. Simple sequence repeats (SSRs) (10) are an attractive source of markers for the construction of such a map; they have proved to be both abundant and highly polymorphic in mammalian genomes (2, 8). Because SSRs are typed in a polymerase chain reaction assay (7), it is possible to assay a large number of markers from small amounts of DNA and to automate the typing procedure. These advantages have been exploited in constructing SSR-based genetic maps of the rat, mouse, and human genomes (1, 4, 11). Here, we report that zebrafish also contain abundant simple sequence repeats that are highly polymorphic.

To have a high probability of identifying SSR polymorphisms, we analyzed DNA from two outbred lines of zebrafish that we believed to be distantly related: the AB line, which has been used in many laboratories for embryological studies, and the Darjeeling (DAR) line, which was recently isolated from the wild in India. To further maximize the likelihood of detecting polymorphisms, DNA from approximately five individuals of each outbred line was pooled prior to our initial analysis.

To ascertain whether simple sequence repeats of the dinucleotide [CA] exist in the 10^9-bp zebrafish genome (3) and to assess their frequency, we used a mixed [CA]15/[GT]15 probe to screen approximately 100,000 M13 clones containing 200- to 400-bp genomic AB DNA fragment inserts. We found that 2.9% of plaques gave a positive signal; this rate is approximately equal to or slightly greater than that seen in the mouse (1). If [CA]n repeats were present randomly, this frequency of positive plaques would indicate an SSR approximately every 12 kb. Thirty-one randomly selected clones were sequenced, all of which were verified to contain [CA]n repeats. These results demonstrate that SSRs exist in the zebrafish genome at a frequency that has proved to be sufficient for genetic mapping in other organisms.

Polymerase chain reaction (PCR) assays were used to determine whether the identified zebrafish SSRs are polymorphic between lines of zebrafish. Primers flanking each repeat were generated for 25 of the sequences by a computer program (PRIMER) [(1), M. J. Daly, S. E. Lincoln, and E. S. Lander, unpublished]. The program uses strict criteria to identify oligonucleotides suitable for amplification under the same set of conditions. Reactions with 17 of the 25 primer pairs yielded distinct products on 6% sequencing gels from both lines. Of the 8 primer pairs that did not yield distinct products, 1 gave a clear pattern in the AB line but no products in the DAR line, which may reflect sequence divergence of the DAR line, and 7 gave no distinct bands in either line. No attempt was made to optimize these PCR reactions; given the high-throughput capability of these protocols, it is faster in the long run to screen additional primers for markers than to pursue clones that do not initially give PCR products. In every PCR reaction with AB DNA, the migration of one of the resultant bands was consistent with the size predicted from the sequence obtained for the M13 clone.

Of the 17 primer pairs that gave a clear pattern in both pooled samples, 16 showed polymorphisms between the two zebrafish lines: 2 amplified a total of five distinct bands between the two lines, 6 amplified a total of four distinct bands, 7 amplified a total of three distinct bands, and 1 amplified two distinct bands (Table 1).
For example, primers for SSR marker 12 were chosen to flank a [CA]n repeat to amplify a piece of AB DNA 157 nucleotides long. A PCR reaction using AB DNA as template yielded a single band migrating at that estimated size (Fig. 1A, lane a). A similar reaction using DAR DNA as template yielded two different bands (Fig. 1A, lane b). PCR products for SSR markers 16 and 22 are also shown in Fig. 1A, lanes c-f.
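The spacing estimate from the plaque screen (2.9% positive plaques among clones with 200- to 400-bp inserts, read as one SSR roughly every 12 kb) can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that each positive clone carries exactly one repeat, and the mean insert size of 350 bp is an assumption chosen here, since the text gives only the 200- to 400-bp range. The function name is hypothetical.

```python
def ssr_spacing_kb(positive_fraction, mean_insert_bp):
    """Average distance between SSRs in kb.

    Assumes randomly placed repeats and one repeat per positive clone,
    so the genome screened per positive clone is insert size / fraction.
    """
    return mean_insert_bp / positive_fraction / 1000.0


POSITIVE_FRACTION = 0.029  # 2.9% of plaques gave a positive signal

# 200 and 400 bp are the stated insert bounds; 350 bp is an assumed mean.
for insert_bp in (200, 350, 400):
    spacing = ssr_spacing_kb(POSITIVE_FRACTION, insert_bp)
    print(f"{insert_bp}-bp inserts: one SSR per {spacing:.1f} kb")
```

Under these assumptions the stated insert range brackets the estimate at roughly 7 to 14 kb, and an assumed mean insert of about 350 bp gives a value close to the 12 kb quoted above.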


TABLE 1
Simple Sequence Repeats That Are Polymorphic between AB and DAR Lines of Zebrafish

Marker  Sequence of repeat    Forward primer (5'-3')    Reverse primer (5'-3')    Predicted PCR   No. AB  No. DAR  No. shared  Total no.
                                                                                  product length  bands   bands    bands       bands
1       [CA]25                AAGAGAGCAGTTCAGGATTTGC    CTGTAACGCTGTACCTCAGGC     121             3       2        1           4
2       [CA]15N10[CA]19       GACAGCTCCATTAACTAAACCTTT  GTCAGCAAGCTTTCTGTCACA     187             2       2        0           4
9       [CA]27                ATGCTGCCATTATGGTAGTGC     TTCACTTGGATAAAGTACTGGGTG  124             1       2        0           3
12      [CA]21                TCACTCACTGGGAATGTAGGG     ACAAGCAGAGACATAATCAACTGC  157             1       2        0           3
13      [GT]2,N14[GT]15       CTGGCCTTTGGTTTATCATACATT  CATGGATTACTTTCACTAAGCATG  208             2       1        1           2
14      [GT]9                 GCTCTTCAACATGACTCATTGTG   TTACTGGCATTGGTTAAACGG     166             1       2        0           3
15      [CA]16                ACATGTTCAGAGACGCCAGA      ATTCTGAAACCTGAGGCCCT      116             1       2        1           3
16      [CA]22                AGTAGAAAAGGCTTGCAGCG      GAAACAAGTGGAGGAAATGAGG    136             2       2        0           4
17      [GT]~N5[GT]6N4[GT]5   ATCTGGAGCTGATTGGACTAGC    TTATTGTCTTCACAGCATGATGG   154             2       4        2           4
18      [CA]~3                GACTCTAGAGGATCATTCTGTGGT  CTCCAAACGGATTGAGCTCTCTCT  174             2       2        1           3
20      [GT]23                TGCTGACCGCTGAAACTG        TGAACGGCCACATAAATAAGC     133             2       3        0           5
22      [GT]25                ACCATCGAGGAGCAGAGCT       GCAATCCCAATATACGTGCC      126             3       2        0           5
25      [GT]35CT[GT]5         TGAACCTATAACTGCTGTAATGGC  TTTGAGTATGAGTGACATGTGTGC  149             2       2        0           4
26      [GT]52                CTGTGTGAGATTGTGTGTTTGTC   ACAACACGTTGCATTTGTCG      241             2       1        0           3
27      [GT]34                ACCTTTTTTCCCTGTCTGTCC     CTCCTTTGCACATTTCTACGC     150             2       2        1           3
29      [GT]23                TTGCTCTGTTAAAAATCATTTTGG  CGGGGCTTTAGCTGTTGTAT      186             2       2        0           4

Note. Clones containing simple sequence repeats were isolated from an M13 library constructed, as described by Dietrich et al. (1), from 200- to 400-bp fragments of AB genomic DNA. Single-stranded DNA was isolated from 48 positive clones, and the inserts were sequenced by the dideoxy method using ABI's Taq cycle sequencing protocol and fluorescently labeled primers. The sequences were detected with an ABI 373A sequencing machine. Thirty-one DNA sequences were analyzed by a computer program (PRIMER) [(1), M. J. Daly, S. E. Lincoln, and E. S. L., unpublished] that designs PCR primer pairs to amplify SSRs. Twenty-five primer pairs were designed from the 31 analyzed sequences. N, nucleotides known but not given. PCR assay conditions were essentially as described by Dietrich et al. (1). Sixteen of the primer pairs tested showed polymorphism between the AB and DAR lines.
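The breakdown of polymorphic markers by band count given in the text (2 with five bands, 6 with four, 7 with three, and 1 with two) can be checked against the "Total no. bands" column of Table 1. The following is a minimal sketch of that tally; the dictionary simply transcribes the table column, and the variable and function names are hypothetical.

```python
from collections import Counter

# Marker number -> total number of distinct bands (last column of Table 1).
TOTAL_BANDS = {
    1: 4, 2: 4, 9: 3, 12: 3, 13: 2, 14: 3, 15: 3, 16: 4,
    17: 4, 18: 3, 20: 5, 22: 5, 25: 4, 26: 3, 27: 3, 29: 4,
}


def tally_by_band_count(total_bands):
    """Count how many markers show each total number of bands."""
    return Counter(total_bands.values())


tally = tally_by_band_count(TOTAL_BANDS)
for n_bands in sorted(tally, reverse=True):
    print(f"{tally[n_bands]} marker(s) with {n_bands} distinct bands")

# 16 polymorphic markers out of the 17 primer pairs that amplified in both lines.
polymorphism_rate = len(TOTAL_BANDS) / 17
print(f"Polymorphism rate: {polymorphism_rate:.0%}")
```

The tally agrees with the counts reported in the text, and the 16 markers listed correspond to the 16/17 polymorphism rate.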

In the PCR reactions with genomic DNA pooled from several outbred individuals, some primer pairs yielded more than two bands (for example, in Fig. 1A, lane e, and Table 1). Every informative primer pair was therefore reanalyzed using template DNA prepared from an individual fish. Each reaction yielded a maximum of two bands. To prove that the multiple PCR products generated by each primer pair are allelic, we examined their genetic segregation. Such an analysis is facilitated in zebrafish by the ability to generate haploid embryos. A single AB female who appeared to be heterozygous (i.e., gave two PCR reaction products) at four loci was chosen. Five of her haploid offspring were examined at each of these loci, and, in every case, each was represented by a single PCR product with the size of one of the two alleles present in the parent (Fig. 1B). Thus the two bands detected in the PCR reactions represent true genetic polymorphism and not distinct, unlinked sequences.

FIG. 1. (A) Representative PCR assay results to test for polymorphism of SSR markers 12, 16, and 22 between AB (lanes a, c, and e) and DAR (lanes b, d, and f) lines. (B) PCR assay results to test genetic segregation of multiple band SSRs in haploids. PCR reactions using four sets of primer pairs (1, 2, 13, and 18) were run on DNA isolated from an individual AB female (lanes a, g, m, and s) and from five of her haploid offspring (lanes b-f, h-l, n-r, and t-x). PCR reactions were conducted as described by Dietrich et al. (1).

Our results indicate that simple sequence repeats exist and are abundant in the zebrafish genome. The heterozygosity of these sequences is low within each outbred strain because each population is descended from a small number of founders constituting an evolutionary bottleneck. However, the markers show a high rate of polymorphism between the two lines of fish (16/17, or 94%). These results demonstrate the feasibility of using these polymorphisms as genetic markers in the generation of a map of the zebrafish genome, which currently has no identified linkage groups and no cytologically distinguished chromosomes. Of the clones analyzed in this study, 51% (16/31) led to an informative marker. (This number is lower than the polymorphism rate because useful PCR primers could not be readily chosen for all SSRs.) Given this rate of success and based on the estimate of 100 cM for each of the 25 telocentric chromosomes, it would be necessary to identify approximately 5000 SSRs to generate a map with markers having an average distribution of 1 cM or 1000 SSRs to generate a map with markers having an average distribution of 5 cM, assuming that SSRs are randomly distributed.

The efficiency of the mapping would be increased by the use of clonal lines of fish, homozygous at every locus. Procedures for rapidly producing clonal fish have been described (9). The SSR markers described here can be used to verify the clonal nature of such lines. Because the PCR methodology requires a minimal amount of template DNA per assay and the brood size of zebrafish is large, the entire map could be constructed from the analysis of 100 progeny of a single F1 cross between two lines. The creation of an SSR-based genetic map should provide a powerful tool for developmental genetic studies of the zebrafish.
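The estimate of how many SSRs must be identified follows from the figures given above: 25 chromosomes of about 100 cM each, a target marker spacing, and the observed yield of 16 informative markers from 31 sequenced clones. The sketch below works through that arithmetic under the same assumption of randomly distributed SSRs; the function and parameter names are hypothetical.

```python
import math
from fractions import Fraction


def ssrs_to_identify(n_chromosomes, cm_per_chromosome, spacing_cm, success_rate):
    """Number of SSR clones to isolate for a map at the given average spacing.

    success_rate is the fraction of sequenced SSR clones that yield an
    informative marker (16/31 in this study).
    """
    map_length_cm = n_chromosomes * cm_per_chromosome   # total genetic length
    markers_needed = Fraction(map_length_cm, spacing_cm)
    return math.ceil(markers_needed / success_rate)


SUCCESS_RATE = Fraction(16, 31)  # informative markers per clone analyzed

for spacing_cm in (1, 5):
    n = ssrs_to_identify(25, 100, spacing_cm, SUCCESS_RATE)
    print(f"{spacing_cm}-cM map: about {n} SSRs")
```

The two results, a little under 5000 and a little under 1000, round to the approximate figures quoted in the text for 1-cM and 5-cM maps.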

ACKNOWLEDGMENTS

We gratefully acknowledge John Rush for expeditiously synthesizing the oligonucleotides used in this study. We thank Bill Dietrich for advice and protocols; Catherine Nocente-McGrath and Walter Gilbert for help with haploid analysis; and Randy Johnson and Craig Nelson for critical reading of this manuscript. D.J.G. received support from a National Eye Institute Training Grant (T32 EY07110); H.K. and E.S.L. were supported by a grant from the National Center for Human Genome Research (P50HG00098).

REFERENCES

1. Dietrich, W., Katz, H., Lincoln, S. E., Shin, H., Friedman, J., Dracopoli, N., and Lander, E. S. (1992). A genetic map of the mouse suitable for typing intraspecific crosses. Genetics, in press.
2. Hamada, H., Petrino, M. G., and Kakunaga, T. (1982). A novel repeat element with Z-DNA-forming potential is widely found in evolutionarily diverse eukaryotic genomes. Proc. Natl. Acad. Sci. USA 79: 6465-6469.
3. Hinegardner, R., and Rosen, D. E. (1972). Cellular DNA content and the evolution of teleostean fishes. Am. Nat. 106: 621.
4. Jacob, H. J., Lindpaintner, K., Lincoln, S. E., Kusumi, K., Bunker, R. K., Mao, Y., Ganten, D., Dzau, V. J., and Lander, E. S. (1991). Genetic mapping of a gene causing hypertension in the stroke-prone spontaneously hypertensive rat. Cell 67: 213-224.
5. Kimmel, C. B. (1989). Genetics and early development of zebrafish. Trends Genet. 5: 283-288.
6. Rossant, J., and Hopkins, N. (1992). Of fin and fur: Mutational analysis of vertebrate embryonic development. Genes Dev. 6: 1-13.
7. Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., Mullis, K. B., and Erlich, H. A. (1988). Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487-491.
8. Stallings, R. L., Ford, A. F., Nelson, D., Torney, D. C., Hildebrand, C. E., and Moyzis, R. K. (1991). Evolution and distribution of [GT]n repetitive sequences in mammalian genomes. Genomics 10: 807-815.
9. Streisinger, G., Walker, C., Dower, N., Knauber, D., and Singer, F. (1981). Production of clones of homozygous diploid zebra fish (Brachydanio rerio). Nature 291: 293-296.
10. Weber, J. L., and May, P. E. (1989). Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44: 388-396.
11. Weber, J. L. (1990). Informativeness of human (dC-dA)n·(dG-dT)n polymorphisms. Genomics 7: 524-530.
