GENOMICS 14, 1041-1054 (1992)

Closure of a Genetic Linkage Map of Human Chromosome 7q with Centromere and Telomere Polymorphisms

CYNTHIA HELMS, SANTOSH K. MISHRA, HAROLD RIETHMAN,* ANDREA K. BURGESS, SRINI RAMACHANDRA, CHRISTOPHER TIERNEY, DENISE DORSEY, AND HELEN DONIS-KELLER¹

Division of Human Molecular Genetics, Department of Surgery, Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri 63110; and *The Wistar Institute, Philadelphia, Pennsylvania 19104

Received April 13, 1992; revised August 24, 1992

We have constructed a 2.4-cM resolution genetic linkage map for chromosome 7q that is bounded by centromere and telomere polymorphisms and contains 66 loci (88 polymorphic systems), 38 of which are uniquely placed with odds for order of at least 1000:1. Ten genes are included in the map and 11 markers have heterozygosities of at least 70%. This map is the first to incorporate several highly informative markers derived from a telomere YAC clone HTY146 (locus D7S427), including HTY146c3 (HET 92%). The telomere locus markers span at least 200 kb of the 7q terminus and no crossovers within the physical confines of the locus were observed in approximately 240 jointly informative meioses. The sex-equal map length is 158 cM and the largest genetic interval between uniquely localized markers in this map is 11 cM. The female and male map lengths are 181 and 133 cM, respectively. The map is based on the CEPH reference pedigrees and includes over 4000 new genotypes, our previously reported data plus 29 allele systems from the published CEPH version 5 database, and was constructed using the program package CRI-MAP. This genetic linkage map can be considered a baseline map for 7q, and will be useful for defining the extent of chromosome deletions previously reported for breast and prostate cancers, for developing additional genetic maps such as index marker and 1-cM maps, and ultimately for developing a fully integrated genetic and physical map for this chromosome. © 1992 Academic Press, Inc.

¹ To whom correspondence should be addressed at Division of Human Molecular Genetics, Department of Surgery, Washington University School of Medicine, 660 South Euclid, St. Louis, MO 63110.

INTRODUCTION

One of the first goals of the Human Genome Initiative is to complete a 2- to 5-cM genetic linkage map. The resolution of recent chromosome linkage maps approaches and often surpasses the 5-cM mark, but unless there has been a directed effort because of a nearby disease locus, for example, Huntington's disease (Bates et al., 1990; reviewed in Pritchard et al., 1991), few groups have targeted telomeric regions for marker development. Without closure of genetic maps, i.e., genetic loci included at the physical ends of chromosomes, the completeness of the maps for genetic disease gene searches will be limited. The published CEPH consortium maps of chromosomes 1 and 10 (Dracopoli et al., 1991; White et al., 1990) are two examples in point; while the maps have a mean locus spacing of 4 and 7 cM, respectively, the genetically defined physical ends of the chromosomes have yet to be incorporated. Microsatellites and VNTR loci localized to the most distal cytogenetic bands cannot be assumed to provide complete genetic coverage because there are cases where markers localized to the most distal band cover a genetic distance in excess of 50 cM (Nakamura et al., 1989; Petersen et al., 1991); therefore, genes located in regions of high recombination near the telomeres of such chromosomes may be missed in a genome search.

The availability of relatively large telomere segments (50-400 kb) cloned into YAC vectors (Riethman et al., 1989) provides source DNA for development of genetic markers and provides an opportunity to compare physical and genetic distance over defined intervals. As shown in Riethman et al. (in preparation) the YAC clone HTY146 (locus D7S427) defines the physical end of 7q. We report here genetic closure of the 7q arm via mapping of multiple polymorphic sequences from D7S427. We also report the construction of a new linkage map with an average locus spacing of 2.4 cM that includes tie points to 21 cytogenetically localized markers. This report, together with additional new polymorphic loci and our 7p map (Mishra et al., 1992), describes the first map to include data from two previously published chromosome 7 maps (each of which was based on essentially unique sets of markers; Barker et al., 1987; Lathrop et al., 1989).
Genetic maps such as the one reported here should prove useful for genome screens for heritable traits, as an aid for cloning genes from positional information, for more precise definition of chromosome deletions by assays of loss of heterozygosity of polymorphic alleles (LOH), and for developing clone-based physical maps of chromosomes. Examples of important genes from chromosome 7q that await cloning


0888-7543/92 $5.00 Copyright © 1992 by Academic Press, Inc. All rights of reproduction in any form reserved.


and characterization by these approaches include putative tumor suppressor gene(s) for prostate and breast cancer (Bieche et al., 1992; Carter et al., 1990); a possible gene syntenic to the mouse ob locus implicated in obesity (Friedman et al., 1988); gene(s) associated with a 7q36 translocation breakpoint and holoprosencephaly type 3 (Hatziioannou et al., 1991), myelodysplastic syndromes, and secondary leukemias (Kere et al., 1989); and a locus between 7q21.2 and q21.3 involved in split hand/foot deformities (SHFD1) (e.g., Roberts et al., 1991).

MATERIALS AND METHODS

Genetic markers. All loci included in the 7q map, including 10 gene loci for which abbreviations are given, are listed in Table 1. The full gene names are as follows: 2,3-bisphosphoglycerate mutase (BPGM), collagen 1A2 (COL1A2), carboxypeptidase A1 (CPA1), epidermal growth factor receptor (EGFR), engrailed homolog 2 (EN2), endogenous retroviral sequence 3 (ERV3), met proto-oncogene (MET), P glycoprotein 3/multiple drug resistance 3 (PGY3), plasminogen activator inhibitor type I (PLANH1), and the T-cell receptor β cluster (TCRB). The genotype dataset includes additional genotypings of loci reported in our previous map (Barker et al., 1987): CRI-L917 (D7S15), CRI-L281 (D7S54), CRI-L887 (D7S59), CRI-L966 (D7S61), CRI-R53 (D7S68), CRI-R967 (D7S70), CRI-S2 (D7S71), CRI-S3 (D7S72), CRI-S14 (D7S73), CRI-S23 (D7S78), CRI-S65 (D7S84), CRI-S140 (D7S93), CRI-S148 (D7S95), CRI-S155 (D7S96), CRI-S161 (D7S98), CRI-S162 (D7S99), CRI-S167 (D7S101), CRI-S194 (D7S104), CRI-S201 (D7S107), and CRI-S241 (D7S111). New loci genotyped for this study are the HTY146 cosmid subclones c3, c6, c30, and c48 (D7S427, HR, in preparation); Mfd50 (D7S440, Weber et al., 1990); and pPAI-E6.9 (PLANH1, Klinger et al., 1987). New genotypes will be submitted to the CEPH upon publication and will be made available to any interested researcher. PLANH1 clones were obtained from the ATCC; clones having the prefix CRI- were gifts of Collaborative Research, Inc. (Waltham, MA), and can be purchased from the ATCC (Rockville, MD) or directly from CRI. Telomere clones reported here that reveal polymorphism are available from the ATCC. The CEPH database (version 5) data for markers published by Lathrop et al. (1989) and included in this work are COL1A2, D7S8, D7S13, D7S18, D7S125, D7S126, D7S129, D7S368, D7S371, D7S372, D7S392, D7S395, D7S396, ERV3, MET, and some of the pJ2 data for TCRB. Data from the centromere locus (D7Z2, probe pMGB7) for six CEPH pedigrees were provided by Dr. Huntington Willard. The TCRB haplotype data have been published by Charmley et al. (1990). Other markers from the CEPH database not known to be incorporated previously into a published genetic map are BPGM (Dracopoli et al., 1990), pKKA12 (D7S398; Nakamura et al., 1988), IEF24.11 (D7S448; Dean et al., 1991), p184a (CPA1; Stewart et al., 1990), and pLg3 (D7S22; Wong et al., 1986). The data in the CEPH published database for EGFR used in map construction were collected by N. Dracopoli (personal communication), the EN2 data were contributed to the CEPH by J. Murray, and the MDR2 (PGY3) data were collected by L. Cavalli-Sforza.

CEPH panel DNA. DNA from members of the primary 40-family CEPH reference pedigree panel (Dausset et al., 1990) used in the genetic mapping studies was either obtained directly from the CEPH or prepared from lymphoblastoid cell lines as previously described (Mishra et al., 1992).

Genotyping, error checking, and genetic map construction. HTY146 cosmid subclones (D7S427) and the probe pPAI-E6.9 (PLANH1) were screened for their ability to detect RFLPs as described previously (Schumm et al., 1988). ³²P-labeled probes were hybridized to a panel of six parent CEPH reference panel DNAs digested with eight different restriction enzymes: BamHI, BglII, EcoRI, HincII, HindIII, MspI, PstI, and TaqI. Polymorphisms detected by the probes were analyzed further by hybridization to a set of Southern blots containing the appropriate restriction enzyme digests of DNAs from the 40 pairs of CEPH reference parents, followed by a probing of the blots of informative families (i.e., at least one parent being heterozygous for the RFLP). Southern blotting and hybridization conditions for detection of RFLPs were described previously (Donis-Keller et al., 1987; Barker et al., 1987). Mfd50 (D7S440), a dinucleotide repeat polymorphism, is detected by PCR amplification essentially as described by Weber et al. (1990) except that we used one 5'-end-labeled primer in the PCR reactions, denatured the template initially for 10 min, annealed the primers at 58°C, and did not use a final extension period of 10 min. Genotypes were confirmed by two scientists reading the autoradiographs and then entered into a Macintosh computer using a HyperCard application (Six Ponds Software) that converts the data into the genotype file format read by the CRI-MAP linkage mapping programs. Allele frequencies, polymorphism information content (PIC, see Botstein et al., 1980), and heterozygosity calculations were performed using the computer program PIC/HET (Weaver et al., 1992). The genetic maps were constructed using the CRI-MAP linkage programs, version 2.4 (Green et al., 1989; Lander and Green, 1987), run on a Sun SparcStation (Sun Microsystems). Markers with fewer than 20 informative meioses in the database were not included in the map construction. After each CRI-MAP "build," the resulting map order was used with the CRI-MAP option "chrompics." The chrompics output was used as an aid in error-checking, since it depicts the inheritance of each informative marker (i.e., the grandparental origin of the marker) for each chromosome in the dataset and, for each genetic interval, lists the crossovers detected.
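As a rough illustration of the heterozygosity and PIC calculations mentioned above, the following Python sketch applies the standard expected-heterozygosity formula (1 minus the sum of squared allele frequencies) and the PIC definition of Botstein et al. (1980) to a list of allele frequencies. It is not the PIC/HET program of Weaver et al. (1992), whose internals are not described here; the function names are hypothetical, and the example inputs are simply the rounded percentage frequencies listed for BPGM and COL1A2 in Table 1.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over all allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC (Botstein et al., 1980): heterozygosity minus sum over allele
    pairs i < j of 2 * p_i^2 * p_j^2."""
    correction = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return heterozygosity(freqs) - correction


if __name__ == "__main__":
    # Illustrative inputs: rounded allele frequencies taken from Table 1.
    examples = {"BPGM": [0.16, 0.84], "COL1A2": [0.63, 0.37]}
    for locus, freqs in examples.items():
        het_pct = 100 * heterozygosity(freqs)
        print(f"{locus}: {het_pct:.0f} ({pic(freqs):.2f})")
    # BPGM: 27 (0.23)
    # COL1A2: 47 (0.36)
```

Because the table lists frequencies rounded to whole percentages, values recomputed this way may differ by a point from the published HET column for some systems.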
The map order obtained from the final build was subjected to a CRI-MAP "flips 3" analysis, in which adjacent sets of three loci are permuted and the likelihood of the resulting map order is calculated and then compared to the likelihood of the original map order. The order with the best support was chosen to undergo another round of flips analysis, and the process was repeated until no better order was found. The final graphical representations were prepared with the aid of a computer program, Vertical Mapper (Six Ponds Software), written for the Macintosh. The χ² statistic was used to test the significance of differences in the male and female maps.
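The iterative flips analysis described in this section can be pictured as a simple local search over locus orders. The Python sketch below is only an illustration of that idea and is not CRI-MAP's implementation: the `log_likelihood` argument is a hypothetical stand-in for the multipoint likelihood that CRI-MAP computes from genotype data, and the toy scoring function and locus names A through E are invented purely so that the example runs.

```python
from itertools import permutations


def flips_round(order, log_likelihood, window=3):
    """Permute every run of `window` adjacent loci and return the
    best-scoring order found, together with its score."""
    best_order = list(order)
    best_ll = log_likelihood(best_order)
    for start in range(len(order) - window + 1):
        head = list(order[:start])
        block = order[start:start + window]
        tail = list(order[start + window:])
        for perm in permutations(block):
            candidate = head + list(perm) + tail
            ll = log_likelihood(candidate)
            if ll > best_ll:
                best_order, best_ll = candidate, ll
    return best_order, best_ll


def iterate_flips(order, log_likelihood, window=3):
    """Repeat flips rounds until no permutation improves on the current order."""
    current = list(order)
    current_ll = log_likelihood(current)
    while True:
        candidate, candidate_ll = flips_round(current, log_likelihood, window)
        if candidate_ll <= current_ll:
            return current, current_ll
        current, current_ll = candidate, candidate_ll


# Toy stand-in for a likelihood (an assumption, not a genetic model):
# orders that keep loci monotonic along invented positions score highest.
TOY_POSITIONS = {"A": 0, "B": 5, "C": 9, "D": 14, "E": 20}


def toy_score(order):
    return -sum(abs(TOY_POSITIONS[a] - TOY_POSITIONS[b])
                for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    print(iterate_flips(["A", "C", "B", "D", "E"], toy_score))
```

In a real analysis the score for each candidate order would come from the linkage program itself, and the difference in log10 likelihood between the best and the alternative orders is what yields statements such as odds for order of at least 1000:1.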

RESULTS

Genetic Map Construction

We began the 7q map construction with the genotypic data for 103 allele systems (76 loci), 66 of which would be incorporated into the final 7q multipoint linkage map. Markers used in our previous map (Barker et al., 1987) that were genotyped on the additional informative families of the 40 CEPH reference pedigrees are listed under Materials and Methods; these data, together with genotypes for 12 new systems, expand the previous dataset by over 4000 genotypes. An additional 29 allele systems represented in the published CEPH version 5 database were also included in map construction (see Materials and Methods). In addition to the allele systems represented in Table 1, genotypes for several loci mapping to 7p were included to increase the continuity of the entire chromosome 7 map in the pericentromeric region and to produce a stabilizing effect on the 7q map formation around the centromere during the map construction process. The 7p loci used include D7S17, D7S435, TCRG, D7S65, GCK, and D7S57 (see Mishra et al., 1992). As in our 7p map, we used the CRI-MAP program package (Green et al., 1989) for all two-point and multipoint linkage calculations. For each CRI-MAP build, either we chose two highly


TABLE 1
Genetic Markers Incorporated into Chromosome 7q Linkage Map

Locus | Reg. loc. | Probe | Enzymeᵃ | Allele size (kb) | Const. | Freq. %ᵇ | HET (PIC) | IMᶜ
BPGM | q22-q34ᵉ | XMEL4 | M | 2 = 7.0; 1 = 6.2 | | 16, 84 | 27 (0.23) | 151
COL1A2 | q21.3-q22.1ᶠ | NJ3 | E | 1 = 13.0; 2 = 9.5, 3.5 | | 63, 37 | 47 (0.36) | 193
CPA1 | q32-qterᵍ | p184a | Bg | 1 = 11.0; 2 = 8.7 | | 31, 69 | 43 (0.34) | 216
D7Z2 | Centromereʰ | pMGB7 | NA | NA | | NA | | 102
D7S8 | q31ⁱ | pJ3.11 | M | 1 = 1.6; 2 = 4.0 | | 64, 36 | 46 (0.35) | 282
 | | | T | 1 = 6.0; 2 = 3.1 | | 97, 03 | 6 (0.06) | 8
D7S13 | q22.3-q31.1ʲ | B79a | H | 1 = 8.1; 2 = 4.3 | | 19, 81 | 31 (0.26) | 109
 | | | M | 1 = 11.6; 2 = 8.4 | | 35, 65 | 46 (0.35) | 198
D7S15 | | CRI-L917 | Hc | 1 = 2.5, 1.9; 2 = 3.8 | 6.7, 5.0, 3.6 | 53, 47 | 50 (0.37) | 190
 | | | H | 1 = 6.3; 2 = 5.3, 1.0 | 12.5, 7.5 | 71, 29 | 42 (0.33) | 209
D7S18 | q31.1-q31.2ʲ | p7C22 | E | 1 = 7.5; 2 = 5.0 | | 83, 17 | 28 (0.24) | 96
D7S22 | q36-qterᵏ | pLg3 | Hf | Multiple alleles; 1.5 to 22 | | 26, 27, 26, 21 | 75 (0.70) | 537
D7S54 | | CRI-L281 | M | Multiple alleles; 1.5 to 4.0 | 4.7, 0.9, 0.7 | NA | 71 (0.66) | 292
 | | | T | 1 = 10.8; 2 = 10.5; 3 = 6.7, 4.7; 4 = 11.0; 5 = 6.3, 4.5 | 3.3, 1.6 | 34, 21, 28, 16, 01 | 74 (0.69) | 337
D7S56 | | CRI-L544 | M | 1 = 21.0, 4.8; 2 = 10.0, 6.6 | 1.9 | 50, 50 | 50 (0.38) | 177
D7S59 | | CRI-L887 | M | 1 = 12.5; 2 = 9.7; 3 = 7.6 | 4.0 | 27, 22, 51 | 62 (0.55) | 305
D7S61 | | CRI-L966 | R | Multiple alleles; 2.0 to 6.7 | 2.0, 1.4, 1.1, 0.9 | NA | 80 (0.78) | 331
D7S63 | | CRI-L1033 | M | 1 = 13.0; 2 = 10.8 | 6.8 | 46, 54 | 50 (0.37) | 165
D7S64 | | CRI-L1238 | E | 1 = 13.5; 2 = 10.0, 3.4 | 5.5, 2.8, 1.5 | 66, 34 | 45 (0.35) | 222
D7S67 | | CRI-R40-2ᵈ | M | 1 = 6.3; 2 = 5.0 | 3.1, 2.8 | 51, 49 | 50 (0.37) | 143
D7S68 | | CRI-R53 | Bg | 1 = 4.7; 2 = 4.5; 3 = 4.4 | 9.0, 3.0 | 17, 12, 71 | 45 (0.41) | 122
 | | | Bm | 1 = 7.3; 2 = 6.9 | | 16, 84 | 26 (0.23) | 150
D7S70 | | CRI-R967 | M | 1 = 3.2; 2 = 3.0; 3 = 2.8; 4 = 2.7; 5 = 3.0, 2.0; 6 = 2.5 | | 21, 26, 51, 01, 01, 01 | 63 (0.57) | 282
D7S71 | | CRI-S2 | M | 1 = 4.2, 2.3; 2 = 2.9, 1.9 | | 68, 32 | 44 (0.34) | 183
D7S72 | | CRI-S3 | H | 1 = 9.4; 2 = 5.3, 4.3 | | 64, 36 | 46 (0.36) |
D7S73 | | CRI-S14 | M | 1 = 17.0; 2 = 13.3; 3 = 12.3 | | 58, 01, 41 | 50 (0.39) |
D7S76 | | CRI-S19 | P | 1 = 4.7; 2 = 4.2 | 6.2, 5.5, 4.7, 4.3, 1.8, 1.7 | 71, 29 | 41 (0.33) |
