Sequence differences between HLA-B and TNF distinguish different MHC ancestral haplotypes

L. J. Abraham, C. Leelayuwat, G. Grimsley, M. A. Degli-Esposti, A. Mann, W. J. Zhang, F. T. Christiansen, R. L. Dawkins. Sequence differences between HLA-B and TNF distinguish different MHC ancestral haplotypes. Tissue Antigens 1992: 39: 117-121.

Abstract: The HLA-B locus is extremely polymorphic. We have sequenced a region, CL, centromeric of HLA-B that also shows a high degree of allelic variation, as we have shown previously by RFLP analysis. The polymorphism can be accounted for by sequence variation in duplicated, reiterated sequence elements called geometric elements. Comparison of the CL1 and CL2 sequences from the 57.1, 8.1, 18.2 and 7.1 ancestral haplotypes revealed that the lengths of the elements vary, both between the duplicated loci within a haplotype and between haplotypes, apparently because certain sequences are inserted or deleted. It is possible, using the polymerase chain reaction, to amplify these elements in genomic DNA from ancestral haplotypes for which sequence data of the CL region are not available and to obtain gel patterns which are characteristic of different ancestral haplotypes. The most striking feature of the data is that the majority of the CL patterns are haplospecific, i.e. each is unique to a particular ancestral haplotype and can be used to type that haplotype. At least 12 different allelic patterns have been identified within a panel of 29 cell lines representing 16 ancestral haplotypes. For these 16 ancestral haplotypes, all examples of each haplotype have the same CL pattern. The haplotypic nature of the patterns confirms that ancestral haplotypes are conserved chromosomal segments and that coding and non-coding sequences are identical by descent from a remote ancestor.
L. J. Abraham, C. Leelayuwat, G. Grimsley, M. A. Degli-Esposti, A. Mann, W. J. Zhang, F. T. Christiansen and R. L. Dawkins
Publication number 91 29 of the Departments of Clinical Immunology, Royal Perth Hospital, Sir Charles Gairdner Hospital and the University of Western Australia, Perth, Western Australia
Key words: MHC - HLA - typing - haplotype
Received 19 September, revised, accepted for publication 4 November 1991

Introduction
One of the most striking features of the Major Histocompatibility Complex (MHC) is the presence of multiple highly polymorphic and duplicated genes, including HLA Class I and Class II but also the C4, C2 and CYP21 genes. However, the degree of polymorphism varies from site to site throughout the region. For example, the most centromeric of the Class I genes (HLA-B) is extremely polymorphic whereas TNF and the adjacent B144 gene appear to be much less so (1, 2, 3). There are at least three distinct blocks of polymorphism within the MHC: (...HLA-C, HLA-B..., ...C4, CYP21... and ...HLA-DRB, DQ...) (4). We chose to examine the interval between HLA-B and TNF/B144 to identify other polymorphisms and to define the centromeric boundary of the polymorphic block which includes HLA-C and -B. In addition, we asked whether the degree of polymorphism found within the expressed protein products of HLA-B and HLA-C correlated with surrounding polymorphism at the genomic level.
All polymorphisms within the megabase between HLA-C and HLA-DQ have been shown to be carried by one or more of some 30 to 50 ancestral (conserved population or extended) haplotypes with a specific genomic structure (5-9). Thus, a basically Caucasoid population can be described in terms of these founder haplotypes plus their recombinants. To test this concept further, we predicted that any new polymorphism would be found on all examples of each ancestral haplotype.
To examine these issues, we established lambda EMBL3 genomic libraries of four cell lines which are known to be homozygous for the well-defined MHC ancestral haplotypes 57.1, 8.1, 7.1 and 18.2 (2). We have already shown that these ancestral haplotypes have numerous haplospecific sequences
[Figure 1: aligned nucleotide sequences of the geometric elements and flanking regions, annotated with the number of variable nucleotides in each element. The alignment is not recoverable from the scan; the surviving fragments show runs of (TC)n and (TG)n dinucleotide repeats. Legend: ( )n = number of dinucleotide repeats; - = consensus sequence; N = A, G, C or T.]
Figure 1. Geometric elements at CL1 and CL2 are polymorphic and ancestral haplotype-specific. The nucleotide sequences of the geometric elements and flanking regions at CL1 and CL2 for the 57.1, 8.1 and 7.1 ancestral haplotypes are shown. The CL1 locus from 18.2 is also presented but a CL2 locus from 18.2 has not been identified. The number of nucleotides comprising each geometric element is listed.
between HLA-B and -DQ and we postulated that these four examples would provide an adequate estimate of the degree of polymorphism seen in the Caucasoid population (5-9).

Material and methods
Nucleotide sequence analysis
DNA subclones that contained geometric element sequences, derived from lambda EMBL3 or lambda GEM11 clones of 8.1, 57.1 (2), 18.2 and 7.1 (10), were sequenced using the Taq DyeDeoxy Terminator Cycle Sequencing protocol (Applied Biosystems Inc., Foster City, Ca). Fluorescent dye-labelled extension products were analyzed on a Model 373 DNA Sequencer (Applied Biosystems Inc.). Sequence data editing and alignments were done using the SeqEd program (ABI) on a Macintosh CI (Apple Computer Inc., Cupertino, Ca).
Table 1. The length of the geometric elements explains the size variations observed from RFLP analysis of the CL region (all values in nucleotides)

AH - locus    Length   GE size variation    Observed difference   Expected RFLP    Observed RFLP
              of GE    vs 18.2 - CL1 GE     in RFLP size          fragment size    fragment size
57.1 - CL1    94       66                   80                    1866             1880
18.2 - CL1    28       0                    0                     1800             1800
8.1 - CL1     56       28                   30                    1828             1830
7.1 - CL1     94       66                   80                    1866             1880
57.1 - CL2    58       30                   50                    1830             1850
8.1 - CL2     96       68                   80                    1868             1880
7.1 - CL2     94       66                   0                     1866             1800

GE = geometric element (Taq I RFLP fragment sizes of 57.1, 18.2, 8.1 and 7.1 AHs are from ref. 13).
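The arithmetic behind Table 1 can be sketched in a few lines of Python. This is an illustrative check rather than part of the original analysis: it assumes that the expected Taq I fragment is the 18.2 CL1 fragment (1800 nucleotides, 28-nucleotide element) plus the difference in element length, and the 25-nucleotide tolerance is a hypothetical threshold chosen only to separate the small discrepancies from the one large one. The value 58 for 57.1 CL2 is read from a damaged cell in the scan.

```python
# GE lengths and observed Taq I fragment sizes (nucleotides) as read from Table 1.
# Keys are (ancestral haplotype, locus). 58 for 57.1 CL2 comes from a damaged cell.
GE_LENGTH = {
    ("57.1", "CL1"): 94, ("18.2", "CL1"): 28, ("8.1", "CL1"): 56, ("7.1", "CL1"): 94,
    ("57.1", "CL2"): 58, ("8.1", "CL2"): 96, ("7.1", "CL2"): 94,
}
OBSERVED_FRAGMENT = {
    ("57.1", "CL1"): 1880, ("18.2", "CL1"): 1800, ("8.1", "CL1"): 1830, ("7.1", "CL1"): 1880,
    ("57.1", "CL2"): 1850, ("8.1", "CL2"): 1880, ("7.1", "CL2"): 1800,
}

REFERENCE_GE = 28          # 18.2 CL1 element length
REFERENCE_FRAGMENT = 1800  # 18.2 CL1 Taq I fragment size


def expected_fragment(ge_length: int) -> int:
    """Expected fragment size if only the element length differs from 18.2 CL1."""
    return REFERENCE_FRAGMENT + (ge_length - REFERENCE_GE)


def discrepancies(tolerance: int = 25) -> dict:
    """Observed minus expected size for loci outside a hypothetical tolerance."""
    result = {}
    for key, ge_length in GE_LENGTH.items():
        delta = OBSERVED_FRAGMENT[key] - expected_fragment(ge_length)
        if abs(delta) > tolerance:
            result[key] = delta
    return result


if __name__ == "__main__":
    for key, ge_length in GE_LENGTH.items():
        print(key, expected_fragment(ge_length), OBSERVED_FRAGMENT[key])
    print("outside tolerance:", discrepancies())
```

Under these assumptions only 7.1 CL2 falls outside the tolerance, with an observed fragment 66 nucleotides shorter than expected, which is the exception discussed in the Results.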
Oligonucleotide primers
The PCR oligonucleotide primers CTREP3 and CTREP4, which delimit a geometric sequence located in the CL region, and the sequencing primers were synthesized using phosphoramidite chemistry on a Model 391 DNA Synthesizer (Applied Biosystems Inc.), deprotected, and purified on OPC columns (ABI).

Cell lines
A panel of 29 EBV-transformed human B-cell lines was used in this study. These cells were obtained from EBV transformation of peripheral blood lymphocytes (PBLs) or from the 10th International Histocompatibility Workshop reference cell panel. The local cell lines were genotyped for HLA class I, II and complement loci by serology, complement allotyping or RFLP analysis. Cell lines that were homozygous for a particular ancestral haplotype were selected for inclusion on the panel. DNA was isolated from cell lines using a Proteinase K/phenol extraction procedure as described (8).

PCR and analysis of products
Polymerase chain reactions (PCR) were performed in a volume of 100 µl containing 500 ng of genomic DNA derived from individual EBV-transformed B-cell lines, 200 µM each of dATP, dCTP, dGTP and dTTP, 2.0 mM Tris-HCl (pH 8.3), 2.0 mM magnesium chloride, 50 mM KCl, 50 pmol each of CTREP3 and CTREP4 primers and 2 units of Taq DNA polymerase (Amplitaq, Cetus Corp.). Samples were overlaid with light mineral oil (Sigma) and subjected to thermocycling (30 cycles of 95°C for 30 s, 55°C for 30 s, 72°C for 60 s) followed by a final extension at 72°C for 10 min. Products were analyzed by electrophoresis in 3% Nusieve/1% Seakem agarose (FMC Corp., Rockland, Ma), 1x TBE.
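As a convenience for anyone reproducing the reaction, the stated conditions can be written down as data and the derived quantities checked. The sketch below is illustrative only: the names are hypothetical, the dNTP concentration is read as 200 µM (the scan is ambiguous on the unit), and the programmed time ignores ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


# Values taken from the Methods text; the dNTP unit (µM) is an assumed reading.
VOLUME_UL = 100.0
PRIMER_PMOL = 50.0      # each of CTREP3 and CTREP4
DNTP_UM = 200.0         # each dNTP
CYCLES = 30
CYCLE = (Step(95, 30), Step(55, 30), Step(72, 60))
FINAL_EXTENSION = Step(72, 600)


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Final primer concentration; pmol per µl is numerically equal to µM."""
    return pmol / volume_ul


def dntp_amount_nmol(conc_um: float, volume_ul: float) -> float:
    """Amount of each dNTP in the tube; µM x µl gives pmol, divided by 1000 for nmol."""
    return conc_um * volume_ul / 1000.0


def programmed_minutes() -> float:
    """Total programmed hold time, excluding temperature ramps."""
    total_s = CYCLES * sum(step.seconds for step in CYCLE) + FINAL_EXTENSION.seconds
    return total_s / 60.0


if __name__ == "__main__":
    print("primer (µM):", primer_concentration_um(PRIMER_PMOL, VOLUME_UL))
    print("each dNTP (nmol):", dntp_amount_nmol(DNTP_UM, VOLUME_UL))
    print("programmed time (min):", programmed_minutes())
```

On this reading each primer is present at 0.5 µM, each dNTP at 20 nmol per reaction, and the programme holds for 70 min in total.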

Results
Using genomic walking probes (11, 12) in the first instance, we walked in the region between B144 and HLA-B until we found a sub-region where there were major differences between the four cell lines (Leelayuwat et al., unpublished). Also, after probing DNA from a panel of homozygous cell lines, we were able to demonstrate extensive polymorphism at the RFLP level (13). Furthermore, there was evidence that parts of the sub-region are duplicated with an interval of some 20-35 kilobasepairs (kb) between the homologous sequences designated CL1 (telomeric) and CL2 (centromeric).
Sequence analysis of selected lambda clones showed that much of the polymorphism detected at the RFLP level is accounted for by insertions/deletions of sequences which we refer to provisionally as haplospecific geometric elements (Fig. 1). Although there are features which suggest similarities to "satellites" (14), we are most impressed by the geometry (relations of magnitudes) when different haplotypes are compared, i.e. the lengths of the elements vary apparently because certain stretches of sequence are inserted or deleted (Fig. 1). The sequence contains (but is not necessarily restricted to) reiterated dinucleotides and possibly other nonrandom patterns. There is also some apparent symmetry around the region which is deleted or inserted (Fig. 1).
Analysis of the sequence data indicates that the polymorphisms observed after Taq I digestion and Southern analysis of different ancestral haplotypes are due to differences in the length of the geometric elements (Table 1). Thus, the size differences between the Taq I fragments of CL1 from 57.1, 7.1, 8.1 and 18.2 can be predicted accurately from the length of the geometric elements. Similarly, Taq I fragment sizes for CL2 of 57.1 and 8.1 can be predicted from the geometric element sequence length (CL2 from 18.2 has not been analyzed). The exception, 7.1 - CL2, is interesting because, clearly, additional polymorphism is present. We would predict either that one or more deletions totalling approximately 60 basepairs are present elsewhere within the CL2 Taq I fragment or, alternatively, that a unique Taq I site is present in 7.1 - CL2.
An unexpected feature of the polymorphism is that the composition and arrangement of these
Figure 2. DNA was extracted from 12 homozygous cell lines representing 3 examples each of ancestral haplotypes (AH): 46.1 (lanes 2-4), 18.2 (lanes 5-7), 8.1 (lanes 8-10) and 7.2 (lanes 11-13). Amplification using primers flanking the geometric elements demonstrated reproducible haplotypic patterns for all four ancestral haplotypes after electrophoresis in a 3% Nusieve, 1% Seakem agarose gel. The molecular weight markers (M) in lanes 1 and 14 were pGEM3 (Promega) digested with Hae III.
Table 2. Banding patterns identified after PCR amplification with primers flanking the CL loci
[Table body not recoverable from the scan. Columns: Cell ID; AH assigned; band density score at each fragment size in base pairs.]
For each cell line the Table gives the local cell identification, the ancestral haplotypes present, and the optical density of the bands seen after PCR amplification. The density of each band has been scored semi-quantitatively so that 1 = probably negative but cannot exclude positive, 2 = probably positive, cannot exclude negative, 3 = +, 4 = ++, etc. to 9 = scores of 4 or greater. The ancestral haplotypes shown here are carried by unrelated haplotypes (i.e. not nuclear haplotypes).
Ancestral haplotypes = conserved MHC population haplotypes derived from a common, remote ancestor.
geometric elements is typical of each ancestral haplotype which has been tested to date. It is possible, using the polymerase chain reaction, to amplify the CL1 and CL2 elements in genomic DNA from cell lines that are homozygous for ancestral haplotypes for which sequence data of the CL region are not available and to obtain gel patterns which are characteristic of different ancestral haplotypes (Table 2 and Fig. 2). In the 8.1 ancestral haplotype, one would predict at least two bands of 173 bp and 213 bp that represent the addition of the primer lengths plus the distances between the primer sites in CL1 and CL2, respectively. The complex patterns seen in all cases may reflect the presence of additional hybridizing regions or interactions between specific PCR products. In any case the patterns are typical for a particular ancestral haplotype and so must derive from MHC loci.
At least 12 different allelic patterns have been identified within a panel of 29 cell lines representing 16 ancestral haplotypes. For these 16 ancestral haplotypes, all examples of each haplotype have the same CL pattern. The haplotypic nature of the patterns confirms that ancestral haplotypes are conserved chromosomal segments and that coding and non-coding sequences are identical, probably by descent from a remote ancestor.
The most striking feature of the data is that the majority of the CL patterns are haplospecific, i.e. each is unique to a particular ancestral haplotype (Table 2). Indeed, the allelic patterns carried by 8.1, 18.2, 35.1, 42.1, 44.3, 47.1, 52.1, 57.1 and 65.1 are all haplospecific, and can be used to type these ancestral haplotypes. By contrast, 7.1 and 7.2 are indistinguishable, but our previous data have established that the parts of 7.1 and 7.2 from HLA-C to at least Bf are identical between the two haplotypes (13). The fact that both 7.1 and 7.2 show the same allelic pattern confirms this finding and suggests that these two ancestral haplotypes share the block from HLA-C through HLA-B and CL because of a common origin. The same appears to be true for 46.1 and 46.2. The differences between 7.1 and 7.2 versus 46.1 and 46.2 are quantitative rather than qualitative (see Table 2 and Fig. 2) and more sensitive detection methods may be required.
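The band-size prediction for 8.1 can be extended, as a rough sketch, to the other sequenced haplotypes. The code below is not the authors' procedure. It assumes that primer lengths plus flanking distance contribute the same fixed number of basepairs at every locus and in every haplotype, an offset inferred here from the 8.1 prediction (173 bp for a 56-nucleotide CL1 element, 213 bp for a 96-nucleotide CL2 element). It predicts only the primary products, whereas the observed patterns are more complex. Element lengths are those of Table 1; the value 58 for 57.1 CL2 is read from a damaged cell in the scan.

```python
from collections import defaultdict

# Geometric element lengths (nucleotides) from Table 1; 18.2 CL2 is not available.
GE_LENGTH = {
    ("57.1", "CL1"): 94, ("18.2", "CL1"): 28, ("8.1", "CL1"): 56, ("7.1", "CL1"): 94,
    ("57.1", "CL2"): 58, ("8.1", "CL2"): 96, ("7.1", "CL2"): 94,
}

# Assumed constant contribution of primers and flanks, inferred from 8.1 CL1.
FLANK_BP = 173 - GE_LENGTH[("8.1", "CL1")]


def predicted_pattern(haplotype: str) -> tuple:
    """Sorted, de-duplicated predicted primary product sizes for one haplotype."""
    sizes = {
        length + FLANK_BP
        for (ah, _locus), length in GE_LENGTH.items()
        if ah == haplotype
    }
    return tuple(sorted(sizes))


def haplospecific(haplotypes: list) -> list:
    """Haplotypes whose predicted pattern is shared with no other haplotype."""
    by_pattern = defaultdict(list)
    for ah in haplotypes:
        by_pattern[predicted_pattern(ah)].append(ah)
    return sorted(group[0] for group in by_pattern.values() if len(group) == 1)


if __name__ == "__main__":
    for ah in ("57.1", "8.1", "7.1", "18.2"):
        print(ah, predicted_pattern(ah))
    print("haplospecific:", haplospecific(["57.1", "8.1", "7.1", "18.2"]))
```

Under these assumptions the same offset reproduces both 8.1 bands, and the four sequenced haplotypes each give a distinct predicted pattern, although the 57.1 and 8.1 predictions differ by only 2 bp per band and would be difficult to separate on agarose from the primary products alone.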
Discussion
The CL region centromeric of HLA-B is highly polymorphic. Most of the polymorphism is accounted for by variations of reiterating complex sequences referred to as geometric elements. The function of these polymorphic geometric elements is unknown, but many possibilities could be considered, including regulation. The sequence composition and arrangement of the geometric elements are typical for ancestral haplotypes and therefore haplotypic CL patterns were obtained following PCR amplification.
Irrespective of the fundamental significance of these sequences, they clearly provide very valuable markers for distinguishing between ancestral haplotypes (haplotyping) in the region between HLA-C + HLA-B and TNF + B144 without the need for sequencing. As such, they can be used to define the sites of recombination between different ancestral haplotypes. Preliminary data (not shown) suggest that recombination between CL and HLA-B is distinctly unusual whereas recombination centromeric of CL (at least as far as C2/BF) is relatively frequent. A similar phenomenon is observed in the mouse MHC (15). These observations suggest that there may be less recombination occurring within polymorphic blocks than between these blocks. In other systems there are suggestions that nucleotide substitutions, and by analogy polymorphism, may decrease recombination frequency (16, 17).
The precise boundary of the polymorphic "frozen block" including HLA-B + C and the CL sub-region remains to be determined but we do know that there are additional extensive polymorphisms centromeric of CL2. Further analysis of these regions will allow definition of the boundaries of the conserved blocks.

References
1. Yang SY. Assignment of HLA-A and HLA-B antigens for the reference panel of B-lymphoblastoid cell lines determined by one-dimensional isoelectric focusing (1D-IEF) gel electrophoresis. In: Dupont B, ed. Immunobiology of HLA, Volume 1: Histocompatibility Testing 1987. New York: Springer-Verlag, 1989: 43.
2. Abraham LJ, Chin D, Zahedi K, Dawkins RL, Whitehead AS. Haplotypic polymorphisms of the TNFB gene. Immunogenetics 1991: 33: 50-53.
3. Abraham LJ, Grimsley G, Zhang WJ, Degli-Esposti MA, Dawkins RL. Polymorphism in the human B144 gene in different MHC haplotypes. Eur J Immunogenet (in press).
4. Klein J, Takahata N. The major histocompatibility complex and the quest for origins. Immunol Rev 1990: 113: 5-25.
5. Dawkins RL, Zhang WJ, Degli-Esposti MA, Abraham L, McCann V, Christiansen FT. Studies of MHC haplotypes by pulsed field gel electrophoresis. In: Clinical Endocrinology and Metabolism. London: Bailliere Tindall Limited, 1991: 285-97.
6. Tokunaga K, Saueracker G, Kay PH, Christiansen FT, Anand R, Dawkins RL. Extensive deletions and insertions in different MHC supratypes detected by pulsed field gel electrophoresis. J Exp Med 1988: 168: 933-40.
7. Tokunaga K, Kay PH, Christiansen FT, Saueracker G, Dawkins RL. Comparative mapping of the human major histocompatibility complex in different racial groups by pulsed field gel electrophoresis. Human Immunol 1989: 26: 99-106.
8. Dawkins RL, Leaver A, Cameron PU, Martin E, Kay PH, Christiansen FT. Some disease-associated ancestral haplotypes carry a polymorphism of TNF. Human Immunol 1989: 26: 91-97.
9. Zhang WJ, Degli-Esposti MA, Cobain TJ, Cameron PU, Christiansen FT, Dawkins RL. Differences in gene copy number carried by different MHC ancestral haplotypes: quantitation after physical separation of haplotypes by Pulsed Field Gel Electrophoresis. J Exp Med 1990: 171: 2101-14.
10. Leelayuwat C, Abraham LJ, Tabarias H, Christiansen FT, Dawkins RL. Genomic organization of a polymorphic duplicated region centromeric of HLA B. Immunogenetics (in press).
11. Chimini C, Boretto J, Marguet D, Lanau F, Lauguin G, Pontarotti P. Molecular analysis of the human MHC class I region using yeast artificial chromosome clones. Immunogenetics 1990: 32: 419-26.
12. Spies T, Blanck G, Bresnahan M, Sands J, Strominger JL. A new cluster of genes within the human major histocompatibility complex. Science 1989: 243: 214-7.
13. Wu X, Zhang WJ, Witt CS, Abraham LJ, Christiansen FT, Dawkins RL. Haplospecific polymorphism between HLA B and TNF. Hum Immunol (in press).
14. Singer MF. SINEs and LINEs: Highly repeated short and long interspersed sequences in mammalian genomes. Cell 1982: 28: 433-4.
15. Snoek M, Groot PC, Spies T, Campbell RD, Demant P. Fine mapping of the crossover sites in the C4-H-2D region of H-2 recombinant mouse strains. Immunogenetics 1991: 34: 409-12.
16. Hughes AL. Testing for interlocus genetic exchange in the MHC: a reply to Anderson and co-workers. Immunogenetics 1991: 33: 243-6.
17. Koop BF, Siemieniak D, Slightom JL, et al. Tarsius delta- and beta-globin genes: Conversions, evolution, and systematic implications. J Biol Chem 1989: 264(1): 68-79.

Address: R. L. Dawkins, Department of Clinical Immunology, Royal Perth Hospital, GPO Box X2213, Perth WA 6001, Australia