American Journal of Medical Genetics 35:600-608 (1990)

COMPARISON OF DIRECT AND INDIRECT METHODS OF CARRIER DETECTION IN AN X-LINKED DISEASE

Dwight D. Koeberl, Cynthia D.K. Bottema, and Steve S. Sommer

Department of Biochemistry and Molecular Biology, Mayo Clinic/Foundation, Rochester MN

For a severe X-linked disease such as hemophilia B, unrelated patients generally will have different mutations. Restriction fragment length polymorphisms (RFLPs) have been used to perform carrier testing in an indirect manner by following the segregation of the disease allele through the pedigree. However, this approach requires family participation and suffers from multiple levels of uncertainty. We have applied the direct sequencing technique of genomic amplification with transcript sequencing (GAWTS) to direct carrier testing. Here we compare direct and RFLP-based testing in two families with hemophilia B. A total of 22 at-risk females were diagnosed by direct testing, whereas only 11 females could be diagnosed by standard RFLP analysis. The superior accuracy and greater conceptual simplicity of direct testing are demonstrated and some commonly overlooked uncertainties in RFLP analysis are highlighted.

KEY WORDS: DNA diagnosis/carrier testing/hemophilia/direct sequencing/RFLP analysis/factor IX, mutations

INTRODUCTION

Approximately one male in 30,000 suffers from hemophilia B, a debilitating, X-linked coagulopathy caused by a lack of factor IX activity [Giannelli et al., 1983]. Among the current approaches to carrier detection in females at risk, restriction fragment length polymorphism (RFLP) linkage analysis is most widely used. This technique uses restriction endonucleases to cleave DNA into fragments which can be detected by Southern blotting. At a polymorphic site, cleavage will occur in some chromosomes but not in others. These polymorphisms can be detected by differences in the size of the DNA fragments after restriction endonuclease digestion. The polymorphisms serve to mark adjacent sequences, thereby allowing the segregation of the defective gene to be followed through the family. Diagnoses based on RFLP analysis are indirect because the mutation is not defined; its presence is only inferred based on linkage to an RFLP. RFLP analysis, when informative, represents an improvement over protein-based testing because it is generally more accurate. In particular, RFLP-based testing: (1) circumvents the overlap in protein levels that exists between noncarriers and carriers, (2) is unaffected by age, environmental agents, or disease state, and (3) is unaffected by nonrandom X inactivation when that occurs. In addition, RFLP testing can provide prenatal diagnosis by using accessible tissue such as chorionic villus cells or amniocytes, while protein-based testing requires exposing the fetus to the hazards of blood sampling.

Address reprint requests to Dr. Steve S. Sommer, Department of Biochemistry and Molecular Biology, Mayo Clinic/Foundation, Rochester, MN 55905. Received for publication December 28, 1989; revision received February 5, 1990.

© 1990 Wiley-Liss, Inc.

Direct Carrier Testing for Hemophilia

While problems associated with protein-based testing can be circumvented, RFLP testing has its own limitations under certain circumstances. First, the diagnosis of carrier status depends on finding an informative polymorphism. For factor IX, in approximately 20% of Caucasians and up to 80% of individuals in other racial groups, no informative RFLP is found [Winship and Brownlee, 1986; Lubahn et al., 1987]. Second, the diagnosis in a female at risk depends on collecting blood from relatives who may be uncooperative, deceased, or otherwise unavailable. Third, factors relating to recombination, genetic heterogeneity, and nonpaternity may confound that diagnosis. Finally, when the propositus represents a sporadic case, germline mosaicism and uncertainty about the origin of the mutation can introduce additional uncertainty in the diagnosis.

Ideally, diagnostic testing should be performed by identifying the mutation and directly determining whether the at-risk female carries the defective allele. For some autosomal recessive diseases, relatively few mutations account for the vast majority of defective alleles in the population, and direct testing by conventional methods is possible. However, in severe X-linked diseases such as hemophilia, each family can have a different mutation [Haldane, 1935], so direct carrier testing has not been feasible in most families. We have developed genomic amplification with transcript sequencing (GAWTS) [Stoflet et al., 1988], a method of direct sequencing which can be used to perform direct carrier testing for hemophilia B on a time scale that allows the test to be used in a clinical setting [Bottema et al., 1989]. To highlight the greater accuracy and the conceptual simplicity of direct diagnosis, we compare it with indirect RFLP testing in two families. These two families illustrate many of the potential pitfalls and problems that can arise in RFLP testing.
METHODS

The propositus (HB24) of family one has a factor IX coagulant level of 1% with absence of cross-reacting material (CRM-) while the propositus (HB26) of family two has a factor IX coagulant level of 3% and a normal amount of cross-reacting material (CRM+). Twenty-four ml of blood was drawn in ACD solution B and DNA was extracted as previously described [Gustafson et al., 1987]. GAWTS was performed with Taq DNA polymerase, T7 RNA polymerase, and AMV reverse transcriptase using steps two through four of our previously described protocol [Koeberl et al., 1989]. Briefly, GAWTS is a three-step procedure that involves: (1) polymerase chain reaction (PCR) with oligonucleotides containing a phage promoter and a sequence complementary to the region to be amplified, (2) transcription of the amplified product with a phage RNA polymerase, and (3) sequencing of the transcript using reverse transcriptase that is primed with a nested (internal) oligonucleotide [Stoflet et al., 1988].

At least 2460 bp of sequence was obtained on each hemophiliac. The chosen exonic sequences included the entire protein coding sequence (1383 bp), the 5' untranslated sequence, and portions of the 3' untranslated sequence (497 bp). The nonexonic sequences included the putative promoter, the 14 splice junctions, and the segment immediately 3' to the gene (580 bp). Region A contains the putative promoter, exon a, and the adjacent splice junction. Region B/C contains exon b, intron b, exon c, and the flanking splice junctions. Regions D through G contain the appropriate exon and flanking splice junctions. Region H-5' contains a splice junction, the amino acid coding sequence of exon h and the proximal 3' untranslated segment of the mRNA. Region H-3' contains the distal 3' untranslated region in exon h (including the poly A addition sequence) as well as the sequence immediately following the gene.

The following bases were sequenced in each hemophiliac to search for the causative mutation(s) (numbering system of Yoshitake et al. [1985]): Region A: -106 to 139; Region B/C: 6720 to 6265; Region D: 10544 to 10315; Region E: 17847 to 17601; Region F: 20577 to 20334; Region G: 30183 to 29978; Region H-5': 31411 to 30764; Region H-3': 32808 to 32583. The order of the numbers in each region indicates the direction of sequencing. The mutation in each propositus was delineated and the appropriate region was sequenced in at-risk females. Each diagnosis was confirmed by reamplification and resequencing.
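The region coordinates listed in the Methods can be held in a small lookup table, which makes it easy to check which sequenced region covers a given base. The following is a minimal sketch, not part of the original protocol: the function names `region_containing` and `sequencing_direction` are hypothetical, and the two positions used in the example (30119 and 31052) are the mutation sites reported in the Results for HB24 and HB26.

```python
# Sequenced regions as listed in the Methods (numbering of Yoshitake et al.).
# Each pair is written in the direction of sequencing, so the first number
# may be larger than the second.
REGIONS = {
    "A": (-106, 139),
    "B/C": (6720, 6265),
    "D": (10544, 10315),
    "E": (17847, 17601),
    "F": (20577, 20334),
    "G": (30183, 29978),
    "H-5'": (31411, 30764),
    "H-3'": (32808, 32583),
}


def region_containing(position):
    """Return the name of the sequenced region covering `position`, or None."""
    for name, (first, last) in REGIONS.items():
        low, high = min(first, last), max(first, last)
        if low <= position <= high:
            return name
    return None


def sequencing_direction(name):
    """Report whether a region was read toward higher or lower base numbers."""
    first, last = REGIONS[name]
    return "ascending" if first < last else "descending"


if __name__ == "__main__":
    for position in (30119, 31052):
        print(position, region_containing(position))
```

Run as a script, this reports region G for base 30119 and region H-5' for base 31052, which matches the regions named in the Results.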
For the comparative RFLP analysis, the HinfI (intron a), XmnI (intron c) and TaqI (intron d) intragenic factor IX polymorphisms were determined by PCR as previously described [Koeberl et al., 1990]. For each RFLP, the presence of the restriction site is denoted by "+" while its absence is denoted by "-".

RESULTS

GAWTS is a method of direct sequencing [Stoflet et al., 1988; Koeberl et al., 1989] which can be used to delineate rapidly the causative mutation in a hemophiliac. Subsequently, carrier status can be determined directly in at-risk females by testing for the presence of the mutation by sequencing the appropriate region of factor IX (Fig. 1).

[Figure 1. Panel A: Sequence eight regions in factor IX gene of hemophiliac with GAWTS (lanes A, T, G, C for hemophiliac and normal; sequence change marked). Panel B: Sequence region of mutation in related females (diagnosis: carrier or noncarrier).]

Figure 1. Schematic of direct carrier testing. The sequence pattern of a hemophiliac is followed by the sequence pattern of carrier and noncarrier females in the family. The 8 regions of likely functional significance are sequenced by GAWTS (see Methods) in the factor IX gene of the hemophiliac. Males have only one X chromosome so an autoradiogram of the sequencing gel will show the mutant base as well as the absence of the normal base. In this case, a single-base mutation has changed a C (cytosine) to a T (thymine) as indicated by the starred band at the position of the arrow. Having located the mutation, the affected region of at-risk females in the family can be sequenced. A carrier has one mutant factor IX gene and one functional gene. Since both alleles are present in equal amounts and amplification occurs without bias, the amplified products contain equal amounts of both alleles. Thus, the signal at the site of the mutation will be about equally divided between the mutant and the functional bands (i.e., half the intensity of other bands in that sequence). In contrast, a noncarrier will have two copies of the functional sequence.

The regions of the factor IX gene most likely to contain the causative mutation were sequenced with GAWTS. The 34 kilobase factor IX gene contains 8 exons and 7 introns [Yoshitake et al., 1985; Anson et al., 1984]. The introns, which account for over 90% of the gene sequence, have no known physiological significance for the individual. For this study, eight regions encompassing 2.46 kb were chosen for sequencing (Fig. 2 and Methods).

[Figure 2. Diagram of the gene with boxed regions A, B/C, D, E, F, G, H5' and H3', intron lengths in kb, and the HB24 and HB26 mutation sites marked.]

Figure 2. Schematic of the factor IX gene showing the eight regions which were sequenced (boxes). In each region, the amino acid coding sequences are shaded and bordered by dashed lines. The additional sequences that are delineated by the solid lines include the promoter (circles), the 5' untranslated sequence (triangles), the splice junctions (asterisks), and parts of the 3' untranslated region including the poly A addition signal (crosshatched). The short arrows indicate the start and stop of transcription. The unsequenced intronic segments which account for 92% of the gene are drawn to a different scale. Note that the lengths of these segments are indicated in kilobases while the lengths of the sequenced regions are indicated in bases. The locations of the mutations in the families of HB24 and HB26 are indicated by the long, thin arrows.

To illustrate the advantages of direct testing, both standard RFLP-based testing and direct testing were performed in two families. For RFLP testing, the HinfI (intron a), XmnI (intron c), and TaqI (intron d) polymorphisms were determined. These polymorphisms provide most of the information that can be obtained from RFLP analysis in hemophilia B [Winship et al., 1984].

Familial Hemophilia -- Indirect Diagnosis

The indirect and direct diagnostic data for family one are shown in Figure 3. In summary, the sisters and nieces of the hemophiliac could be diagnosed by RFLP analysis (Fig. 3A). The error rate of diagnosis is about 3%. However, the first cousin (III-3) and two of her daughters could not be diagnosed since certain crucial relatives were unavailable for analysis.

The logic of indirect diagnosis has been described in detail [Sommer and Sobell, 1987], but the following outlines the logic in family one. The TaqI polymorphism was informative and it provided all the information that could be obtained from the RFLP analysis (Figure 3A). The propositus (indicated by the arrow) received the + allele from his mother. While the + allele has no intrinsic functional significance, it is closely linked to the mutation and, as such, it cosegregates with the mutation. Thus, it is a marker in this family which allows the segregation of the mutant gene to be followed through the pedigree. The mother has a + and a - allele; the + allele marks the defective factor IX gene and the - allele marks a functional gene since the mother does not have the disease. Assuming that all the sisters have the same biological father, that father must have had a - allele. Therefore, +/- in a sister indicates a carrier, but nonpaternity could result in a diagnostic error. -/- indicates a noncarrier and this diagnosis is independent of the specific paternity since the + allele is absent. For the cousin (III-3), it is not possible to determine carrier status because of the death of her father. Consequently, it is not possible to determine the carrier status of her first two daughters (IV-1, IV-2). However, the third daughter (IV-3) is a noncarrier because she does not have a + allele.
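The indirect reasoning used for the sisters in family one can be written as a small decision rule. This is a minimal sketch under the assumptions stated in the text (the + allele marks the mutant gene in this family only, and the father carried a - allele); the function name `infer_carrier_by_taqi` and the label "uncertain" are hypothetical, and the +/+ case, which the text does not discuss, is treated here as uninterpretable.

```python
def infer_carrier_by_taqi(genotype, paternity_assumed=True):
    """Indirect carrier call for a sister of the propositus in family one.

    `genotype` is a pair of TaqI alleles, each "+" or "-".
    The "+" allele is only a linked marker for the mutant gene in this
    family; it has no functional meaning of its own.
    """
    alleles = sorted(genotype)
    if len(alleles) != 2 or any(a not in ("+", "-") for a in alleles):
        raise ValueError("genotype must be two alleles, each '+' or '-'")

    if alleles == ["-", "-"]:
        # No marker allele at all: noncarrier whoever the father was.
        return "noncarrier"
    if alleles == ["+", "-"]:
        # Carrier only if the "+" allele came from the mother, which
        # requires the father to have contributed "-".
        return "carrier" if paternity_assumed else "uncertain"
    # "+/+" conflicts with a father carrying "-" (assumption of this sketch).
    return "uncertain"


if __name__ == "__main__":
    print(infer_carrier_by_taqi(("+", "-")))
    print(infer_carrier_by_taqi(("-", "-"), paternity_assumed=False))
```

The rule makes the asymmetry in the text explicit: a noncarrier call survives a paternity error, whereas a carrier call does not. Recombination and genetic heterogeneity are not modeled.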

The diagnoses that could be made suffer from uncertainties due to the possibilities of nonpaternity, recombination between the mutation and the polymorphism (less than 1% assuming that a dramatic hotspot of recombination does not exist in the gene), and the possibility of as yet undiscovered genetic heterogeneity in hemophilia B. While the magnitudes of these uncertainties are not precisely known, genetic heterogeneity (see Discussion) is the major uncertainty in this family; an accuracy of diagnosis of 97% is an approximate but reasonable estimate based on current data.

[Figure 3. Panel A: pedigree of family one (HB24). Symbol key: hemophiliac (g at position 30119); obligate carrier (g/t); carrier by sequence analysis (g/t); noncarrier by sequence analysis (t/t); noncarrier by inference. Panel B: Direct Testing, sequencing lanes A, T, G, C at position 30119.]

Figure 3. A. The pedigree of family one (HB24 propositus) with indirect and direct DNA diagnosis data. The TaqI polymorphic alleles are shown. The presence of the TaqI site in intron d is indicated by "+" and its absence is indicated by "-". The HinfI and XmnI polymorphic alleles are not shown because they do not yield additional information. The logic of indirect diagnosis is described in the text. Sequencing of region G with GAWTS directly showed which females are carriers. The hemophiliac has a G instead of a T in the sense strand at position 30119. Carrier females have a G at position 30119 on one X chromosome and a T on the other. Noncarrier females are homozygous for T. As indicated, the cousin and her two oldest daughters were found to be carriers. Arrow indicates propositus. B. Direct sequencing gels. The gels show base 30119 of the antisense strand of the propositus, a carrier, and a noncarrier, respectively, in family one. The carrier is a heterozygote who displays two bands at base 30119. In the antisense strand, the mutation produces a C instead of an A. Arrows indicate the site of the mutation on the gel.
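The band patterns described in the figure legends (two bands of roughly half intensity in a carrier, only the normal band in a noncarrier) suggest a simple classification rule. The sketch below is illustrative only: the function name `call_female_genotype`, the normalization against a neighbouring reference band, and the cutoff `min_fraction=0.25` are assumptions of this example, not values from the study, where gels were read by eye.

```python
def call_female_genotype(normal_band, mutant_band, reference_band,
                         min_fraction=0.25):
    """Classify an at-risk female from band intensities at the mutation site.

    normal_band / mutant_band: signal of the normal and mutant bases.
    reference_band: typical signal of neighbouring bands in the same lane.
    min_fraction: hypothetical cutoff, chosen between 0 (band absent) and
    the roughly 0.5 expected for each allele of a heterozygote.
    """
    if reference_band <= 0:
        raise ValueError("reference_band must be positive")

    has_normal = normal_band / reference_band >= min_fraction
    has_mutant = mutant_band / reference_band >= min_fraction

    if has_normal and has_mutant:
        return "carrier"      # heterozygote: both bands present
    if has_normal:
        return "noncarrier"   # two copies of the functional sequence
    # A female lacking the normal band is unexpected; flag for repeat.
    return "uninterpretable"


if __name__ == "__main__":
    print(call_female_genotype(0.5, 0.5, 1.0))
    print(call_female_genotype(1.0, 0.0, 1.0))
```

Any real cutoff would need to be set from control lanes, such as the obligate carriers sequenced as positive controls.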

All 11 of the at-risk females could be diagnosed by direct testing with an error of less than 1%. To perform direct testing in family one, Regions A through H in the hemophiliac (HB24) were sequenced with GAWTS. The sequence showed only one change, a T -> G transversion at 30119 (Region G) which causes a Cys222 -> Trp substitution in the factor IX catalytic domain (Fig. 3B). Reamplification and resequencing confirmed the sequence change. We conclude that this represents the causative mutation because: (1) this is the only sequence change found in the regions of functional significance; (2) there is a very low rate of polymorphism in these regions [Koeberl et al., 1989] and this particular change has not been seen in more than 50 unrelated individuals; (3) the involved amino acid is evolutionarily conserved in the various species for which factor IX sequence is available (Fig. 4A), and (4) the amino acid is evolutionarily conserved in related serine proteases such as factor VII, factor X, protein C, and trypsinogen (Fig. 4B). GAWTS of Region G in the at-risk females in family one affirmed that the RFLP-based diagnoses were correct. The previously undiagnosed first cousin (III-3) and two first cousins once removed (IV-1, 2) were found to be carriers for a total of 11 diagnoses in the family (Fig. 3A). As positive controls, Region G was also sequenced in two obligate carriers.

Sporadic Hemophilia -- Indirect Testing

Additional advantages of direct diagnosis are

[Figure 4. Panel A: factor IX sequences (human, mouse, and other species) aligned at the residue changed to Trp in HB24. Remaining alignment and legend not recoverable.]

& A r g substitution in the catalytic domain was the only sequence change found (Fig. 5B). We conclude that it is the causative mutation for the same reasons By mentioned above for family one (Fig. 4). sequencing Region H-5' in the at-risk females, it was found that the mother and the two sisters wcre carriers while the grandmother, great aunt, the first cousins once removed, and, by inference, two second cousins (IV-4,IV-5) are noncarriers. The data indicate that the mutation must have arisen in either the egg of the grandmother or the sperm of the deceased grandfather, providing even further evidence that the causative mutation has been identified. Consequently, two more great aunts (11-3, 11-4) and a first cousin once removed (111-3) are also noncarriers for a total of 11 diagnoses in the family (Fig. 5A). The accuracy of direct testing is estimated at greater than 99%.

[Figure 5. Panel A: pedigree of family two (HB26). Symbol key: hemophiliac (a at position 31052); obligate carrier (a/g); carrier by sequence analysis (a/g); noncarrier by sequence analysis (g/g); noncarrier by inference. Panel B: sequencing lanes A, T, G, C at position 31052.]

Figure 5. A. The pedigree of family two (HB26 propositus) with direct and indirect DNA diagnosis data. The HinfI polymorphic alleles are shown. The presence of a 50 bp insertion is indicated by "+", while its absence is indicated by "-". The TaqI and XmnI polymorphisms are not shown because they do not provide additional information. As discussed in the text, uncertainty about the germ cell of origin of the mutation and nonparticipation by certain family members precludes an accurate diagnosis by indirect methods in all analyzed specimens except the first cousin once removed (III-5). Direct sequencing of the eight regions indicates that the hemophiliac has an A instead of a G in the sense strand at position 31052. By direct sequencing, 11 at-risk females were diagnosed. Arrow indicates propositus. B. Direct sequencing gels. The gels show base 31052 of the antisense strand of the propositus, a carrier and a noncarrier, respectively, in family two. In the antisense strand, the mutation produces a T instead of the C which is normally present. The carrier is a heterozygote who displays both the C and T bands at base 31052. Arrows indicate the site of the mutation on the gel.

DISCUSSION

A necessary but by no means sufficient criterion for RFLP-based carrier testing for X-linked diseases is the presence of two different alleles at the site of a known polymorphism in certain crucial females. For example, the mother must be heterozygous at a given locus on her X chromosome if the polymorphism is to be informative for diagnosis in an at-risk sister. Within a 60 kilobase segment which includes the factor IX gene, six polymorphic sites have been described in individuals of Northern European extraction [Winship et al., 1984; Winship and Brownlee, 1986; Koeberl et al., 1990; Giannelli et al., 1984; Camerino et al., 1985; Freedenberg et al., 1987; Hay et al., 1986]. Due to linkage disequilibrium (nonrandom association of a polymorphism at one site with another), these polymorphisms are uninformative and preclude RFLP analysis in 20% of families [Winship and Brownlee, 1986]. In other ethnic and racial groups, the known polymorphisms are much less informative. For example, the frequency of heterozygosity for one or both of the TaqI and XmnI polymorphisms was found to be 44, 28, 19, 6, and 0% in white Americans, black Americans, East Indians, Chinese, and Malaysians, respectively [Lubahn et al., 1987]. Thus, unlike hemophilia A, where the frequency of the known polymorphisms seems to be relatively constant in most ethnic groups, most families worldwide with hemophilia B are not heterozygous at the requisite sites of polymorphism. Moreover, the lack of participation by appropriate relatives (as in family two) and the presence of uncertainties due to nonpaternity, genetic heterogeneity, hotspots of recombination, new mutations, and germline mosaicism make RFLP-based diagnosis complex and subject to more error than generally appreciated. Indeed, when all sources of uncertainty are considered, it is unusual to have an accuracy of more than 97% in any RFLP test. Direct diagnosis of carrier status circumvents these problems, as shown in these two families.
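The heterozygosity figures quoted from Lubahn et al. translate directly into the share of women in each group for whom neither polymorphism can be informative. The following is a minimal arithmetic sketch: the percentages are those given in the text, while the dictionary and function names are hypothetical.

```python
# Percent of women heterozygous for one or both polymorphisms,
# as quoted in the text from Lubahn et al. (1987).
HETEROZYGOSITY_PERCENT = {
    "white Americans": 44,
    "black Americans": 28,
    "East Indians": 19,
    "Chinese": 6,
    "Malaysians": 0,
}


def percent_uninformative(heterozygous_percent):
    """Percent of women heterozygous at neither site (no informative RFLP)."""
    if not 0 <= heterozygous_percent <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    return 100 - heterozygous_percent


if __name__ == "__main__":
    for group, value in HETEROZYGOSITY_PERCENT.items():
        print(f"{group}: {percent_uninformative(value)}% uninformative")
```

Heterozygosity is only the necessary condition described above; the usable fraction is lower still once unavailable relatives and the other uncertainties are taken into account.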
Accurate diagnosis is possible by direct diagnosis with GAWTS in situations where RFLP analysis is uninformative. The utility of this approach is illustrated by several of the women involved in this study who have resolved uncertainties over whether they might pass the hemophilia trait to their children. Those diagnosed as carriers can now avail themselves of very accurate prenatal diagnosis. It should be emphasized that once the mutation has been delineated in a family, direct carrier testing need not be performed by sequencing. If the mutation alters a restriction site, for example, diagnosis can be made by DNA amplification, digestion with the appropriate restriction endonuclease, and detection of the mutant fragment by gel electrophoresis. Alternatively, amplification followed by hybridization with allele-specific oligonucleotides or an array of other recently developed techniques can be used [Landegren et al., 1988]. For large families and for diagnoses to be rendered by laboratories without expertise in sequencing, one of these alternate techniques may be preferable. However, direct sequencing will likely be easiest and most practical for small families.
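Whether a known single-base substitution alters a restriction site can be checked by scanning the normal and mutant sequences for the enzyme's recognition sequence. The sketch below is a hedged illustration: the two short sequences are invented for the example and are not factor IX sequence, the function names are hypothetical, and only the TaqI recognition sequence (TCGA) is taken as established fact. It assumes a substitution, so both sequences have the same length.

```python
def site_positions(sequence, site):
    """Return the 0-based start of every occurrence of `site` in `sequence`."""
    sequence, site = sequence.upper(), site.upper()
    width = len(site)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == site]


def restriction_site_effect(normal, mutant, site):
    """Describe how a substitution changes the sites for one enzyme."""
    if len(normal) != len(mutant):
        raise ValueError("sketch handles same-length substitutions only")
    before = set(site_positions(normal, site))
    after = set(site_positions(mutant, site))
    gained, lost = after - before, before - after
    if gained and lost:
        return "created and destroyed"
    if gained:
        return "created"
    if lost:
        return "destroyed"
    return "unchanged"


if __name__ == "__main__":
    TAQI = "TCGA"                      # TaqI recognition sequence
    normal = "GGATTGAACC"              # hypothetical sequence
    mutant = "GGATCGAACC"              # hypothetical T -> C substitution
    print(restriction_site_effect(normal, mutant, TAQI))   # created
    print(restriction_site_effect(mutant, normal, TAQI))   # destroyed
```

A result of "unchanged" for every available enzyme would point toward one of the other techniques mentioned above, such as allele-specific oligonucleotide hybridization.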

deletions (4 and 13 bp), and 27 single-base substitutions were found ([Koeberl et al., 1989] and unpublished data). Thus, hemophilia in each of these 30 cases could be attributed to a mutation in the factor IX gene. If an affected individual in the future were to lack an identifiable mutation, it would be possible for the defect to reside in a region of the factor IX gene that has not been sequenced. However, careful linkage studies in such a family could demonstrate genetic heterogeneity by showing that the disease does not segregate with the factor IX gene.

At present, the 8 regions described herein have been sequenced in more than 50 unrelated males [Koeberl et al., 19891. Polymorphisms have been found in only 2 sites examined in these individuals. It is conceivable that a rare polymorphism or sequence variant and not the causative mutation might be identified in a given hemophiliac. However, even in such a situation, carrier testing would still be reasonably accurate since there is a low probability of an error in diagnosis due to recombination between the intragenic polymorphism/rare variant and the undiscovered mutation elsewhere in the 34 kb factor IX gene (
