
Forensic Science International, 52 (1992) 181-191. Elsevier Scientific Publishers Ireland Ltd.

STATISTICAL ANALYSIS OF THE MEASUREMENT ERRORS IN THE DETERMINATION OF FRAGMENT LENGTH IN DNA-RFLP ANALYSIS

BIRTHE ERIKSENᵃ, AKSEL BERTELSENᵇ and OLE SVENSMARKᵃ

ᵃInstitute of Forensic Genetics and ᵇStatistical Research Unit, University of Copenhagen (Denmark)

(Received September 2nd, 1991) (Revision received November 22nd, 1991) (Accepted November 29th, 1991)

Summary

DNA from human whole blood samples was digested with the restriction enzyme HinfI and RFLP analysis was performed using the single locus probes MS1, MS31, MS43a and YNH24. The intergel variation of 3291 duplicate measurements of fragment lengths in terms of basepairs was investigated. The difference between two measurements of the same fragment on different gels increased approximately exponentially with increasing fragment length. After transformation of the fragment length into a normalized migration distance it was found that the difference between two transformed measurements was normally distributed with a S.D. (0.70 mm) which was independent of the fragment length. The errors of band 1 and band 2 on the same lane were correlated (r² = 0.8). A S.D. which is independent of the fragment length is useful in the calculation of frequencies, in retrieval procedures and in the calculation of likelihood ratios.

Key words: DNA-profiling; RFLP; VNTR; Single locus probes; Measurement errors; Correlation of errors; Statistical analysis

Introduction

DNA-RFLP analysis using VNTR probes results in a multitude of restriction fragments which, for practical purposes with current techniques, may be considered to represent a quasicontinuous distribution of alleles. This is in contrast to classical markers, which are expressed as a limited number of discrete alleles. Obviously, RFLP analysis requires the use of more elaborate statistics than those required in classical systems. First of all, it is highly important to analyze the measurement errors of the determination of fragment length. Without knowledge of the measurement error it is not possible to estimate frequencies, to calculate likelihood ratios or to perform efficient and reliable retrieval procedures.

Correspondence to: Birthe Eriksen, Institute of Forensic Genetics, Frederik V's Vej 11, DK-2100 Copenhagen, Denmark.

Abbreviations: bp, basepairs; kbp, kilobasepairs; RFLP, Restriction Fragment Length Polymorphism; VNTR, Variable Number of Tandem Repeats.


Despite this fact, investigations of the measurement errors in the determination of restriction fragment lengths have been scarce. Average measurement errors of about 0.6% of the fragment length in basepairs, i.e. an error proportional to the fragment length, have been reported [1-3]. Other observations indicate that the coefficient of variation (C.V.) increases with increasing fragment length [4,5]. Data on the intragel variation were reported by Gill et al. [6]. The S.D., as estimated in terms of migration distance normalized in relation to the distance between two size marker bands, was approximately constant (0.5-0.7 mm) in the range 1.6-12.2 kbp. The coefficient of variation calculated in units of basepairs increased from 1.4% at 1.6 kbp to 1.9% at 12 kbp with a minimum of 0.6% at 3 kbp. It was shown that the logarithm of the standard deviation (basepairs) was approximately directly proportional to the fragment size in the range 1.6-12.2 kbp.

In the present study measurement errors derived from 3291 duplicate measurements of restriction fragment lengths (1.5-20 kbp) were analyzed. The two measurements were performed on different gels, i.e. only the intergel variation was investigated. The aim of this study was to find a measure of the error which is independent of the fragment length. For this purpose a transformation of fragment length into normalized migration distance is suggested. A preliminary account of part of this study has been presented elsewhere [7].

Methods

Blood samples from 416 unrelated individuals involved in Danish criminal cases were collected. The blood was drawn either without anticoagulating agents or in ethylenediaminetetraacetic acid (EDTA) or acid citrate dextrose solution (ACD). DNA was recovered from the samples after phenol/chloroform extraction, essentially according to Ref. 8. DNA was restricted with HinfI (Boehringer) and the restriction checked by mini gel electrophoresis.
The concentration of DNA was determined by fluorimetry (Hoefer TKO 100 fluorimeter) using the fluorescent DNA-binding dye Hoechst 33258. Electrophoresis was performed in 7 x 200 x 200 mm gels of 0.7% agarose (Sigma type II) in a Pharmacia GNA 200 apparatus. The electrophoresis buffer was TBE, pH 8.8 (134 mM Tris, 75 mM H3BO3, 2.5 mM EDTA), containing 0.5 µg/ml ethidium bromide. The gels were run at room temperature overnight at 60 V (maximum 75 mA) until the 2.3 kbp band of the molecular weight marker had reached 14 cm from the application point. A genomic control (human DNA), three lanes of the molecular weight marker (Amersham SJ5000; two in the outside lanes and one in the middle of the gel) and up to 18 DNA samples were run in each gel. The amount of DNA applied was 1 µg for each sample. The amount of marker DNA was as recommended by the supplier. DNA was blotted onto nylon filters (Hybond N, Amersham) using Southern blotting. The filters were subsequently hybridized with the VNTR probes MS1, MS31, MS43a (Cellmark Diagnostics) and YNH24 (Promega Corporation). The probes were labeled with deoxycytidine 5'-[α-32P]triphosphate ([α-32P]dCTP) using random priming before hybridisation [9]. The hybridisation was performed at 62°C for 16 h according to Ref. 8. The filters were washed three times for 30 min at 65°C in 1 x standard saline citrate (SSC), 0.1% sodium dodecyl sulphate (SDS) containing 50 µg/ml herring sperm DNA and twice at 65°C for 30 min in the same solution diluted ten times. Autoradiography was carried out using Fuji X-ray film HR-G with intensifying screens at -80°C.

The positions of size marker and sample fragments on the autoradiograms were measured by hand. The migration distance was measured from the point of application, which is visible on the autoradiogram, to the centre of the fragment. The measurements were performed by two different persons and the mean was stored in the database. The fragment lengths were calculated by the local reciprocal hyperbolic method of Elder and Southern [10]. Correction for transversal skewness of the migration was obtained by linear interpolation between corresponding pairs of size marker bands. The calculated fragment lengths as well as the migration distances were stored in the database together with relevant information on the sample and the electrophoretic run. Curve fitting was performed by means of the routines included in SigmaPlot (Jandel Corporation, Corte Madera, CA).

The part of the database used in this study contained 6582 determinations on 416 different DNA samples with 4 probes. None of the measurements were included twice.

The following terms will be used: unless otherwise stated, S.D. is the S.D. of single observations; S.D.(bp) is the S.D. as estimated from fragment lengths measured in basepairs; S.D.(mm) is the S.D. as estimated from fragment lengths transformed into normalized migration distances (mm).

Results

Migration distance as a function of fragment length

From 255 size marker lanes, each with 12 fragments, the distance from the application point to each band was measured. For each lane the data were normalized by multiplication by the factor $100/(d_{10} - d_1)$, where $d_{10}$ and $d_1$ are the migration distances (mm) for band 10 and band 1, respectively (Table 1). Curve fitting was carried out on the averaged normalized data. Using m as migration distance (mm) and b as fragment length (kbp), the function

$$f(b) = m = \frac{796}{3.7 + b^{1.5}} + 32.3$$

was shown to fit the means of the normalized migration distances as a function of fragment length (Fig. 1).

Measurement errors in duplicate determinations

Duplicate length measurements (kbp) of 3291 restriction fragments were collected from our database. The two measurements were performed on different plates. The first measurement (from the plate with the lowest registration number) was designated as $b_1$ and the second measurement as $b_2$. The difference $(b_1 - b_2)/2$ was plotted against the mean value of the fragment length


TABLE 1

MIGRATION DISTANCES FROM THE APPLICATION POINT TO EACH BAND IN THE SIZE MARKER LANE WERE MEASURED AND NORMALIZED BY MULTIPLICATION BY THE FACTOR $100/(d_{10} - d_1)$, WHERE $d_{10}$ AND $d_1$ ARE THE MIGRATION DISTANCES (mm) FOR BAND 10 AND BAND 1, RESPECTIVELY. THE AVERAGE DISTANCES AND S.D. ARE GIVEN FOR THE NORMALISED DATA. 255 LANES WERE MEASURED.

Band    Fragment length (kbp)    Normalized migration distance (mm)
                                 Mean        S.D.
 1      22.01                     40.0       2.7
 2      19.32                     41.6       2.7
 3      13.29                     47.5       3.0
 4       9.69                     55.2       3.1
 5       7.74                     63.4       3.0
 6       6.22                     73.4       3.0
 7       4.25                     96.2       2.9
 8       3.47                    111.5       2.8
 9       2.69                    130.7       2.7
10       2.39                    140.0       2.7
11       1.88                    158.2       2.8
12       1.48                    177.6       3.0
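As a rough check of the fitted curve against Table 1, the sketch below evaluates f(b) = 796/(3.7 + b^1.5) + 32.3 at the twelve marker lengths and lists the residuals against the tabulated means. The exponent 1.5 is an assumption: it is the value that reproduces the means in Table 1. The names `migration_mm`, `normalize_lane` and `TABLE_1` are illustrative and do not come from the paper.

```python
# Sketch: compare the fitted curve with the normalized means in Table 1.
# Assumption: the exponent of b in the fitted function is 1.5.

# (fragment length in kbp, mean normalized migration distance in mm), bands 1-12
TABLE_1 = [
    (22.01, 40.0), (19.32, 41.6), (13.29, 47.5), (9.69, 55.2),
    (7.74, 63.4), (6.22, 73.4), (4.25, 96.2), (3.47, 111.5),
    (2.69, 130.7), (2.39, 140.0), (1.88, 158.2), (1.48, 177.6),
]


def migration_mm(b_kbp: float) -> float:
    """Normalized migration distance (mm) for a fragment of b_kbp kilobasepairs."""
    return 796.0 / (3.7 + b_kbp ** 1.5) + 32.3


def normalize_lane(distances_mm: list[float]) -> list[float]:
    """Scale one marker lane by 100/(d10 - d1); bands are listed from 1 to 12."""
    factor = 100.0 / (distances_mm[9] - distances_mm[0])
    return [d * factor for d in distances_mm]


# Fitted value minus tabulated mean for each marker band.
residuals = [migration_mm(b) - mean for b, mean in TABLE_1]

if __name__ == "__main__":
    for (b, mean), res in zip(TABLE_1, residuals):
        print(f"{b:6.2f} kbp  table {mean:6.1f} mm  fit {mean + res:6.1f} mm  residual {res:+.2f} mm")
```

Under this assumption every residual stays below 1 mm, which is small compared with the tabulated S.D. of about 3 mm.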


Fig. 3. (○) The data were grouped in intervals of 1 kbp and the S.D.(bp) of the measurement error was estimated and plotted as a function of the fragment length (kbp); (solid line) the coefficient of variation (S.D.(bp) in percent of the fragment length) as calculated from the average S.D.(mm) = 0.50 mm, plotted against the fragment length (kbp); (dashed line) log10 S.D.(bp) plotted against the fragment length.
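The solid curve described in this caption can be approximated by propagating a constant S.D.(mm) through the fitted function f(b) = 796/(3.7 + b^1.5) + 32.3. The sketch below is not the authors' procedure, which uses the inverse of f; it uses a first-order approximation, S.D.(bp) ≈ S.D.(mm)/|f'(b)|, and assumes the exponent 1.5, the value consistent with Table 1. The function names are illustrative.

```python
# Sketch: coefficient of variation in basepair units implied by a constant
# S.D. of 0.50 mm, using first-order error propagation through f(b).
# Assumptions: exponent 1.5 in f(b); linearization instead of the exact inverse.

SD_MM = 0.50  # S.D. of a single observation in normalized migration distance (mm)


def slope_mm_per_kbp(b_kbp: float) -> float:
    """Derivative f'(b) of f(b) = 796/(3.7 + b**1.5) + 32.3 (negative slope)."""
    return -796.0 * 1.5 * b_kbp ** 0.5 / (3.7 + b_kbp ** 1.5) ** 2


def sd_bp(b_kbp: float, sd_mm: float = SD_MM) -> float:
    """Approximate S.D. in basepairs at fragment length b_kbp."""
    return 1000.0 * sd_mm / abs(slope_mm_per_kbp(b_kbp))


def cv_percent(b_kbp: float, sd_mm: float = SD_MM) -> float:
    """S.D.(bp) expressed as a percentage of the fragment length."""
    return 100.0 * sd_bp(b_kbp, sd_mm) / (1000.0 * b_kbp)


if __name__ == "__main__":
    for b in (1.5, 3.0, 5.0, 10.0, 20.0):
        print(f"{b:5.1f} kbp  S.D. {sd_bp(b):7.1f} bp  C.V. {cv_percent(b):.2f}%")
```

At 3 kbp this approximation gives a C.V. of about 0.64%, and the C.V. rises steadily towards longer fragments. The exact values depend on the assumed exponent and on the linearization, so they need not coincide with the plotted curve.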

Fig. 4. The deviation of duplicate measurements of 3291 fragment lengths transformed into mm, $[f(b_1) - f(b_2)]/2$, as a function of the fragment length (kbp).

Fig. 5. The distribution of the deviations of duplicate measurements of 3291 fragment lengths, $[f(b_1) - f(b_2)]/2$. (solid line) The fitted curve of the normal distribution (mean = 0.03 mm, variance = 0.10 mm).
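One common way to turn duplicate measurements like those summarized in Figs. 4 and 5 into a single-observation S.D. is the duplicate-pair estimator, S.D. = sqrt(Σ d_i² / 2n), where d_i is the difference within pair i. The paper does not state which estimator was used, so the sketch below is an assumption, and the values in `pairs_mm` are made up for illustration.

```python
import math

# Sketch: S.D. of a single observation estimated from duplicate measurements.
# Assumptions: both measurements of a pair share the same true value and the
# same error variance; the estimator is the standard duplicate-pair formula,
# not necessarily the one used in the paper.


def sd_from_duplicates(pairs: list[tuple[float, float]]) -> float:
    """Return sqrt(sum(d_i**2) / (2 * n)) for n pairs with differences d_i."""
    if not pairs:
        raise ValueError("at least one pair is required")
    sum_sq = sum((x1 - x2) ** 2 for x1, x2 in pairs)
    return math.sqrt(sum_sq / (2 * len(pairs)))


# Hypothetical transformed duplicate measurements (mm), not data from the paper.
pairs_mm = [(85.2, 84.6), (140.1, 140.9), (55.0, 55.5), (96.4, 95.7)]

if __name__ == "__main__":
    sd_single = sd_from_duplicates(pairs_mm)
    # The difference of two independent measurements has an S.D. sqrt(2) times larger.
    print(f"S.D. single: {sd_single:.2f} mm")
    print(f"S.D. of difference: {math.sqrt(2) * sd_single:.2f} mm")
```

With a single-observation S.D.(mm) of 0.50 mm, the same factor sqrt(2) gives about 0.71 mm for the difference between two measurements, in line with the 0.70 mm quoted in the Summary.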

of the error which is independent of the fragment length. S.D.(mm) was estimated to be 0.50 mm.

Furthermore, the S.D.(mm) was investigated for multiple determinations of the same sample. The fragment lengths of the genomic control were measured in 70 gels. The fragment lengths and S.D.(mm) are given in Table 2. It will be seen that in this case S.D.(mm) is also independent of the fragment length but slightly lower than the average of 0.50 mm found above. This is most probably due to the fact that the genomic control was always placed adjacent to a marker lane.

TABLE 2

AVERAGE AND S.D.(mm) OF MULTIPLE (70) DETERMINATIONS OF THE GENOMIC CONTROL

Probe     Fragment length (kbp)    S.D.(mm)
MS1       14.068                   0.42
MS1       12.461                   0.36
MS31       7.347                   0.45
MS31       7.375                   0.40
MS43a      5.937                   0.44
MS43a      4.629                   0.42
YNH24      4.032                   0.37
YNH24      2.639                   0.43

S.D.(mm) can be transformed into S.D. measured in units of basepairs, S.D.(bp), using the inverse function of f. In this way S.D.(bp) can be calculated as a function of the fragment length (full line in Fig. 3). It is seen that this function corresponds to the experimental results. log10 S.D.(bp) was plotted as a function of the fragment length (dashed line in Fig. 3; see Discussion).

Correlation of errors

Many factors that contribute to the measurement errors will be common to both bands in the same lane. Such factors may be distortion of the gel, imprecise detection of the application point or insufficient correction for transversal skewness of the migration and band-shift. A correlation of the measurement errors of the two bands in one lane has been reported [11]. It was investigated whether this also applies to the transformed errors. From the database 1583 band pairs measured on two different plates were collected. In Fig. 6 the measurement errors of band 2 (the low molecular band) were plotted against the errors of band 1. The plot indicates a clear correlation (r² = 0.8). To investigate whether this correlation depends on the fragment length, the data were grouped according to the length of the high molecular band and the correlation was calculated. The correlation was essentially independent of the fragment length (Table 3). On the other hand, the correlation decreased with increasing distance between the two bands (Table 4).

Fig. 6. The correlation between the measurement errors $[f(b_1) - f(b_2)]/2$ of two bands on the same lane. Abscissa, measurement error of the high molecular band 1 after transformation into normalized migration distance (mm); ordinate, measurement error of the low molecular band 2 after transformation into normalized migration distance (mm).

TABLE 3

CORRELATION OF MEASUREMENT ERRORS (TRANSFORMED) ASSOCIATED WITH THE DETERMINATION OF TWO BANDS IN ONE LANE. THE DATA WERE GROUPED ACCORDING TO THE LENGTH OF THE HIGH MOLECULAR WEIGHT BAND.

Fragment length (kbp)    r²      Number of band pairs
1.5-20                   0.80    1583
1.5-5                    0.80     452
5-10                     0.82     813
10-20                    0.79     318

TABLE 4

CORRELATION OF MEASUREMENT ERRORS (TRANSFORMED) ASSOCIATED WITH THE DETERMINATION OF TWO BANDS IN ONE LANE. THE DATA WERE GROUPED ACCORDING TO THE DISTANCE BETWEEN THE TWO BANDS (mm).

Distance between band 2 and band 1 (mm)    r²      Number of band pairs
0-10                                       0.88    678
10-20                                      0.83    366
20-30                                      0.76    229
30-40                                      0.70    164
40-50                                      0.69     71
50-80                                      0.66     64
80-125                                     0.69     21

Discussion

It was shown in this study that it is possible to obtain a measure of the error which is independent of the fragment length. The S.D. as estimated in units of normalized migration distance was 0.50 mm. The S.D. of the difference between two measurements accordingly was $\mathrm{S.D.} \times \sqrt{2} = 0.70$ mm. In terms of basepairs the S.D. varied with the fragment length from 0.65% at 3 kbp to about 3% at 20 kbp. This agrees with previous findings. The finding of Gill et al. [6] that the logarithm of S.D.(bp) increases linearly with the fragment length could not be confirmed (dashed line in Fig. 3).

A measure of the error which is independent of the fragment length is preferable in the handling of RFLP data. When calculating likelihood ratios it is an obvious advantage to use a measure of S.D. which does not vary with the fragment length. In the estimation of frequencies and in retrieval from databases it is most convenient to use a window of constant width. It is, however, essential that the window represents a measure of the error which is independent of the fragment length.
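A constant-width window of this kind could be applied as in the sketch below, which declares two measurements a match when their transformed values differ by no more than k times the S.D. of a difference (0.70 mm). The factor `k = 3`, the function names and the exponent 1.5 in f(b) are assumptions made for illustration; the paper does not prescribe a window width.

```python
# Sketch: match two fragment length measurements with a window of constant
# width in normalized migration distance (mm).
# Assumptions: exponent 1.5 in f(b); window half-width k * 0.70 mm with k = 3.

SD_DIFF_MM = 0.70  # S.D. of the difference between two transformed measurements


def migration_mm(b_kbp: float) -> float:
    """Transform a fragment length (kbp) into normalized migration distance (mm)."""
    return 796.0 / (3.7 + b_kbp ** 1.5) + 32.3


def is_match(b1_kbp: float, b2_kbp: float, k: float = 3.0) -> bool:
    """True if the transformed measurements differ by at most k * SD_DIFF_MM."""
    return abs(migration_mm(b1_kbp) - migration_mm(b2_kbp)) <= k * SD_DIFF_MM


if __name__ == "__main__":
    print(is_match(5.00, 5.05))  # 50 bp apart near 5 kbp
    print(is_match(5.0, 5.5))    # 500 bp apart near 5 kbp
    print(is_match(20.0, 21.0))  # 1000 bp apart near 20 kbp
```

With these assumptions the first and third pairs fall inside the window and the second does not, which illustrates how one window in mm corresponds to very different tolerances in basepairs for short and long fragments.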
Nonetheless, it is usual to use a window of constant width in terms of % basepairs, e.g. 5.6% [12,13]. Such a window will not be constant with respect to the measurement error. According to our study it ranges from about 10 x S.D.(mm) at 1.5 kbp to less than 2 x S.D.(mm) at 20 kbp. The correlation of the measurement errors of the two bands on the same lane may be taken into account in the estimation of frequencies, in searching and in the calculation of likelihood ratios [11,12].

The function used for the transformation was based on the relation between the fragment length (kbp) and migration distance (mm) obtained with our electrophoretic protocol. Other protocols may require other functions, which could be derived as described in this study. However, the easiest way to obtain a S.D. independent of fragment length is to use the primary data, i.e. the migration distance as obtained from the scanning equipment. To compensate for the run-to-run variation the measure should be normalized in relation to the distance between two bands of the size marker [6].

Acknowledgements

We are obliged to Mrs. Jane Hellung Lauridsen and to Mrs. Susanne Billesbølle for excellent technical assistance.

References

1 M. Baird, I. Balazs, A. Giusti, L. Miyazaki, L. Nicholas, K. Wexler, E. Kanter, J. Glassberg, F. Allen, P. Rubinstein and L. Sussman, Allele frequency distribution of two highly polymorphic DNA sequences in three ethnic groups and its application to the determination of paternity. Am. J. Hum. Genet., 39 (1986) 489-501.
2 C. Brenner and J.W. Morris, Paternity index calculation in single locus hypervariable DNA probes: Validation and other studies. Proc. Int. Symp. Human Identification, Promega Corporation, Madison, WI, 1989, pp. 21-54.
3 D.J. Endean, RFLP analysis for paternity testing: Observations and caveats. Proc. Int. Symp. Human Identification, Promega Corporation, Madison, WI, 1989, pp. 55-76.
4 J.A. Chimera, C.R. Harris and M. Litt, Population genetics of the highly polymorphic locus D16S7 and its use in paternity evaluation. Am. J. Hum. Genet., 45 (1989) 926-931.
5 D.A. Galbraith, P.T. Boag, H.L. Gibbs and B.N. White, Sizing bands on autoradiograms: A study of precision for storing DNA fingerprints. Electrophoresis, 12 (1991) 210-220.
6 P. Gill, K. Sullivan and D.J. Werrett, The analysis of hypervariable DNA profiles: problems associated with the objective determination of the probability of a match. Hum. Genet., 85 (1990) 75-79.
7 O. Svensmark and B. Eriksen, Measurement errors in DNA-profiling. Proc. Symp. Human Identification, Promega Corporation, Madison, WI, 1991, p. 322.
8 J.C. Smith, C.R. Newton, A. Alves, R. Anwar, D. Jenner and A.F. Markham, Highly polymorphic minisatellite DNA probes. Further evaluation for individual identification and paternity testing. J. Forensic Sci. Soc., 30 (1990) 3-18.
9 J.C. Smith, R. Anwar, J. Riley, D. Jenner, A.F. Markham and A.J. Jeffreys, Highly polymorphic minisatellite sequences: Allele frequencies and mutation rates for five locus-specific probes in a Caucasian population. J. Forensic Sci. Soc., 30 (1990) 19-32.
10 J.K. Elder and E.M. Southern, Computer-aided analysis of one-dimensional restriction fragment gels, In: M.J. Bishop and C.J. Rawlings (eds.), Nucleic Acid and Protein Sequence Analysis, IRL Press, Oxford, 1987, pp. 165-172.
11 I.W. Evett and D.J. Werrett, Bayesian analysis of single locus DNA profiles. Proc. Int. Symp. Human Identification, Promega Corporation, Madison, WI, 1989, pp. 77-101.
12 I.W. Evett and P. Gill, A discussion of the robustness of methods for assessing the evidential value of DNA single locus profiles in crime investigations. Electrophoresis, 12 (1991) 226-230.
13 P. Gill, I.W. Evett, S. Woodroffe, J.E. Lygo, E. Millican and M. Webster, Databases, quality control and interpretation of DNA profiling in the Home Office Forensic Science Service. Electrophoresis, 12 (1991) 204-209.
