Percentiles as Reference Values for Laboratory Data ROBERT G. ROSSING, M.D., PH.D., AND WILLIAM E. HATCHER III, B.A.

Rossing, Robert G., and Hatcher, William E., III: Percentiles as reference values for laboratory data. Am J Clin Pathol 72: 94-97, 1979. It has been suggested that laboratory results be reported, not only in conventional units, but also in terms of equivalent percentiles from a reference population. This would afford the clinician an easily interpretable measure of how unusual such a result would be in the reference population. This paper describes a modified method for estimation of reference percentiles. This method does not require any assumption about the distribution of the reference population, but it has a smaller variability than that from earlier methods based on interpolation. This method is now being used to furnish percentile values for spirometric data in the authors' laboratory. (Key words: Percentiles; Reference values; Laboratory data.)

Research Laboratory, Veterans Administration Center, Temple, Texas, and College of Medicine, Texas A & M University, College Station, Texas

THE INTERPRETATION of clinical laboratory results requires information regarding the distribution of values for the test in a reference population. Traditionally, this has been provided as a "normal range," which may be a function of age, sex and other factors. Recently, various investigators 4,7,12,17 have questioned the utility of such "normal ranges." Briefly, the reasons for these questions are two: (1) the conceptual problem involved in the implication that patients whose results lie within this range are "normal" and all others are "abnormal"; (2) the inaccuracies of such ranges when derived by the application of gaussian methods to data that do not follow the gaussian distribution. Recognition of the second problem has caused several investigators to suggest the use of percentiles to describe the reference population, since these can be derived without assumption regarding the underlying distribution. 6-9,18,19 These investigators, however, still discuss the problem in terms of a "normal range" and offer the percentile approach merely as a nonparametric method of defining such a range. Several suggestions of methods for relating laboratory results to a reference population without the definition of a single "normal range" have been made. Unfortunately, the majority of these are based on the mean and a dispersion unit that is a function of the standard deviation. 1,2,8,10,20 Thus, by implication, the reference population is assumed to have a gaussian

Methods

Estimation of Percentile

Most investigators 4,7,9,13 have suggested that the pth percentile of a distribution containing N ranked values be defined as the value whose rank is equal to i, calculated as

$$i = p \cdot (N + 1) \tag{1}$$

This is based on the fact that, in a uniform distribution, the value of p is given by

$$p_i = \frac{i}{N + 1} \tag{2}$$

However, David 3 suggests that for other distributions

Received April 7, 1978; accepted for publication June 12, 1978. Supported by the Medical Research Service of the Veterans Administration. Address reprint requests to Dr. Rossing: Research Laboratory, Veterans Administration Center, Temple, Texas 76501.

0002-9173/79/0700/0094 $00.70 © American Society of Clinical Pathologists


Downloaded from http://ajcp.oxfordjournals.org/ by guest on June 7, 2016

distribution. Lo and co-workers 11 suggest a "clinical unit," which is derived from their "clinical limits," but this really reintroduces the concept of the "normal range" and also implies that the reference population distribution is symmetrical if not necessarily gaussian. Elveback 4,5 and Feinstein 7 have addressed this problem by suggesting that the percentile equivalent of each laboratory result be reported along with the result in conventional units. This would afford the clinician an easily interpretable measure of how unusual such a result would be in the reference population without any assumption as to the population distribution. It is true, as was pointed out by Lo and associates, 11 that percentile units are of little utility for the description of values in the clearly abnormal range other than to indicate that the value is "below the first" or "above the 99th" percentile. However, in the "grey zone" i.e., in the distal 10 or 20% of the reference population distribution where the clinician frequently needs interpretative help, the percentile technic provides information that can be very useful. This paper describes a modification of the usual method of estimation of percentiles and illustrates its clinical use in reporting results of spirometric tests.

BRIEF SCIENTIFIC REPORTS

Vol. 72 . No. 1

FIG. 3. Lower 20% of data points from Figure 1. 5th and 10th percentiles estimated by interpolation. [Plot omitted. Abscissa: percent predicted; ordinate: cumulative probability on a probability scale. Legible annotation: P.05 = 58.8.]

FIG. 1. The cumulative probability distribution for 148 values of maximum respiratory flow rate measured at 30% of vital capacity (V30). Subjects are asymptomatic female non-smokers. Abscissa is in terms of percentage of predicted normal (prediction based on age regression15). Note satisfactory linearity in tails of distribution. [Plot omitted. Abscissa: percent predicted.]

FIG. 4. Lower 20% of data points from Figure 1 with line fitted as described in text. 5th and 10th percentiles are estimated from regression coefficients. [Plot omitted. Abscissa: percent predicted.]

FIG. 2. The cumulative probability distribution for 100 values of haptoglobin given by Reed and associates.13 Despite overall curvature there is no systematic deviation from linearity in the tails. [Plot omitted. Abscissa: haptoglobin, mg/100 ml.]

a better estimator of p is

$$p_i = \frac{i - .4}{N + .2} \tag{3}$$

which, upon rearrangement, gives for i the formula

$$i = p \cdot (N + .2) + .4 \tag{4}$$
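As an illustration only, the two rank formulas, i = p(N + 1) and i = p(N + .2) + .4, can be compared with a short Python sketch. The function names are hypothetical and are not taken from the authors' program. Non-integer ranks are resolved here by linear interpolation between the adjacent ranked values, and clamping the rank to the range 1 to N is an added assumption for extreme values of p.

```python
def rank_uniform(p, n):
    """Rank for the p-th quantile (p as a fraction): i = p (N + 1)."""
    return p * (n + 1)


def rank_david(p, n):
    """Rank from the modified formula: i = p (N + 0.2) + 0.4."""
    return p * (n + 0.2) + 0.4


def plotting_position(i, n):
    """Inverse of rank_david: p_i = (i - 0.4) / (N + 0.2)."""
    return (i - 0.4) / (n + 0.2)


def percentile_by_interpolation(values, p, rank_fn=rank_david):
    """Estimate the p-th quantile by linear interpolation between the
    ranked values immediately below and above the fractional rank."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: ranks outside 1..N are clamped to the nearest end.
    i = min(max(rank_fn(p, n), 1.0), float(n))
    lower = int(i)        # rank immediately below (1-based)
    frac = i - lower      # fractional part of the rank
    if lower >= n:
        return ordered[-1]
    below = ordered[lower - 1]
    above = ordered[lower]
    return below + frac * (above - below)
```

For N = 148 and p = .05, the first formula gives i = 7.45 and the second gives i = 7.81, so the two estimates differ by about one third of the spacing between the seventh and eighth ranked values.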

This results in estimates very similar to those given by the earlier formula except for values of p in the extremes of the distribution.16 Either method often gives values for i that are not integers. When this occurs, the usual suggestion has been to estimate the percentile by linear interpolation between the values whose ranks are immediately above and below the calculated i.4-7,9,13 As mentioned above, our objective was to provide percentile equivalents for all laboratory values at least in the lower and upper segments of the distribution of


Test        Predicted   Observed   Percentile
FVC         [not legible in source]
FEV1        3.80        3.39       17
FEV1/FVC    79.7        78.1       >20
MMEF        4.24        3.27       16
V30         2.43        1.70       6
V20         1.30        .82        5

FIG. 5. Sample laboratory report illustrating value of percentile reporting.

the reference population. This would obviously require multiple interpolations. It seemed that a preferable procedure would be to fit smooth lines through the extreme segments of the cumulative distribution and estimate all desired percentiles from the equation of fit. We also found that this procedure had the additional advantage of producing narrower confidence limits.16 Investigation of several theoretical cumulative probability distributions, including those with moderate degrees of skewness, revealed that when plotted on probability paper their terminal segments (from 1-20% and from 80-99%) were quite linear,16 and the same was found to be true of the distributions of laboratory values (Figs. 1 and 2). We, therefore, fitted a line through the lowest and highest 20% of data points using as the dependent variable the observations, and as the independent variable the probability transformation of (i - .4)/(N + .2). Using the resulting regression equations, the percentile equivalent of any observed value could be calculated. The same result could also be obtained graphically from a plot on probability grid, as is shown in Figure 4.

Results

Figure 3 displays the data of the lower end of Figure 1 with the fifth and tenth percentiles estimated by linear interpolation. Figure 4 shows the estimates obtained from the same data by the regression method. The percentile estimates agree relatively closely, but estimates of all other percentiles may also be obtained from the regression line. Table 1 gives the correlation coefficients obtained by this same regression procedure when applied to other spirometric variables, demonstrating that the data are fit equally well by this procedure for all variables with which we were concerned.

Table 1. Correlation Coefficients Obtained during Regression of Percentage of Predicted Values on Percentile

                                                  Correlation Coefficient
Variable                                          Male      Female
Forced vital capacity (FVC)                       .980      .989
Forced expiratory volume in 1 sec (FEV1)          .993      .986
Ratio of FEV1/FVC                                 .943      .973
Maximum mid-expiratory flow (MMEF)                .978      .981
Flow rate at 30% of vital capacity (V30)          .959      .994
Flow rate at 20% of vital capacity (V20)          .980      .989

Discussion

The spirometric results displayed in Figures 1, 3 and 4 are in units that may be unfamiliar to readers not acquainted with pulmonary function testing. Most spirometric variables are highly correlated with age and some also with height. Therefore, regression equations involving age or age and height are routinely used to calculate a predicted value for the subject being tested. After this predicted value is derived, two practices are commonly followed in pulmonary laboratories. The individual test result is related to the predicted either in terms of volume deviation (e.g., "0.45 L less than predicted") or in terms of percentage (e.g., "72% of predicted"). For reasons discussed elsewhere15 we prefer the latter. However, we have found that the general shape of the cumulative distribution shown in Figure 1 and the observation concerning the approximate linearity in the tails remain the same when volume deviation is considered rather than percentage predicted.

Another peculiarity of spirometric testing is that only deviations below predicted are of clinical significance. Therefore, if a single normal limit is desired, it is customary to exclude either 5 or 10% of values at the lower end. However, the technic illustrated can be used equally well for tests where deviations in either direction are significant and where a normal range is commonly chosen so as to exclude either 2.5 or 5% at either end of the distribution.

As we have illustrated, in both theoretical and empirical distributions with moderate degrees of deviation from normality, the distal 20% at either end are satisfactorily linearized by a probability transformation. However, this would probably not be the case with more bizarre distributions, such as those theoretical distributions illustrated by Reed and Wu.14 We have found that for the χ² distribution, linearization becomes progressively less satisfactory as the degrees of freedom fall below 8. We suggest, therefore, that before utilizing the probability transformation, its validity for the data set concerned be investigated by a plot similar to that of Figure 1. However, as well shown by Reed and Wu,14 other transformation methods also experience increasing difficulty as the distributions become more bizarre. Fortunately, most biologic data,


References

1. Amador E: Normal ranges, Progress in Clinical Pathology. Vol 5. Edited by M Stefanini. New York, Grune and Stratton, 1973, pp 59-83
2. Casey AE, Downey E: Further use of statens in the recording, reporting, analysis and retrieval of automated computerized laboratory data. Am J Clin Pathol 53:748-754, 1970
3. David HA: Order Statistics. New York, John Wiley and Sons, Inc., 1970, pp 66-67
4. Elveback LR: How high is high? A proposed alternative to the normal range. Mayo Clin Proc 47:93-97, 1972
5. Elveback L: The population of healthy persons as a source of reference information. Hum Pathol 4:9-16, 1973
6. Elveback LR, Taylor WF: Statistical methods of estimating percentiles. Ann NY Acad Sci 161:538-548, 1969
7. Feinstein AR: Clinical biostatistics. XXVII. The derangements of the "range of normal." Clin Pharmacol Ther 15:528-540, 1974
8. Grasbeck R: "Normal" and "reference" values for laboratory data. Lancet 1:244-245, 1976
9. Herrera L: The precision of percentiles in establishing normal limits in medicine. J Lab Clin Med 52:34-42, 1958
10. Lennox B: Normalisation: A possible general solution to the units problem. Lancet 2:1085-1086, 1975
11. Lo JS, Kellen JA, Moore RW: Expressing results of laboratory tests. Clin Chem 22:1759-1760, 1976
12. Mainland D: Remarks on clinical "norms." Clin Chem 17:267-274, 1971
13. Reed AH, Henry RJ, Mason WB: Influence of statistical method used on the resulting estimate of normal range. Clin Chem 17:275-284, 1971
14. Reed AH, Wu GT: Evaluation of a transformation method for estimation of normal range. Clin Chem 20:576-581, 1974
15. Rossing RG, Hatcher WE: Estimation of percentiles in laboratory data, Proceedings of the 16th San Diego Biomedical Symposium. Edited by Jl Martin. New York, Academic Press, 1976, pp 459-463
16. Rossing RG, Hatcher WE: A computer program for estimation of reference percentile values in laboratory data. Computer Programs in Biomedicine 9:69-74, 1979
17. Sunderman FW Jr: Current concepts of "normal values," "reference values," and "discrimination values" in clinical chemistry. Clin Chem 21:1873-1877, 1975
18. Thompson WR: Biological applications of normal range and associated significance tests in ignorance of original distribution forms. Ann Math Stat 9:281-287, 1938
19. Werner M, Marsh WL: Normal values: Theoretical and practical aspects. CRC Crit Rev Clin Lab Sci 5:81-100, 1975
20. Wright BM: Normal = 10 ± 2. Lancet 2:1261, 1975

Lymphatic Invasion in Pigmented Nevi MARTHA EMMA ADELINE BELL, M.D., DONALD PATRICK HILL, M.D., AND M. KRISHNA BHARGAVA, M.D.

Bell, Martha Emma Adeline, Hill, Donald Patrick, and Bhargava, M. Krishna: Lymphatic invasion in pigmented nevi. Am J Clin Pathol 72: 97-100, 1979. All 124 pigmented nevi registered at the Canadian Tumour Reference Centre between

Canadian Tumour Reference Centre, Department of Pathology, University of Ottawa, Ottawa, Ontario, Canada

Received April 25, 1978; received revised manuscript and accepted for publication August 31, 1978. Presented at the Canadian Congress of Laboratory Medicine, June 1977, Hamilton, Ontario, Canada. Address reprint requests to Dr. Bell: Canadian Tumour Reference Center, Department of Pathology, University of Ottawa, Ottawa K1N 9A9, Ontario, Canada.

July 1958 and May 1969 were reviewed. Nevus cells invading endothelial lined spaces were observed in serial sections from five cases. The significance of this finding is discussed in relation to published reports of the presence of nevus cells in lymph nodes. (Key words: Pigmented nevus; Lymphatics; Metastasis.)

0002-9173/79/0700/0097 $00.70 © American Society of Clinical Pathologists


although not normally distributed, do not manifest extreme degrees of skewness or kurtosis, and linearization by the probability transformation would be expected to be satisfactory.

The two methods of estimating percentiles discussed in this paper have been compared as to their performances on simulated data.16 The regression method was shown to have a negligible bias and to provide estimates more stable and less variable than those obtained by interpolation. This, in addition to the convenience of obtaining all desired percentile equivalents from a single procedure, makes the regression method advantageous.

We have now incorporated the method presented for estimation of percentiles into the computer programs that analyze our clinical spirometric tests. We routinely report to the clinician not only the observed and predicted values in conventional units of volume or flow, but the corresponding percentile value as well. An example of such a report is seen in Figure 5. This subject has values that are clearly normal for forced vital capacity (FVC), forced expiratory volume in 1 sec (FEV1), and maximum mid-expiratory flow rate (MMEF). However, his values for flow rates when 30 and 20% of his vital capacity remain to be expelled (V30 and V20) would be considered borderline and would merit further investigation. This fact is brought more clearly to the attention of the clinician by the percentile values than would be the case had we reported only the predicted and observed values.
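A minimal Python sketch of the regression method for the lower tail is given here for illustration only; it is not the authors' published program. It assumes that the "probability transformation" is the inverse of the standard normal cumulative distribution (the scale of probability paper), that the lowest 20% of points are those whose plotting position (i - .4)/(N + .2) does not exceed .20, and that an ordinary least-squares line is fitted. All function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit scale


def fit_lower_tail(values, tail_fraction=0.20):
    """Fit observation = intercept + slope * z over the lower tail, where
    z is the normal deviate of the plotting position (i - .4)/(N + .2)."""
    ordered = sorted(values)
    n = len(ordered)
    zs, xs = [], []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.4) / (n + 0.2)
        if p > tail_fraction:   # assumption: tail is defined by position
            break
        zs.append(_STD_NORMAL.inv_cdf(p))
        xs.append(x)
    if len(xs) < 2:
        raise ValueError("need at least two points in the tail")
    mean_z = sum(zs) / len(zs)
    mean_x = sum(xs) / len(xs)
    szz = sum((z - mean_z) ** 2 for z in zs)
    szx = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs))
    slope = szx / szz
    intercept = mean_x - slope * mean_z
    return intercept, slope


def percentile_estimate(intercept, slope, p):
    """Value of the p-th quantile (p as a fraction) read from the line."""
    return intercept + slope * _STD_NORMAL.inv_cdf(p)


def percentile_equivalent(intercept, slope, value):
    """Percentile (0 to 100) corresponding to an observed value."""
    return 100.0 * _STD_NORMAL.cdf((value - intercept) / slope)
```

With coefficients fitted to a reference sample, `percentile_estimate(a, b, 0.05)` gives the fifth percentile and `percentile_equivalent(a, b, x)` gives the percentile to report beside an observed value x. The result is meaningful only within the fitted tail, and the upper tail could be treated the same way using the points whose plotting position is .80 or higher.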
