Clin. Biochem. 9, (4) 198-202 (1976)

A Simple Method for "Reference Range" Estimation from Routine Laboratory Data

SAMUEL Y. CHU, P. CHEUNG and V.E. TURKINGTON

Biochemistry Laboratory, Department of Laboratory Medicine, Ottawa General Hospital, Ottawa, Ontario

(Accepted March 30, 1976)

CLBIA 9, (4) 198-202 (1976)

A simplified method is presented to estimate reference ranges from hospital laboratory data. It is based on a combination of graphical estimation of the relative sizes of normal and abnormal populations and the "mode-center" concept, in which the mode of the total population centers on the 50% cumulative frequency of the normal population. This method can be applied to determine reference ranges even though the data source contains abnormally high and/or low values. The reference ranges obtained for BUN and calcium from in-patient and out-patient sources by this method were found to be similar to those reported for "healthy" subjects.

CLINICAL LABORATORIES NOWADAYS are confronted with increasing automation and new methods. Some enzyme assays, for example, employ a large variety of assay conditions. It is desirable, therefore, for each laboratory to establish the "normal range" or "reference range" for the methods it selects. The heterogeneity of population groups has also prompted the determination of "local norms". "Reference ranges" have conventionally been derived from a selected group of "healthy" subjects. The difficulty of obtaining a large sample size, however, has indicated the need for simpler ways of obtaining such limits. A number of indirect methods have been suggested which are based either on routine laboratory data or on a selected hospital out-patient population. Not only has the accuracy of these methods been disputed, but the validity of establishing a "normal range" from "normal healthy" subjects has also been questioned, since a range derived from healthy, active individuals may not always be relevant for a hospital population.

Correspondence: Dr. Samuel Y. Chu, Department of Laboratory Medicine, Ottawa General Hospital, 43 Bruyere Street, Ottawa, Ontario K1N 5C8

Previous indirect methods assumed that the data source contained one-sided contamination only. We report here a simple and convenient way of estimating "reference values" from hospital laboratory data in which the data source may contain abnormally high and/or low values. The method has been applied to establish the "reference ranges" for serum urea nitrogen and calcium which we believe to be relevant to hospital in-patient/out-patient populations. The results obtained by this method were also found to be comparable to those reported for "healthy" subjects.

Mode-Center Concept

If data from a group of "healthy" subjects with a Gaussian distribution are mixed with "diseased" groups on either side having high and low values, the mode of the resulting mixed population will coincide with the mode of the original "normal" population. Provision has to be made, however, that the overlap of the populations is not extreme, or that the data contributed by the abnormal groups to the 30-70% (± 0.55 S.D.) region of the "healthy" population is not marked. If a bimodal or trimodal distribution occurs, one is to assume that the approximate reference range can be distinguished from the mode or modes of the abnormal population. The mode here refers to the test or class value having the greatest concentration of observed incidents and is estimated from the best fitted curve of the histogram of the mixed ungrouped data. For the grouped data, we prefer to use the "moments of force" method, where the mode is governed by the equation:

$$\text{mode} = L_{mo} + \frac{f_a}{f_a + f_b}\, C$$

where
$L_{mo}$ = lower limit of modal group
$f_a$ = frequency of class interval above modal group
$f_b$ = frequency of class interval below modal group
$C$ = size of class interval

Fig. 1 shows the frequency distribution curves of three populations, each with normally distributed data, L, N and H, derived from the group of factitious data shown in Table 1. The frequency curve of the total mixed population is represented by the dotted line M (Fig. 1), and one can observe that the mode of this composite curve coincides with the mode of the uncontaminated "healthy" population "N", since the "diseased" populations do not interfere significantly. If the abnormal groups L or H interfered to a large extent, i.e. contributed an appreciable frequency at or near the mode of the "healthy" population (N), then the mode of the resulting population M may shift considerably, such that the two modes would no longer coincide. Such a large overlap would be unacceptable, since one cannot distinguish the healthy from the sick subjects. This "mode-center" concept was also implied by Pryce and Becktel, who proposed that the mean and mode would coincide. Using the above concept, the mode of the mixed total population can be taken to represent the mean of the "healthy" population, and subsequently its 50% cumulative frequency can be located if the data are plotted on probability paper, as discussed later.

TABLE I. Frequency data for three arbitrary populations L, N and H representing "low diseased", "healthy" and "high diseased" populations respectively.

Fig. 1 -- Frequency curves of three factitious populations L, N and H and their combined data M (see Table I). Axes: incidence against value (arbitrary scale).
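The mode-center argument can be checked numerically against the Table I data. The following is a minimal Python sketch of the "moments of force" formula, not the authors' own procedure. It assumes that blank table cells mean zero incidence, that the class width C is 1, and that each class is centred on its value so that the lower limit of the modal class lies half a width below it. The function name `moments_of_force_mode` is hypothetical.

```python
# Incidence per class value 1..33, transcribed from Table I.
# Assumption: blank cells in the table are treated as zero incidence.
L = [2, 18, 60, 80, 36, 3] + [0] * 27
N = [0, 0, 0, 10, 17, 33, 60, 90, 137, 153, 148, 128, 92, 62, 39, 18, 9, 3] + [0] * 15
H = [0] * 11 + [2, 3, 5, 6, 10, 13, 20, 24, 26, 28, 29, 28, 26, 22, 18, 15, 10, 5, 4, 3, 2, 1]
M = [l + n + h for l, n, h in zip(L, N, H)]  # combined data


def moments_of_force_mode(freqs, first_value=1, width=1.0):
    """Mode of grouped data: lower limit + f_a / (f_a + f_b) * C."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of the modal class
    f_below = freqs[i - 1] if i > 0 else 0
    f_above = freqs[i + 1] if i + 1 < len(freqs) else 0
    center = first_value + i * width
    if f_above + f_below == 0:
        return center  # isolated modal class: no neighbours to weigh
    # Assumption: classes are centred on their value, so the lower limit
    # of the modal class is half a class width below the class value.
    lower_limit = center - width / 2
    return lower_limit + f_above / (f_above + f_below) * width


mode_n = moments_of_force_mode(N)
mode_m = moments_of_force_mode(M)
print(f"mode of N: {mode_n:.2f}, mode of M: {mode_m:.2f}")
```

Under these assumptions both modes come out at about 10.02, because neither L nor H contributes any incidence to classes 9 to 11, which is the situation the text describes as non-interfering.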

The Effect of Low and High Abnormal Values on Cumulative Percentages

In the presence of high abnormal values, the cumulative percentage values of the normal population will be reduced, while they will be increased in the presence of low abnormal values. The cumulative decimal fraction (x) for a given value for the normal population and the corresponding cumulative decimal fraction (y) of the total population can be determined by formulae 1 and 2 if there is no overlapping of the populations or if the abnormal values have a minimal influence on the 30-70% region of the normal population.

$$y = \frac{(x)N + L}{1.0} \qquad (1)$$

$$N + L + H = 1.0 \qquad (2)$$

In the above equations L, N and H are population decimal fractions representing the "low diseased", "normal" and "high diseased" groups respectively; thus, for example, if L = 0 (the contamination is exclusively due to abnormally high values) and if x = 0.50 (or 50%), N = 0.5 and H = 0.5, then y = 0.25 (or 25%).

                  Incidence
Value      L      N      H  Combined Data (M)
    1      2      -      -          2
    2     18      -      -         18
    3     60      -      -         60
    4     80     10      -         90
    5     36     17      -         53
    6      3     33      -         36
    7      -     60      -         60
    8      -     90      -         90
    9      -    137      -        137
   10      -    153      -        153
   11      -    148      -        148
   12      -    128      2        130
   13      -     92      3         95
   14      -     62      5         67
   15      -     39      6         45
   16      -     18     10         28
   17      -      9     13         22
   18      -      3     20         23
   19      -      -     24         24
   20      -      -     26         26
   21      -      -     28         28
   22      -      -     29         29
   23      -      -     28         28
   24      -      -     26         26
   25      -      -     22         22
   26      -      -     18         18
   27      -      -     15         15
   28      -      -     10         10
   29      -      -      5          5
   30      -      -      4          4
   31      -      -      3          3
   32      -      -      2          2
   33      -      -      1          1
Total                            1498

On the other hand, if H = 0, i.e. the contamination is exclusively due to abnormally low values, and if x = 0.5, N = 0.5 and L = 0.5, then y becomes 0.75 (or 75%). Thus, in the presence of high abnormal values, the points on line (N) (see Fig. 2) representing the normal population in Table 1 will shift downwards, with a larger shift at the higher values. Conversely, in the presence of low abnormal values, the straight line (N) will shift upwards, with a greater shift at lower values. The presence of both low and high abnormal values will result in a curve (M) intersecting the line "N" of the normal population (Fig. 2).

Fig. 2 -- Probability plots of the populations L, N, H and M (see Table 1).

Fig. 3 -- Graphical estimation of population sizes and reference values determination (see text for details).
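The two worked examples for equation (1) can be reproduced in a few lines. This is a minimal Python sketch under the same no-overlap condition stated in the text. The function names and the parameter names `frac_n`, `frac_l` and `frac_h` (standing for N, L and H) are hypothetical, and the second function is simply equation (1) rearranged for x, a step that the text does not spell out.

```python
def total_cumulative_fraction(x, frac_n, frac_l=0.0, frac_h=0.0):
    """Equation (1): y = (x * N + L) / 1.0, subject to equation (2)."""
    if abs(frac_n + frac_l + frac_h - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to 1.0 (equation 2)")
    return x * frac_n + frac_l


def normal_cumulative_fraction(y, frac_n, frac_l=0.0, frac_h=0.0):
    """Equation (1) rearranged for x (an algebraic step not given in the text)."""
    if abs(frac_n + frac_l + frac_h - 1.0) > 1e-9:
        raise ValueError("population fractions must sum to 1.0 (equation 2)")
    return (y - frac_l) / frac_n


# Contamination by high values only (L = 0): the 50% point of N drops to 25%.
print(total_cumulative_fraction(0.5, frac_n=0.5, frac_h=0.5))  # 0.25
# Contamination by low values only (H = 0): the 50% point of N rises to 75%.
print(total_cumulative_fraction(0.5, frac_n=0.5, frac_l=0.5))  # 0.75
```

Because y = xN + L, the downward displacement x - y = x(1 - N) - L grows with x when only high values are present, which matches the larger shift at higher values described for line (N).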

Estimation of the Fractional Sizes

The fractional sizes of the different populations can be estimated graphically, provided the data of the normal population are distributed in a Gaussian fashion. The cumulative percentages of the total data from Table 1 were plotted on probability paper (Fig. 3), and three sections on curve "M" were observed, with the middle portion (AB) corresponding to the normal population. Point "T" is the point of inflexion, and a tangent (T) is drawn at that point to the arc (AC). The point "P", where this tangent and the best straight line drawn through the middle portion (AB) meet, represents the fractional population size of the combined normal and abnormal low values, i.e. L + N. In this case "P" = 0.8 and therefore H is equal to 0.2 (20% of the total population). If all possible abnormally high values are not present, or if those values are not Gaussian distributed, it may be difficult to locate the point of inflexion. In that case, a best straight line is drawn through those points immediately to the right of arc AC, as illustrated in Fig. 4.
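The coordinates that probability paper uses for curve "M" can be generated from Table I, and the graphical reading of "P" can be compared with the known composition of the factitious data. The Python sketch below does only that: it does not reproduce the tangent construction itself. It assumes that blank table cells mean zero incidence and that each cumulative fraction is plotted at the upper boundary of its class (value + 0.5); both are assumptions, not statements from the text.

```python
from itertools import accumulate
from statistics import NormalDist

# Incidence per class value 1..33, transcribed from Table I.
# Assumption: blank cells in the table are treated as zero incidence.
L = [2, 18, 60, 80, 36, 3] + [0] * 27
N = [0, 0, 0, 10, 17, 33, 60, 90, 137, 153, 148, 128, 92, 62, 39, 18, 9, 3] + [0] * 15
H = [0] * 11 + [2, 3, 5, 6, 10, 13, 20, 24, 26, 28, 29, 28, 26, 22, 18, 15, 10, 5, 4, 3, 2, 1]
M = [l + n + h for l, n, h in zip(L, N, H)]
total = sum(M)

# Probability paper plots the cumulative fraction on a normal-deviate scale,
# so a Gaussian population appears as a straight line.
standard_normal = NormalDist()
points = []
for value, cumulative in zip(range(1, 34), accumulate(M)):
    fraction = cumulative / total
    if 0.0 < fraction < 1.0:  # the normal deviate is undefined at 0 and 1
        z = standard_normal.inv_cdf(fraction)
        # Assumption: plot at the upper class boundary.
        points.append((value + 0.5, fraction, z))

# Known composition of the factitious data, for comparison with the
# graphical estimates P = 0.8 and H = 0.2 quoted in the text.
p_known = (sum(L) + sum(N)) / total
h_known = sum(H) / total
print(f"L + N = {p_known:.3f}, H = {h_known:.3f}")
```

With the Table I counts, L + N is 1198 of 1498 observations, so the known fractions round to 0.800 and 0.200, in agreement with the values read from the graph.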
