Quantitative assessment of quality of digitized portal images: receiver operating characteristic analysis applied to imaging in radiation therapy.

Therapeutic Okajima, Kimura,

Kaoru

Issyuu

MD MD

• Manabu • Yoshihisa

Quantitative ofDigitized Operating to Imaging

Nakata, Nakano,

Assessment

Index terms: Images, quality • Radiography, digital • Receiver operating characteristic curve (ROC) • Therapeutic radiology, quality assurance

Radiology

1991;

P

ORTAL

ing

the most to verify tion

ever,

important

notorious

as a result treatment

trast

imaging performed the treatment beam

of portal

by ushas been

available

location images

of radiaare, how-

for their

of the beam.

poor

of the the con-

many

tech-

niques have been tried and have been successful (3-9). them, the digital technique employed widely in recent

image

contrast

has been

tling

and

Among has been years, and

to

method when subjectively disadvantages

enhancement,

noise

some

reported

be improved with this the quality was assessed (3,4). However, several

of digital

quality

high energy To improve

images,

such

exaggeration

as motof arti-

facts, in addition to the drawback of inferior spatial resolution, have also been reported (5). The objective and quantitative evaluation of image quality appears to be an indispensable part ment of new techniques ing portal images, but

of the

clinical practice. ages are obtained setup error, their

there

Since

the

From

Nagata,

the

Departments

Y. Nakano,

of Medicine,

and

(K.O.,

Y.

Nuclear Medicine University, 54-Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606, Japan; and Department of Radiology and Nuclear Medicine Services, Kyoto University Hospital, Kyoto (M.N., S.Y.). From the 1990 RSNA scientific assembly. Received December 4, 1990; revision requested January 3, 1991; revision received May 24; accepted May 28. Address reprint requests to K.O. © RSNA, 1991

(1K.), Faculty

MA.)

of Radiology

Kyoto

kV). and

of the hancing

AND

phantom

level

(120

of human

for

bone

en-

(5).

METHODS

portal from

im-

the the

of the image by applying

characteristic

(ROC) analysis, in which setup of radiation fields were employed

errors

1

To determine ation field could by

an

observer

The

accustom images.

phantom that was and polyurethane hon, Kyoto, Japan). phantom obtained 1. Ninety-six

obtained with within-a-field 15M;

portal

by

Mitsubishi

a GF

a linear

Electric,

(Kasei

equalization,

standard

Half

a “sensi-

performed. was also

Optonix,

the images

error,

to

portal a chest

made of human bone (Kyoto Kagaku HyouA radiograph of the at 120 kV is shown in

screen-films used were XTL-5 (Eastman Kodak, and

error,

test” was pilot study

images

a double-exposure method by using

generated

Abbreviations: ROC = receiver

as one

as a setup

the observers to digitized To obtain images, we used

as “signals.” We used a phantom model and compared conventional portal images with digitally processed images obtained by means of histogram

is accepted

how large a shift of a radibe correctly recognized

and specificity purpose of this

tivity

rays

which

made

techniques contrast

Experiment

method quality

operating

effective image

chest

energy

was

MATERIALS

Figure

receiver

of the

The phantom polyurethane.

of setup errors. The purpose of this study was to provide an objective and quantitative method for assessment of portal images

Radiograph

at a diagnostic

have

to determine quality

1.

obtained

develop-

clinical point of view must be evaluated on the basis of the detectability

i

Figure

for improv-

been only a few articles about the objective evaluation of digital portal imaging (5,6). These studies employed a set of phantoms that were developed by Lutz and Bjarngard (10), but the method used cannot be applied in

181:273-276

MD

Applied

method

the precise fields (1,2). The

Nagata,

Quality

of

Portal Images: Receiver Characteristic Analysis in Radiation Therapy’

An objective and quantitative method for the evaluation of the quality of megavoltage portal images was developed by applying receiver operating characteristic analysis. On the basis of therapeutic use of portal images, setup errors were employed as “signals” in this experimental study that compared the original portal films with digitized images. Six readers observed 104 portal images of a chest phantom, half of which were “abnormal” (ie, had setup errors). Digital images (2,048 x 2,048 matrix) were enhanced by means of histogram equalization and then printed with a laser printer for observation. The readers showed a higher discrimination capacity with the digitally enhanced images, although a statistically significant improvement was not demonstrated. The present method of assessment of image quality proved to be both simple and clinically reasonable.

RT • Shinsuke Yano, RT • Yasushi MD • Mitsuyuki Abe, MD

Radiology

were

were

field10-MV

accelerator Tokyo).

x (ML-

The

an 11 x 14-inch Rochester, NY) Tokyo).

“normal”-show-

FPF = false-positive fraction, ROC = receiver operating characteristic, SE = standard error, TPF = true-positive fraction.

273

ing

the

and

original

the

ages,

radiation

other

half

which

of 0.5,

fields

were

included

1.0,

2.0,

exactly-

“abnormal” 12 setup

or 3.0

mm

imerrors

each

in treatment

All 96 radiographs were digitized with a laser film digitizer (DG3; Hitachi Medical, Chiba, Japan) with a 175-µm spot size and analog-to-digital conversion into a 2,048 x 2,048 matrix at 10-bit depth. The time required for digitization was about 70 seconds per image. Then the images were enhanced by using an image processor (Hipacs IWS6; Hitachi Medical) with histogram equalization. For simplicity, no digital processing except for histogram equalization was performed. The software used is described elsewhere (11). Finally, images were printed on a laser printer (Ektascan; Eastman Kodak). These printed images were slightly smaller than the original ones (length, 91% of the original).
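The enhancement step described above can be illustrated with a minimal sketch of textbook global histogram equalization applied to 10-bit pixel values. This is not the workstation software cited as reference 11; the function name `equalize_histogram`, the list-of-lists image representation, and the rounding rule are assumptions made only for illustration.

```python
def equalize_histogram(image, levels=1024):
    """Globally histogram-equalize a 2-D list of integer gray values.

    `levels` is the number of gray levels (1024 for 10-bit data).
    Illustrative sketch only; not the algorithm of reference 11.
    """
    # Count how many pixels fall at each gray level.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Uniform image: there is no contrast to redistribute.
        return [row[:] for row in image]

    # Map each input level so that output levels are spread over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]


if __name__ == "__main__":
    low_contrast = [[100, 100, 200], [300, 300, 400]]
    print(equalize_histogram(low_contrast))
```

The lookup table stretches the occupied gray levels over the whole 10-bit range, which is why low-contrast bony structures become easier to see, and also why noise can be exaggerated.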

An

original

hanced

image

image

two

of test

to these

control

radiographs

radiation

fields

sets

showing were

also

en-

The

test

images

viewed six

and

side

original

created

observers.

rienced

Four

in the

From

and

(MN.,

engaged

in performing

at least original

was

The

observers “yes” or “ no” setup with

two had

the

other

three

was

the sensitivity

calculated was defined

for

number number

of “ no” responses of normal images,

was

ratio

the

sponses

of the

to the

and

each as the

actual

speci-

A curve

by

threshold

was

calculated

After

re-

of abnormal

2 the

pilot

portal

and showed

study. was

an

not

however,

ROC

analysis

or absence of setup One hundred four

radiographs

abnormal images this ples

study,

presence performed.

were 52 normal. a setup

All error

The “difficulty” completely

because

the

obtained;

were

52

52 abnormal of I mm in

of all the homogeneous, radiation

two

gaussian

images

same

observers with

five

categories

2

=

4

=

probably probably

then

evaluated

a confidence (1

=

rating

terval week.

274

read

each

between

twice,

the

at least

I

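The critical ratio (CR) was given by the following equations (equations 3 and 4 of reference 15, respectively):

$$\mathrm{CR} = \frac{\text{mean } A_{z1} - \text{mean } A_{z2}}{\mathrm{SE(diff)}} \qquad (1)$$

$$\mathrm{SE(diff)} = 2^{1/2}\left[ S^2_{c+wr}\,(1 - r_{c-wr}) + \frac{S^2_{br+wr}\,(1 - r_{br-wr})}{l} - S^2_{wr} \right]^{1/2} \qquad (2)$$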

in-

variance

in area

found

that

one

of a set

of different

on

case

two

The value were obtained puter program sr, and

the method quantity relations radiographs by

by

would

sample

one

reader

be found

read

once

by by

each

obin area that would be one reader read one case readers;

or more

and

St,,

(13).

=

independent

of Az and its variance by using a ROCFIT The

quantities

occa(S+,,)

comS,+u,r,

were calculated according to of Swets and Picket (15). The was

estimated

of the ratings obtained using

having

of a set of different case = Sb2, + S, the observable

having

ities

difference

I in chapter

5 in chapter

-

The Azl

parameters = Az for

enhanced relation

• Radiology

normal

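Statistical assessment of the difference between ROC curves was performed according to the method of McNeil et al (14), which is a modified version of the method of Swets and Pickett (15). In this method, the overall standard error (SE) of the difference is a merging of three separate SEs corresponding to the case sample variability (c), between-reader variability (br), and within-reader variability (wr).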

and

sessions

was

of the

each SL,,

sample sions.

at

the

from

the

cor-

given to individual with the two modal-

method

of Hanley

and

The sensitivity and specificity are shown in Table 1 for each reader and for each magnitude of setup error. There was no difference in specificity in readings between the original and enhanced images. Regarding sensitivity, setup errors of 0.5 mm and 3.0 mm appear to be of no use for an

1.

ab-

set of images

for

be

read once samples;

servable variance found by having

likelihood binormal to have implied

respectively):

normal). Viewing time was not restricted. A set of images was viewed in one reading session, and for three of the observers,

who

the rat-

cases

corresponding

equation

definitely

normal, abnormal,

index,

from fitting

assessment

of

normal, 3 = equivocal, 5 = definitely

under

same

McNeil

the scale

in a unit

the

observable correlation the areas obtained when a single reads a set of case samples at two 1 = number of readers; S+w. the observable variance in area

settings; + that would

s

of TPF

reads ‘c-wr

between ROC curves cording to the method which is a modified of Swets and Picket the overall standard ference is a merging

equations

were

set at various sites in the phantom. Two sets of test and control images were obtained in the same way as in experiment

The

increase confi-

a binormal

distributions

between-reader within-reader ratio (CR) was

sam-

fields

plotted

from

resulting

Statistical

for scoring errors was

was

between reader

abnormal groups (13). When a binormal ROC curve is plotted on double-probability paper, it is transformed to a straight line, which can be described by the y-intercept and the slope. To determine these two parameters, maximum likelihood estimation is used (13).
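The straight-line description above can be made concrete with a small sketch. On normal-deviate axes a binormal ROC curve is z(TPF) = a + b z(FPF), and the area under it has the standard closed form Az = Φ(a / sqrt(1 + b²)). The function names and the example values of the intercept `a` and slope `b` are hypothetical; the maximum likelihood fitting itself (ROCFIT) is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def binormal_tpf(fpf, a, b):
    """TPF on the binormal ROC curve z(TPF) = a + b * z(FPF), for 0 < fpf < 1."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def binormal_az(a, b):
    """Area under the binormal ROC curve with intercept a and slope b."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


if __name__ == "__main__":
    a, b = 1.0, 1.0  # hypothetical intercept and slope, not fitted values
    for fpf in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"FPF={fpf:.2f}  TPF={binormal_tpf(fpf, a, b):.3f}")
    print(f"Az={binormal_az(a, b):.4f}")
```

With a = 0 and b = 1 the curve falls on the chance diagonal and Az is 0.5, which gives a quick sanity check of the formula.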

images.

Experiment

is altered. of FPF

a set of readers two settings;

by

and mal

modality. ratio of the

number

dence

ing data by means of maximum likelihood estimation (13). According to the binormal model, ROC curves are assumed to have the same functional form as that implied

image as

of “yes”

will as the

ROC curve

to the actual and sensitivity

number

given

were created. (TPF, or sensifraction (FPF,

or 1 -[specificity]) generally or decrease concomitantly

which

of a

test defined

ratings

square (12). We obtained the area under the ROC curve (Az) as an accuracy

observers.

the

confidence

as a function

therapy

were asked to respond regarding the presence

Then

ficity were Specificity

techfully

of them observed then the order

error after comparing a control image that

normal.

bK.,

were been

radiation Half first;

for

expe-

other who

S.Y.)

3 years. images

reversed

were

Y. Nagata,

the

nobogists

interpreted

(K.O.,

the

the observers, ROC curves The true-positive fraction tivity) and the false-positive

images

and

of them

radiologists

Y. Nakano),

for the

side

by using the treatment beam (10 MV method. (a) Original conventional radio(b) Enhanced digital image provides

images,

the

control

by

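Figure 2. Example of a set of portal images obtained by using the treatment beam (10-MV x rays) and the double-exposure, field-within-a-field method. (a) Original conventional radiograph shows a radiation field in the right lower lung. (b) Enhanced digital image provides better contrast of the ribs.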

2. In

manner.

were by

a digitally

in Figure

addition

same

and

are shown

b.

Figure x rays) graph better

were original

defined images;

images; between

(2)

as follows: Az2 = Az

observable the

areas

obtained

for

corwhen

(16).

RESULTS

Experiment 1

ROC study because the former could not be detected at all and the latter were completely detected with both techniques. The sensitivity of detection of a setup error of 1.0 mm could be significantly improved with digital image enhancement (P = .013, two-tailed paired t test), and an error of this magnitude appeared to be appro-

October

1991

Table

1 and Specificity

Sensitivity

of the Detection Reading

of Setup

of Original

Sensitivity of Setup

Magnitude

Observers

Specificity

0.5

38/48

7/12

B

38/48

3/12

C

38/48

D

37/48

7/12 5/12

E

32/48

5/12

F

220/288

Note.-Data

show

(76.4) the ratio

2

Areas

under

8/12

34/72 (47) of the

parentheses are percentages. *P < .05 (paired two-tailed

Table

8/12

7/12

37/48

Total

9/12 6/12 10/12 8/12

observers’

t test).

ROC Curves

49/72*

by Error (mm)

2.0

3.0

10/12 10/12 12/12 12/12

12/12 12/12 12/12 12/12

11/12

12/12

Other

67/72

responses

correct

differences

between

Specificity 39/48

38/48 37/48

38/48 34/48

12/12

(93)

72/72

regarding

37/48

(100)

the presence

the two techniques

or absence

were

of setup

2.0

3.0

6/12 5/12 5/12 6/12 4/12 7/12

10/12 8/12 11/12 10/12 8/12 9/12

12/12 11/12 12/12 11/12 11/12 12/12

12/12 12/12 12/12 12/12 12/12 12/12

(46)

56/72*

table

S,,

0.7263

0.0026 0.0028 0.0023

0.7954 0.7360 0.8031

0.0021 0.0026 0.0020

0.192 0.194 0.166

0.0029 0.0033

0.7679 0.7662

0.0023 0.0024

0.0022 0.0027

0.7891

0.7763

the results

of experiment

from fitting the rating data by means

was used to produce “subtle abnormalities” in experiment 2.
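The pooled sensitivity and specificity figures in Table 1 are simple ratios of correct responses to the number of images shown. The sketch below recomputes them from the Table 1 totals (49/72 and 56/72 for a 1.0-mm error, 220/288 and 223/288 for specificity); the helper name `fraction_percent` and the dictionary layout are assumptions for illustration.

```python
def fraction_percent(correct, total):
    """Return correct/total as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Pooled totals for six readers, taken from Table 1.
table1_totals = {
    "sensitivity, 1.0 mm, original": (49, 72),
    "sensitivity, 1.0 mm, enhanced": (56, 72),
    "specificity, original": (220, 288),
    "specificity, enhanced": (223, 288),
}

if __name__ == "__main__":
    for label, (correct, total) in table1_totals.items():
        print(f"{label}: {correct}/{total} = {fraction_percent(correct, total)}%")
```

The denominators follow from the design: 6 readers x 12 abnormal images per error magnitude gives 72, and 6 readers x 48 normal images gives 288.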

2

Composite ROC curves for the two modalities were calculated from the reader-specific curves by averaging TPF values for each FPF (Fig 3). The part of the curve near the lower left corner and the segment near the upper right corner, respectively, represent the stricter and less strict confidence thresholds (12). The curve for the digital images was generally higher than that for the original images, indicating that greater discrimination was possible with digital images. Table 2 shows the area under the ROC curve (Az), the variance of area

Volume 181 • Number 1

Az

of samples.

(100)

Numbers

in

Images

Az

S

S,

0.8074

0.228 0.118

. ..

. . .

. . .

...

.

.

. . .

. . .

...

0.0023

0.275

. . .

. . .

. . .

...

0.0023

0.195

. . .

. . .

. . .

...

twice.

$r_{br-wr}$ = 0.658, and $l$ = 6. The critical ratio was 1.089, so the difference between the curves was not statistically significant.
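The critical ratio quoted above can be checked with a short calculation. The sketch assumes that equation 2 has the form in which the within-reader variance is subtracted (the Swets and Pickett expression for a single reading occasion written in observable quantities), that 0.195 is the case-sample correlation and 0.658 the between-reader correlation, and that the magnitude of CR is what is reported. The function name and argument names are hypothetical.

```python
from math import erfc, sqrt


def critical_ratio(az1, az2, s2_c_wr, s2_br_wr, s2_wr, r_c_wr, r_br_wr, n_readers):
    """Return (|CR|, SE(diff)) from equations 1 and 2 under the stated assumptions."""
    variance_sum = (
        s2_c_wr * (1.0 - r_c_wr)                 # case-sample term
        + s2_br_wr * (1.0 - r_br_wr) / n_readers  # between-reader term
        - s2_wr                                   # within-reader correction
    )
    se_diff = sqrt(2.0 * variance_sum)
    return abs(az1 - az2) / se_diff, se_diff


if __name__ == "__main__":
    cr, se = critical_ratio(
        az1=0.7101, az2=0.7763,
        s2_c_wr=0.00249, s2_br_wr=0.00123, s2_wr=0.00023,
        r_c_wr=0.195, r_br_wr=0.658, n_readers=6,
    )
    # Two-tailed P value, treating CR as a standard normal deviate.
    p_two_tailed = erfc(cr / sqrt(2.0))
    print(f"SE(diff)={se:.4f}  CR={cr:.3f}  P={p_two_tailed:.3f}")
```

Under these assumptions the computed ratio agrees with the reported value to within rounding of the inputs, and it falls well short of the 1.96 needed for significance at the .05 level.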

Az and

S+,,

were Obtained

2 to be as follows: mean $A_{z1}$ = 0.7101, mean $A_{z2}$ = 0.7763, $S^2_{c+wr}$ = 0.00249, $S^2_{br+wr}$ = 0.00123, $S^2_{wr}$ = 0.00023, $r_{c-wr}$ = 0.195,

Enhanced

0.0017 0.0027 0.0026

and the intraobserver correlation (r) for each reader. The mean areas under ROC curves were 0.7101 and 0.7763 at the first reading for original and enhanced images, respectively. The parameters in Equations (1) and (2) were estimated from Table

T,,.u,r

Reading

0.7773 0.7224 0.7169

(S+wr),

critical

Experiment

r

2. Only observers A, B, and C read the images ofmaximal Likelihood estimation. Values for

0.195,

Original Images

Az

priate for use as a “subtle abnormality,” since the sensitivities were 68% for nonenhanced images and 78% for enhanced images. An error of 2.0 mm could also be used, but the difference of sensitivities between the modalities was smaller than that obtained with a 1-mm error. A setup error of 1 mm therefore

Images

S,,

shows

72/72

69/72 (96)

not significant.

Az

0.7510 0.6802 0.6413 0.7713 0.7101

Note.-This curve resulting

(7)

to the total number

errors

Second

Enhanced

Images

0.6903

Mean

(mm)

for Each Observer

Original

A B C D E F

Error

1.0

First Reading

Observers

by

0.5

33/72

(77.4)

223/288

Images

Sensitivity of Setup

Magnitude

12/12

()

of Enhanced

Reading

10

A

by Each Reader

Errors

Images

sta-

DISCUSSION

There has recently been a strong tendency to digitize portal images with the laser scanner (4) or alternative detectors (3,5,6). These techniques provide various attractive advantages such as on-line imaging (5) or digital contrast enhancement (3-6,9), but image quality remains a problem. The quality of portal images was reported to be improved with digital techniques, but such images can also suffer from mottling noise

0.0019

0.7169

0.0026

0.8159

0.0018

were calculated from a binormal according to reference 16.

ROC

and false edges of the radiation field (Fig 4). Thus, their quality should be assessed objectively by taking such demerits into account. Many studies of the objective assessment of image quality in diagnostic radiology have been reported, including those on the problems of sampling pitch (17) and image compression (18). The advantages of ROC studies are clearly established: they are objective and quantitative, and allow differences in inherent diagnostic capacity to be distinguished from the effects of the decision criterion (12). The image quality in radiation therapy, however, should be considered in a different way from that in diagnostic radiology, because the purpose for which the image is used is different. Portal images are used to find setup errors. Therefore, it seems reasonable to investigate the detectability of setup errors in the evaluation of imaging systems for use in radiation therapy.

Our therapeutic version of ROC analysis enjoys some merits in addition to the above-mentioned advantages of ROC studies. First, it is clinically reasonable to employ setup errors as a “signal.” Second, the kind of “abnormality” present in portal images is only single (ie, a setup error). Last, the “difficulty of a sample” can be easily quantified with a single index, the distance that the radiation field has been shifted. Therefore, an ROC study is easier to design for radiation therapy than for diagnostic radiology.

[Figure 3 plot: TPF versus FPF, both axes 0.0 to 1.0]

In our experimental study with a chest phantom, the mean area under the curve was larger with the digital images than with the original ones, but no statistically significant improvement was shown. Larger Az values, however, were obtained with the enhanced images in eight of nine reading trials in experiment 2, and a higher discrimination capacity of the enhanced images was shown in experiment 1. It is possible, therefore, that the difficulty of detecting a 1-mm error prevented improvement being noted because of the small number of images viewed.

Figure 3. ROC curves combined by averaging the TPF for the six readers at each FPF. The curves

We believe that this

method

equalization to be one

rithms

for

(4,5), of the

improving

which

best portal

radiology,

smooth

from

lower

corner

corner.

the original

setup

errors

1.

2.

3.

is re-

were

PhD,

for his helpful

We thank guidance

Charles E. Metz, on the concepts of

hook

near

4. Portal image that has undergone excessive digital processing shows considerable mottling and artifacts along the edge

right

the radiation

the

dows were equalization

images.

10.

11.

Marks

JE, Haus

AG,

Sutton

of frequent

HG,

treatment

Griem

ML.

verificaerror Cancer

in 12.

1976; 37:2755-2761.

algoimages.

as “signals.” This version proved to be a clinically reasonable and simple method for the evaluation of image quality.

a slight

tion films in reducing localization the irradiation of complex fields.

pro-

employed

Acknowledgment:

formed

and extended to the upper

Figure

in di-

References

2. Byhardt RW, Cox JD, Homburg AH, Liermann G. Weekly localization films and detection of field placement errors. Int J Radiat Oncol Biol Phys 1978; 4:881-887.
3. Gur D, Deutsch M, Fuhrman CR, et al. The use of storage phosphors for portal imaging in radiation therapy: therapist’s perception of image quality. Med Phys 1989; 16:132-136.

13.

14.

CONCLUSION

in which

They

left

ROCs

right upper corner and then crossed at the point of (0.85, 0.92). Most of the curves for the digital images were higher than those for

4.

analysis

the

The value

The quality of portal images should be evaluated on a clinical basis. We therefore developed a version of ROC

like any other

agnostic

of

vides an alternative to subjective assessment and should enable the objective evaluation of many imaging techniques, such as adaptive histo-

gram ported

were,

5.

6.

7.

8.

ROC analysis.

4. Sherouse GW, Rosenman J, McMurry HL, Pizer SM, Chaney EL. Automatic digital contrast enhancement of radiotherapy films. Int J Radiat Oncol Biol Phys 1987; 13:801-806.
5. Leszczynski KW, Shalev S. Digital contrast enhancement for online portal imaging. Med Biol Eng Comput 1989; 27:507-512.

15.

17.

6. Wilenzick RM, Meritt CRB, Balter S. Megavoltage portal films using computed radiographic imaging with photostimulable phosphors. Med Phys 1987; 14:389-392.
7. Reinstein LE, Orton CG. Contrast enhancement of high-energy radiotherapy films. Br J Radiol 1979; 52:880-887.
8. Shiu AS, Hogstrom KR, Janjan NA, Fields RS, Peters LJ. Technique for verifying treatment fields using portal images with diagnostic quality. Int J Radiat Oncol Biol Phys 1987; 13:1589-1594.

9.

Meertens energy

RM,

field.

High-pass

used in addition in this image.

filters

and

of

win-

to histogram

10. Lutz WR, Bjarngard BE. A test object for evaluation of portal films. Int J Radiat Oncol Biol Phys 1985; 11:631-634.
11. Morishita K, Yokoyama T, Sato K. Automatic grey-level transformation of image enhancement of a PACS workstation (abstr). Radiology 1990; 177(P):343.
12. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986; 21:720-733.
13. Metz CE. Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 1989; 24:234-245.
14. McNeil BJ, Hanley JA, Funkenstein HH, Wallman J. Paired receiver operating characteristic curves and the effect of history on radiographic interpretation. Radiology 1983; 149:75-77.
15. Swets JA, Pickett RM. Evaluation of diagnostic systems, methods from signal detection theory. New York: Academic Press, 1982.

16.

Meritt

16. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983; 148:839-843.
17. MacMahon H, Vyborny CJ, Metz CE, Doi K, Sabeti V, Solomon SL. Digital radiography of subtle pulmonary abnormalities: an ROC study of the effect of pixel size on observer performance. Radiology 1986; 158:21-26.

18. Ishigaki T, Sakuma S, Ikeda M, Itoh Y, Suzuki M, Iwai S. Clinical evaluation of irreversible image compression: analysis of chest imaging with computed radiology. Radiology 1990; 175:739-743.

1987; 13:1589-1594.

9. Meertens H. Digital processing of high-energy photon beam images. Med Phys 1985; 12:111-113.

