Glendon G. Cox, MD • Larry T. Cook, PhD • John H. McMillan, MD • Stanton J. Rosenthal, MD • Samuel J Dwyer III, PhD

Chest Radiography: Comparison of High-Resolution Digital Displays with Conventional and Digital Film¹

This study was performed to compare the performance of observers using three display formats for chest radiography. The display formats were conventional radiographs, digitized radiographs (2,048 X 2,048 X 12 bits) printed on laser film, and digitized radiographs (2,048 X 2,048 X 12 bits) displayed on a high-resolution (2,560 X 2,048 X 12-bit) gray-scale display. The test set for the study consisted of 163 cases. Sixty-four of the cases were normal, whereas the 99 remaining cases demonstrated one or more common radiographic abnormalities. Nine abnormalities were selected for analysis: costophrenic angle blunting, interstitial disease, atelectasis, pneumothorax, parenchymal mass, consolidation, obstructive disease, hilar/mediastinal mass, and apical scarring. Six experienced general radiologists participated in the evaluation. Receiver operating characteristic curves were generated for each abnormality and display format. The results indicate that, while the three display formats are equivalent for the detection of some abnormalities, detectable differences in observer performance may be seen even at 2,048 X 2,048 X 12 bits for the detection of obstructive disease, pneumothorax, interstitial disease, and parenchymal masses.

Index terms: Diagnostic radiology, observer performance • Radiography, comparative studies • Radiography, computer-assisted • Radiography, digital • Thorax, radiography, 60.11, 60.1215

Radiology 1990; 176:771-776

Abbreviations: ROC = receiver operating characteristic, SD = standard deviation, TPF.185 = true-positive fraction assuming a false-positive fraction of 0.185.

¹ From the Department of Diagnostic Radiology, University of Kansas Medical Center, 39th and Rainbow Blvd, Kansas City, KS 66103. From the 1989 RSNA scientific assembly. Received December 1, 1989; revision requested January 24, 1990; final revision received April 16; accepted April 19. Address reprint requests to G.G.C.

© RSNA, 1990

In conventional radiography, the radiographic film serves as the means for image acquisition, archival, and display. Although film continues to be the medium of choice for chest radiography, the successful applications of interactive display and image processing techniques in other areas of medical imaging have stimulated an interest in the design and development of digital acquisition and display systems that could ultimately replace traditional screen-film techniques. In theory, the separation of image acquisition functions from the functions related to image display and interpretation offers improved control of either process. Acquisition parameters can be optimized independent of display parameters, and it becomes possible to correct for technical errors in image acquisition by adjusting various display parameters. Furthermore, the interactive display of images may improve diagnostic performance by effectively matching the characteristics of the display to those of the observer's visual system and by allowing certain features in the image to be enhanced or suppressed.

A primary concern in the design and construction of an interactive display system for chest radiography is the assumption that the performance of observers using the system must be equivalent to the performance of observers using conventional techniques. A number of studies have been performed in an effort to determine the spatial and contrast resolution necessary for adequate reproduction of chest radiographs. Most of these investigations have been concerned with determining the minimum number of pixel elements required for displaying digitized chest images (1-5). These studies have shown that the smaller the pixel size, or the larger the number of pixels in the image, the better the observer performance. Chest images displayed in a 512 X 512 format provide more diagnostic information than those displayed in a 256 X 256 format, but both of these formats are much less satisfactory in terms of observer performance than are conventional film images. Observer performance continues to improve when the matrix size is increased from 512 X 512 to 1,024 X 1,024 pixels. While observer performance when using images displayed at 1,024 X 1,024 pixels approximates that when using plain film for many detection tasks, there is a measurable deterioration of observer performance for the detection of subtle chest lesions.

Only recently have display systems become available that are capable of displaying images of greater than 1,024 X 1,024 X 12 bits. Primarily as a result of improvements in the design of solid-state memory units, it is now possible to construct high-resolution frame buffers that can support images on a 2,560 X 2,048 X 12-bit matrix for display at refresh rates of 72 Hz. These frame buffers are the basis for the present generation of 2,048 X 2,048 displays. Because of the high data rates that must be maintained between the frame buffer and the display monitor, the high-resolution video monitors in these systems are generally driven by 8-bit digital-to-analog converters. The use of high-speed look-up tables and interactive window/level functions allow the entire contrast range of the image to be viewed despite the limitations imposed by the 8-bit digital-to-analog converters.

There are few data documenting the performance of observers using 2,560 X 2,048 interactive displays relative to conventional film. One recent study did indicate that observer performance for the detection of septal lines and pulmonary nodules with a 2,048 X 2,048 interactive display was comparable to that with conventional film and suggested that a 2,048-line digital display system might be an acceptable alternative to the conventional methods of displaying chest radiographs (6). We have recently completed a study comparing observer performances using laser-printed 2,048 X 2,048-pixel images and images displayed interactively at 2,560 X 2,048 pixels to conventional radiographs for a variety of commonly encountered abnormalities of the chest. This article presents the results of our study.

MATERIALS AND METHODS

during

the

first

3 weeks

reviewed

all

tween

1 and

5 was

dence

value

of 1 indicating

dition

was

“definitely

condition fidence

was values

“definitely of 2 or

type

present”

A confidence the presence

agnostic mality

criteria for were strictly

any overlap in vations. Where

for the were

each type of abnordefined to eliminate

the classification necessary, specific

classification For

example,

was defined opacification

volume

loss,

terstitial

with

component. category of increased

chymal mation.

destruction, Because

The included lung

of interstitial airway disease,

structive

pattern to the

gory

rather

category.

cluded or

previously

in the

described

making

their

tions

of the

were

then

decision

two

Concordance

defined

either

The

reviewing

evaluated

nator.

by the study

as exact

assigned

“obstructive

loss

areas

of linear

focal The

was

defined

Because granubomas such lesions

unless

they


were

Up to four


in one

level at either For example,

extreme if the first

a particular

rating

of 4 by

the

first

nodule” of the in were

lobes category

larger

nodules

prevalence

of

were

category

this

tween

joint

in

1 or 2 was

4 or 5 by the

which

all

di-

were

could not reference

were

subjected

excluded

from

in

Dunby a

able-

set.

nor the the receiver

operating characteristic (ROC) studies. The resulting test set consisted of 163 chest typical

radiographs of the

tered

in

our

that patient

and

were regarded as population encoun-

institution.

none

Of

of the would

these,

abnormalities have

been

99 remaining one or more

presented

64 dem-

being

in

Ta-

This

was

of abnormal were multiple of cases.

The

by test cases

greater

than

radiographs abnormalities

be-

abnormalities

ranged

in difficulty

to very subtle. radiographs

and

examinations

were included in the test set. The posteroanterior radiographs were obtained in an automated medium (Eastman

chest screens, Kodak,

room by OC film Rochester,

using Lanex , and a 12:1 grid NY). These

examinations were phototimed at 140 kVp with an average of 6 mAs. The portable examinations were obtained without a grid by using Lanex Regular screen with

OC film (Eastman

Kodak).

tons were

and

80 kVp,

was

per per

view

be-

the test

202.

num-

of

of abnormalities

anteroposterior

ples ples to

be satisfactorily to the diagnostic

are

Technical

an average

with

inch, line

of the

in the test set were X 4,096 X 12-bit digi-

a laser

film

digitizer

base

The

pixel

in

were

The

the

a one-on-one

To

record

particular

a 2,048

laser

interpolation

to map

were

format

X 2,048

printer

uses the

by

recorded image

images

43-cm Ektascan laser-sensitive man Kodak) by using a laser er.

with

X 12 bits 2,048 X then

to create

study.

size

0.08 mm. The image was

X 2,048 resulting

images tape

for the

recorded

a total of 4,096 samthe 35-cm field of

to 2,048 The

X 12-bit magnetic

(Ma-

NY). The was 0.08 mm, was 312 sam-

was therefore of each digitized

reduced averaging.

2,048 on

giving across

digitizer.

this digitizer matrix size then pixel

facexpo-

3 mAs.

tnix Instruments, Orangeburg, baser spot size for this unit and the sampling frequency

of the three parreviewer dis-

study coordinator participated

was

tab matrix

of the for any

resolved

of at beast two Cases in which

set

The

category

number

The radiographs digitized to a 4,096

discon-

discrepancies

test

total

set

sure

obser-

considered

the

the observers

consensus ticipants.

ed as normal. The tions demonstrated

of abnor-

of

and

were

session

than

categories

5 by

to be in were considered

of abnormality

onstrated tabulated

in

and

assignment

our patient popunot categorized 5 mm

a confi-

reviewer

in the

1 . The

portable

of the rating reviewer as-

Cases in which the observations two reviewers were discordant

Neither the two reviewers

scar-

cavitary

two

confi-

abnormality

reviewer

criteria

cate-

or more

the

ble

investigated.

of each

from quite obvious Both posteroanterior

vations of the reviewers were concordant were included in the test set without further review. To eliminate equivocal cases from the test set, all assignments of 3 in the face of any other assignment by the

crepancies viated with

as-

parenchymal

to include

calcified bation,

ameter.

disease”

“parenchymal

masses.

bulla forof a

were

by

disease

reflected

of the

of one

being

of examples

in a number

was

dence rating of 1 and the second reviewer assigned a rating of 2, the observations were considered concordant. Similarly, a

ing

fibrosis in obcases of an ob-

fibrosis

coordi-

agreement

normabities

the number cause there

radiologists

or differences

signed

in

observa-

of observations

ratings

dence scale.

criteria

classifications.

a joint review by the study coordinator and the two reviewing radiologists.

of eviparen-

than the “interstitial disease” The “atelectasis” category in-

volume

ring. or

with

alof

“obstructive cases volume,

re-

dant.

an in-

or bleb and of the frequency

component structive signed

or without

“probably

bers

second

“consoli-

as a process with without evidence

and

disease” dence

of obserrules

of abnormalities

formulated.

dation” veolar

di-

Cona

abnormality was “equivocal” or indeterminate. When possible the two reviewers were asked to describe the anatomic location of the observed abnormalities. The reviewers were required to refer to the

Cases

the

that

value of 3 mdiof a given type of

reviewer.

(Table

study,

a

the

present,”

other

abnormalities

and

that

was

reviewer

of the

present”

of process

if an

of nine

a confi-

the con-

present.” 4 indicated

by one

purposes

with

or “probably

spectiveby. cated that

on a category. value be-

of 5 indicating

made

the

nadiographs

that

discordant

or more

250

not

value

of

These

assigned,

nations 1). For

20 years

and recorded their observations worksheet listing each disease For each category a confidence

were considered Observations

demonstrating

then preradiologists,

than

the second agreement.

or examinations

radio-

radiography.

certified radiologist serving as the study coordinator. The cases selected represented either radiographicably normal examione

f

of February

From these folders, selected by a board-

a given

more

radiologists

reviewers

To create the test set used in this study, the radiographic records and film folders for patients undergoing chest examina1989 were reviewed. 250 radiographs were

had

in chest

confidence

METHODS

tions

of whom

on

experience

not

and imat 2,560

abnormalities

250 radiographs were to two board-certified

both

present

particular

X 2,048-

pixels to conventional for a variety of commonly

countered

ac-

observ-

2,048

were

graph. The sented

confidence

digital display system might be an ceptable alternative to the conventional methods of displaying chest radiognaphs (6). We have recently en performances

mality

on

data

also 35

X

film (Eastfilm recordimage

a cubic

2,048

this

spbine

image

to

interpret-

examinaof the ab-

the 4,096 array size used to generate the one-on-one printing format. When printing the images on film, window and level settings were adjusted to create two versions of each radiograph. One version was printed with the full window width of 4,095 and a level setting of 2,048. A higher-contrast version of the image was also generated by using a window width of 1,732 and a level setting of 2,288. The study coordinator then compared these two printed images with the original radiograph and selected the image that most closely approximated the appearance of the conventional radiograph for inclusion in the test set. All other laser-printed images were discarded.

The interactive display system used for this study is one of the earliest 2,560 X 2,048 X 12-bit systems available (Megascan Technology, Boston). The monitor is driven by an 8-bit 500-MHz digital-to-analog converter. The frame buffer of the display is a 9-Mbyte memory capable of accommodating an entire 2,560 X 2,048 X 12-bit image. Two of these displays were interfaced to our HYPERchannel (Network Systems, Minneapolis) image management network with VME multibus. The display systems were placed in controlled environments where room lighting could be varied and where no light fell directly on the screen of the display monitor. The nonglare surface of the display screen further reduced the effects of ambient lighting. Local memory for each display system was provided by a Winchester 800-Mbyte magnetic disk. At the radiologist's request, images were transferred from the tape archive to the Winchester disk via the HYPERchannel network. Because of the nature of the interface between the Winchester disk and the display system, 15 seconds were required to load each 2,048 X 2,048 X 12-bit image into the frame buffer of the display system. In the future, this time will be reduced by using a VME bus rather than the emulator. Standard software techniques were used to provide electronic zoom and pan functions.
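The window and level settings quoted for the printed images (width 4,095 at level 2,048, and width 1,732 at level 2,288) can be illustrated with a small look-up table that maps 12-bit pixel values onto the 256 levels of an 8-bit converter. The following is only a minimal sketch: the linear ramp, the rounding rule, and the function name `window_level_lut` are assumptions made for illustration and are not taken from the printer or display software used in the study.

```python
def window_level_lut(width, level, in_bits=12, out_bits=8):
    """Build a linear look-up table from in_bits pixel values to out_bits output values.

    Hypothetical helper: a plain linear window/level ramp with clipping.
    """
    in_max = (1 << in_bits) - 1      # 4095 for 12-bit data
    out_max = (1 << out_bits) - 1    # 255 for an 8-bit converter
    low = level - width / 2.0        # lower edge of the window
    lut = []
    for value in range(in_max + 1):
        fraction = (value - low) / width          # position inside the window
        fraction = min(1.0, max(0.0, fraction))   # clip values outside the window
        lut.append(round(fraction * out_max))
    return lut


# The two settings quoted in the text for the laser-printed versions.
full = window_level_lut(width=4095, level=2048)
high_contrast = window_level_lut(width=1732, level=2288)
```

With the narrower window, about 1,732 input values are spread over the same 256 output levels, so each output step covers fewer input values and contrast inside the window increases, while values outside the window are clipped to black or white.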

Six board-certified radiologists participated in the review of the test set. Each radiologist read the full set of 163 cases. Each participant saw one-third of the set presented as conventional film radiographs, one-third as laser-printed images, and one-third with the interactive display system. Each participant was allowed to see each case only once. Consequently, each image in the 489-image data base was read only twice by independent observers.
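The counts in this design can be checked with a short sketch. The text does not say how cases were allocated to formats for each reader, so the rotating scheme below is purely an assumption chosen because it reproduces the stated totals: 163 cases in three formats give 489 images, and six readers who each see every case once give two readings per image.

```python
from collections import Counter

FORMATS = ("conventional film", "laser-printed film", "interactive display")


def assign_formats(n_cases=163, n_readers=6):
    """Return {(reader, case): format_index} using a rotating assignment.

    Hypothetical allocation rule; the study's actual scheme is not described.
    """
    plan = {}
    for reader in range(n_readers):
        for case in range(n_cases):
            plan[(reader, case)] = (case + reader) % len(FORMATS)
    return plan


plan = assign_formats()

# How many readers saw each (case, format) image.
reads_per_image = Counter((case, fmt) for (reader, case), fmt in plan.items())

# How many cases each reader saw in each format.
cases_per_reader_format = Counter((reader, fmt) for (reader, case), fmt in plan.items())
```

Because 163 is not divisible by three, each reader sees 54 or 55 cases per format under this rule, which is consistent with the "one-third" wording only approximately.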

The participants completed reading the test set in two sessions. During the first session, the conventional films and laser-printed images were placed on a standard film alternator. The conventional films and laser-printed images were presented randomly until each radiologist completed the hard-copy reading session. The second reading session consisted of the presentation of the remaining portion of the test set with the interactive display. A minimum of 2 weeks elapsed between the reading sessions. The interactive display sessions always followed the film/digital hard-copy sessions. No reader saw the images of the same patient twice, so there was no bias due to repeated readings of the same case in different formats. The hard-copy reading session was very much like a standard-film reading session, which these observers experience every day. Consequently, any bias in the study is probably related to the unique features of reading from a high-resolution computer display with contrast manipulation available.

During the case review, no radiologist was presented with more than one version of any given case. The order of presentation of normal and abnormal cases was randomized for each reader. The readers were supplied with a notebook containing response forms in which each case was identified only by case number. The reader was asked to make independent responses for each of the nine types of abnormality on a five-point scale identical to that used in establishing diagnostic consensus. The diagnostic criteria for each category of abnormality were reviewed at the beginning of each reading session. Completion of the test set required each observer to assign 1,467 confidence values. Consequently, the data base for analysis of observer performance comprised 8,802 observer responses.

The six participants in the study were selected to reflect a range of experience with interactive digital display systems. Only board-certified radiologists participated in this review. None of the three radiologists involved in case selection was allowed to participate in the study or to be present during the reading sessions. During the review session only the reviewing radiologists and a nonradiologist observer were present in the reading room. The role of the nonradiologist observer was to ensure that the recording of responses on the response form matched the case being reviewed. Interaction between the reviewer and the observer was limited to instructions concerning the recording of data and the operation of the interactive display system. In reviewing the conventional and laser-printed films, the observers were encouraged to maintain their usual viewing habits and reading rates, modifying them only enough to allow time for completion of the response forms. No time limit was imposed for completion of any of the reading sessions.

The observer response data were analyzed using ROC techniques (7). Because ROC techniques cannot be used to analyze multiple abnormalities or to generate a composite curve for all abnormalities, ROC curves for each of the nine disease processes were generated separately (8). For the purposes of this analysis, a case was considered negative if it was either normal or if there were abnormalities other than the one being analyzed.
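The per-abnormality definition of a negative case, and the response counts quoted above, can be sketched as follows. The helper name `truth_labels` and the two example cases are hypothetical; only the nine category names, the 163 cases, and the six readers come from the text.

```python
ABNORMALITIES = (
    "costophrenic angle blunting", "interstitial disease", "atelectasis",
    "pneumothorax", "parenchymal mass", "consolidation",
    "obstructive disease", "hilar/mediastinal mass", "apical scarring",
)


def truth_labels(case_findings):
    """Map one case's consensus findings to a positive/negative label per abnormality.

    A case is negative for an abnormality when it is normal or when it
    shows only abnormalities other than the one being analyzed.
    """
    return {name: (name in case_findings) for name in ABNORMALITIES}


# Hypothetical example cases, not taken from the study's test set.
normal_case = truth_labels(set())
mixed_case = truth_labels({"pneumothorax", "atelectasis"})

# Response counts stated in the text.
n_cases, n_readers = 163, 6
values_per_reader = n_cases * len(ABNORMALITIES)   # 163 cases x 9 categories
total_responses = values_per_reader * n_readers    # summed over 6 readers
```

In this sketch the mixed case counts as positive for pneumothorax and atelectasis and as negative for the other seven categories, so a single examination contributes to nine separate ROC analyses.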

The program used to perform the ROC analysis, Corroc2, uses a maximum-likelihood estimation technique to calculate binormal ROC data, taking into account correlations between pairs of responses for each given case. The program also computes a comparison of various ROC curves by using the χ² test (9). With this program, the ROC analysis describes the relationship of the decision variable to the experimentally determined noise and signal-plus-noise response distributions. The resulting ROC curves are completely characterized by two statistically defined parameters. The “a” parameter is the difference between the means of the two distributions divided by the standard deviation of the signal-plus-noise distribution. The “b” parameter is the quotient of the standard deviation (SD) of the noise distribution divided by the SD of the signal-plus-noise distribution. Given correlated data sets, the Corroc2 program uses a bivariate normal model to estimate joint probability densities. A χ² statistic with 2 df can be constructed from a covariance matrix by using parameters estimated with Corroc2. This statistic can then be used to compare the significance of apparent differences between any two ROC curves (9).

The statistical significance of the differences between the ROC curves was estimated by using three indexes. On the basis of the calculated values for the areas under the curve and their SDs, the significance of their differences was evaluated with the paired two-tailed t test (10). The second index of comparison was the bivariate χ² test just described. The final index of comparison in our study was the calculated value of the true-positive fraction assuming a false-positive fraction of 0.185, designated as TPF.185. The selection of TPF.185 as an appropriate index for comparison was based on a conventional two-by-two contingency-table analysis of observer responses. For purposes of this analysis, observer responses of 1 and 2 were assigned to the “normal” test result, while responses of 3 or greater were considered “abnormal.” Of the 8,802 total responses, 7,588 represented the negative population for each of the abnormalities tested and each display format. For this negative population, there were 1,404 false-positive responses, giving a cumulative false-positive fraction for all abnormalities and all display formats of 0.185. In all cases, the hypothesis tested is that the index for one display modality is equal to that for the modality with which it is being compared. P values of .05 indicate that the compared indexes are different at the 95% confidence level. It should be noted that in any ROC analysis based on data from multiple observers, the SDs for the areas under the curves and for the calculated true-positive fractions reflect both interobserver and intraobserver variations.

RESULTS
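The area and TPF.185 indexes follow from the binormal “a” and “b” parameters defined in Materials and Methods. With a = (mean of signal-plus-noise − mean of noise) / SD of signal-plus-noise and b = SD of noise / SD of signal-plus-noise, the standard binormal relations are TPF = Φ(a + b·Φ⁻¹(FPF)) and area = Φ(a / √(1 + b²)), where Φ is the standard normal distribution function. The sketch below evaluates both relations; the parameter values a = 2.0 and b = 1.0 are hypothetical and are not results from the study, and the code is not the Corroc2 program.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def binormal_area(a, b):
    """Area under a binormal ROC curve with parameters a and b."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


def binormal_tpf(a, b, fpf):
    """True-positive fraction on the binormal curve at a given false-positive fraction."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


# Cumulative false-positive fraction from the counts given in the text.
fpf = 1404 / 7588

# Hypothetical curve parameters, chosen only to exercise the formulas.
a, b = 2.0, 1.0
area = binormal_area(a, b)
tpf_at_fpf = binormal_tpf(a, b, fpf)
```

A curve with a = 0 and b = 1 reduces to the chance diagonal (TPF equal to FPF and an area of 0.5), which is a convenient sanity check for any implementation of these relations.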

The conventional and digital film reading sessions took place in the usual reading room environment. No time limit was imposed, and the time required to complete the reading of the hard copy ranged from 70 to 120 minutes. The interactive display reading session took 50-120 minutes for completion.

The areas under the curve for each display modality and disease process are shown in Table 2, along with the SD for each value. The values for the area under the curve for conventional film ranged from 0.806 for detection of consolidation to 0.982 for the detection of pneumothorax. For the digital images printed on film, the areas under the curves ranged from 0.805 to 0.981, again for the detection of consolidation and pneumothorax, respectively. The areas for the interactive display system ranged from 0.789 for detection of consolidation to 0.951 for the detection of parenchymal masses or nodules. The P values determined by applying the paired two-tailed t test to the area indexes are shown in Table 3. The P values determined by application of the bivariate χ² test to the parameters used to fit the ROC curves are presented in Table 4. Finally, the calculated TPF.185 values and the P values obtained by applying the paired two-tailed t test to these values are shown in Tables 5 and 6, respectively.

In terms of comparative performance, the three display modalities were equivalent by all indexes of comparison for the detection of costophrenic angle blunting, atelectasis, consolidation, apical scarring, and hilar or mediastinal masses. For the detection of obstructive airway disease, the digital images recorded on film were significantly better than the interactively displayed images when these modalities were compared with the χ² test. For the detection of pneumothoraxes,

under dexes

comparisons

of the

performance

with

Costophrenic

film images, but the significance of this decrease depended on the index used for comparison. For the detection of parenchymal masses, the digital images, whether recorded on film or displayed, tended to outperform the conventional film images. For this abnormality, these differences were significant for digital hard copy using any index of comparison, but for the interactive display the difference was significant only when the TPF.185 index was used.

DISCUSSION

Our study was designed to evaluate the performance of radiologists using

for Each Display

Digital

Film

disease

Pneumothorax

Interstitial

disease

mass

Parenchymal

Table

.934(.020)

.907 (.022) .880(.025)

.888(.024)

.805(.053) .902(.038)

.789 (.044)

.913(.022)

.842

.857 (.034)

.892(.025)

.982(010)

.981

.797 (.052) .898(.041)

.919 (.019)

.884 (.021) .956 (.014)

(.023)

.863 (.046)

in parentheses

Note-Numbers

Interactive Display

.871

.918(.033) .826(.052)

mass

of

Class

.890(.031) .806 (.046)

HIlar/mediastinal

Obstructive

and

Film blunting

angle

Modality

Conventional

Atelectasis Consolidation Apical scarri.n

.910(.045)

(.013)

(.048)

.838 (.034) .951 (.020)

are SDs.

3 for Comparison

PValues

of the Calculated

Areas

Costophrenic

Digital

blunting

Digital Film vs Interactive

Film vs Interactive

Film

Display

Display

.65

.23

.37

Atebectasis

.79

.61

.82

Consolidation Apical scarnin

.99 .75

.79 .89

.82 .89

Hilar/mediastinal mass Obstructive disease Pneumothorax Interstitial disease Parenchymal mass

.12 .40 .95

.82 .33

.18

Table

angle

Curves

Conventional

vs Abnormality

the ROC

under

Conventional

.10 .05 .25 .84

.05 .04 .08

.21 .05

4

x2 PValues

Bivaniate

for Comparison

of Display

Formats

Conventional

Conventional

vs Digital Abnormality Costophrenic Atelectasis Consolidation

angle

Apical scarring Hilan/mediastinal Obstructive Pneumothorax

interactive display compared with conventional and digital film images. For interstitial disease, the interactively displayed images again showed decreased performance relative to the conventional and digital

ROC Curves

the

Abnormality

area

the

2

Areas under Abnormality

the curves and the TPF185 inindicated a significant decrease

in observer


Table

Interstitial

Film blunting

mass

disease disease

Parenchymal

mass

Digital

Film vs Interactive

Display

Display

.39 .40

.46 .46

.36 .97

.41 .95

.86 .80

.22 .80

.12

.94

.12

.51

.39

.65

.12

.14 .05

.08 .16

.05 .14 .01 .56

three different diagnostic modalities for the detection of a spectrum of abnormalities commonly encountered in chest radiography. The detection tasks ranged from the relatively simple identification of pneumothoraxes to more difficult determinations such as the identification of areas of early consolidation. A number of normal examinations and a number of examinations with multiple abnormalities were included in the test set because such cases represent a significant population in our practice. The inclusion of cases with multiple abnormalities was also intended to prevent the observers from identifying a particular type of lesion as the object of the study. An attempt was made to include a distribution of abnormalities in terms of difficulty of detection, although no rating of the degree of difficulty of the cases was performed.

No ROC techniques are available for generating a composite curve from a test set that includes multiple abnormalities. Consequently, it is necessary to compare the performance of the various display modalities for each specific disease process. A test set including multiple abnormalities or cases showing multiple abnormalities on a single examination can be used for ROC analysis as long as the observers score each abnormality independently, and as long as the disease categories are not overlapping or interrelated. The instructions to the observers in this study were designed to meet these criteria. When performing the ROC analysis in a study of this type, particular attention must be paid to the definition of

Table

5

True-Positive

Fractions

Assuming

a False-Positive

Fraction

Conventional Abnormality

Costophrenic Atelectasis

Apical

angle

blunting

scarring

Hilan/mediastinal Obstructive Pneumothorax Interstitial

mass

disease disease

Parenchymal

mass

Note-Numbers

for Comparison

(.044) (.062)

.830 .772

(.044) (.053)

.885 .790

.651

(.066)

.688

(.063)

.613(066)

are

.822 (.089)

.839 (.084)

.684

(.079)

.861

(.064)

.717

(.075)

.724

(.068)

.802

(.064)

.654

(.062)

.974

(.028)

.969

(.026)

.865

(.045)

.857

(.041)

.781

(.054)

.723 .917

(.050) (.037)

.936 (.035)

SDs.

of TPF.185 Determined

Abnormality

Film

angle

(.036) (.052)

.861 (.081)

Conventional vs Digital

Costophrenic Atelectasis

Display

.813 .749

.789 (.055)

in parentheses

Table 6 PValues

Interactive

Film

Film

Consolidation

of 0.185

Digital

blunting

with

Two-tailed

Conventional Film vs Interactive

Display

t Test Digital Film vs Interactive Display

.62 .81

.12 .62

.25 .75

Consolidation

.65

.65

.35

Apical

.62

.84

.84

.07

.69

.07

.28

.40

.12

.91

.02

.01

.18 .03

.03 .03

.38 .98

scarring

Hilar/mediastinal Obstructive

mass disease

Pneumothorax Interstitial

Parenchymal

disease mass

phor system and displayed on 2,560 X 2,048-pixel monitors. These investigatons concluded that the interactive display of chest images on high-resolution displays offers an alternative to the viewing of computed chest nadiographs in a hand-copy format. On the basis of this study, the penformance of 2,048 X 2,048 digital hardcopy

imaging

display

true-positive and true-negative examinations. Because pathologic or surgical confirmation is not available for many abnormalities detected with chest radiography, it has become standard practice to establish the truth status of a test case by a consensus of experienced observers and then to exclude these observers from participation in any other aspect of the study. This was the approach taken in the construction of our test set.

In an ROC study based on simultaneous interpretations for multiple abnormalities, care must be taken when assigning the “truth value” to cases demonstrating one or more abnormalities. In such a study there are three ways in which to define a “negative” examination. First, an examination may be defined as negative only if it shows none of the disease processes being tested. In this analysis only examinations showing the abnormality of interest and examinations that show no abnormality of any type are considered. This approach antifactualby shifts the

ROC curve

to the left and increases

the

area under the curve. A second alternative is to define a case as negative if it demonstrates a disease process other than the one of interest. In such an analysis, examinations showing no abnormality are not considered, and the comparison is between examinations that show the abnormality and examinations that show any abnormality othen than the one being analyzed. This approach artifactually shifts the ROC

Volume

176

#{149} Number

3

curve to the night, decreasing the area index. The third approach to the definition of negative is to include both normal cases and cases that do not show the abnormality being tested. This was the option selected for the present study and results in values for the area under the ROC curves that are intermediate between the upper and lower bounds established by the other methods. Regardless of the way in which the class of true-negative results is defined, all of the analyses reveal similar trends in terms of observer performance. Reports of applications of display systems with matrix sizes of approximately 2,048 X 2,048 are just beginning to appear. Hayrapetian et al (6) cornpared observer performance for the detection of septal lines and parenchymal nodules using conventional radiographs, digital hard copy, and 2,048line digital display with and without user interaction. On the basis of their findings, these investigators concluded that 2,048-line displays might be an alternative to conventional film in chest radiography. In a follow-up of this diagnostic study, Widoff et al (1 1) reported that the detection of simulated nodules placed in an anthropomorphic chest phantom using a 2,048 interactive display was comparable to that achieved with analog film. A third study reported by Frank et al (12) examined observer performance for cornputed chest radiographs obtained using a 2,140 X 1,740 X 10-bit storage phos-

is generally equivalent to that of conventional radiography. The only exception to the trend toward statistical equivalence, as determined by means of the area index, is the detection of parenchymal masses. For this task, the digital hard copy shows significantly improved performance compared with the analog film images regardless of the index of comparison used. This improvement in observer performance for detection of parenchymal nodules is likely related to the improved rendition of contrast information by the digitized hard copy.

With regard to the performance of the interactive display relative to analog film or digital hard copy, the area and TPF.185 index comparisons show that the conventional radiographs were significantly better than the interactive display for the detection of interstitial lung disease. The finding that observer performance for the detection of interstitial disease was equivalent for digital hard copy and analog radiography, while the performance of an observer using the interactive display decreased significantly, suggests that the decrease is due to the nature of the interactive display
system rather than being an effect related to the digitization of the images. We believe that this deterioration in performance is related to the unfamiliarity of our observers with soft-copy display of chest radiographs. Because no training sessions were used, the participants necessarily applied the diagnostic criteria used for conventional radiographs in making their assessments. Because of the improved contrast rendition of the digital format compared with that of conventional film, minor adjustments in the window and level parameters of the interactive display tend to accentuate the interstitial markings. Consequently, the ROC curve for the interactive display is shifted to the right and the area index decreases. Another effect observed in this study was a twofold increase in the time required for completion of the interactive reading sessions compared with that for the hard-copy sessions. This increase was due largely to the interactive adjustment of image contrast and brightness. Effects of this sort are reduced or eliminated as observers become more experienced with the display system and as the user interface is improved.
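The effect of the three definitions of a “negative” examination described at the start of this discussion can be illustrated with a small numerical sketch. The ratings below are invented for illustration and are not data from the study, the names `Case`, `split_cases`, and `empirical_auc` are hypothetical, and the nonparametric (rank-based) area is used here as a simple stand-in for the area index reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    truth: frozenset  # abnormalities actually present (empty for a normal case)
    rating: int       # observer confidence (1-5) that the target abnormality is present


def case(rating, *truth):
    """Convenience constructor for the illustrative cases below."""
    return Case(frozenset(truth), rating)


def split_cases(cases, target, definition):
    """Return (positive ratings, negative ratings) under one definition of 'negative'."""
    positives = [c.rating for c in cases if target in c.truth]
    normals = [c.rating for c in cases if not c.truth]
    other_disease = [c.rating for c in cases if c.truth and target not in c.truth]
    if definition == "normal_only":        # first approach in the text
        negatives = normals
    elif definition == "other_only":       # second approach
        negatives = other_disease
    elif definition == "all_non_target":   # third approach (used in the study)
        negatives = normals + other_disease
    else:
        raise ValueError(f"unknown definition: {definition}")
    return positives, negatives


def empirical_auc(positives, negatives):
    """Rank-based area: fraction of (positive, negative) pairs ranked correctly, ties count 1/2."""
    score = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return score / (len(positives) * len(negatives))


# Invented ratings for the target abnormality "pneumothorax" (assumption, not study data).
cases = [
    case(5, "pneumothorax"), case(4, "pneumothorax", "atelectasis"),
    case(4, "pneumothorax"), case(3, "pneumothorax"),
    case(1), case(1), case(2), case(1),                    # normal cases
    case(2, "atelectasis"), case(3, "consolidation"),      # other abnormalities only
    case(4, "interstitial disease"), case(1, "parenchymal mass"),
]

areas = {
    definition: empirical_auc(*split_cases(cases, "pneumothorax", definition))
    for definition in ("normal_only", "other_only", "all_non_target")
}
print(areas)
```

With these invented ratings, the area is largest when only normal cases serve as negatives, smallest when only cases with other abnormalities serve as negatives, and intermediate when both are pooled, which is the ordering described in the text.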

For the detection of pneumothoraxes, the conventional film and digital hard copy are found to be equivalent, while comparisons of the area and TPF.185 indexes indicate a significant decrease in observer performance for the interactive display. That observer performance for the digital hard copy is equivalent to that for conventional film again suggests observer unfamiliarity with the interpretation of images when using interactive display features. Furthermore, there is some inherent loss of edge definition because of the raster scanning operation of the monitor. This loss of definition undoubtedly contributes to diagnostic errors in tasks requiring accurate definition of linear features oriented obliquely or perpendicularly to the raster lines of the video monitor. Theoretically, this effect may be at least partially eliminated by the use of image magnification and edge-enhancement algorithms.

The use of the bivariate χ² test indicated other significant differences in observer performance with the three diagnostic systems. For the detection of obstructive airway disease, there is a significant decrease in observer performance for the interactive display compared with the digital hard copy. Two factors are likely to account for this finding. First, the uniform contrast rendition of the digital hard copy allowed greater sensitivity to regional differences in image contrast. Second, the interactive window and level features of the display tend to confound the observers’ interpretation of these contrast differences. Results of the χ² test also indicated a significant decrease in performance with the interactive display compared with the digitized radiographs in the detection of interstitial disease, as well as a significant decrease in performance with conventional film compared with digital hard copy for the detection of parenchymal masses and nodules.

Several objections to the use of the area and the χ² indexes for comparison of ROC curves can be made (13,14).
Principal among these is the contribution to the area index by the region of the curve at high false-positive fractions. When ROC curves cross or have similar configurations in the region of high false-positive fractions, the area and χ² indexes are perceived as relatively insensitive to local variations in observer performance in the range of false-positive fractions encountered in the clinical setting. For this reason, comparison of the calculated true-positive fractions at a selected false-positive fraction, or of the average true-positive fraction over a range of clinically relevant false-positive values, has been advocated by some investigators (4,7,14). For the final comparison of observer performance in our study, the TPF.185 index was selected on the basis of the calculated average false-positive fraction for all abnormalities and all display formats tested rather than on an arbitrary assessment of a “clinically relevant” false-positive fraction or range of false-positive fractions.

One difficulty with the display system used for this study was related to the phosphor system selected for the monitor. The original version of the 2,560 × 2,048 monitor used a white phosphor (#P167). This resulted in a noticeable orange tint that tended to distract the observers. The output of the monitor was also characterized as “dim” compared with the back-lit conventional and digital hard copy. The manufacturer has recently changed the phosphor to provide a blue tint (#P104) similar to the tint used in the base of many radiographic films. The luminance of the monitor has also been significantly increased. We believe that these improvements will significantly enhance performance with the interactive display.

Our study was designed to accentuate any differences in observer performance between conventional radiographs, digital hard copy at 2,048 × 2,048 pixels, and interactively displayed 2,560 × 2,048 × 12-bit images. Our results suggest that for certain abnormalities, the performances with the three display formats are not equivalent. Our findings do indicate that for all abnormalities tested, the digital hard copy performed as well as or better than conventional film. We also found that in some instances, the performance of the interactive display system failed to match that of digital hard copy or of conventional film (15).
Although the causes of these differences can probably be reduced or eliminated by further experience with the display system and by applying image enhancement, it is premature to conclude that the present generation of 2,560 × 2,048 displays can produce images equivalent to those on conventional film for all detection tasks. It is equally premature to conclude that the interactive display of images at 2,560 × 2,048 pixels offers no advantage over conventional film, since observers using the interactive system show significant improvements in performance for certain tasks, such as the detection of parenchymal nodules.
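The objection to area-based indexes raised in the discussion, and the reason for also comparing the true-positive fraction at a false-positive fraction of 0.185, can be sketched numerically. The example assumes the conventional binormal ROC model, in which TPF = Φ(a + b·Φ⁻¹(FPF)) and the area is Φ(a/√(1 + b²)); the text does not state which curve-fitting model was used, and the parameter values and function names below are hypothetical, chosen only so that two crossing curves have the same area.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def binormal_area(a, b):
    """Area under a binormal ROC curve with intercept a and slope b."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


def binormal_tpf(a, b, fpf):
    """True-positive fraction of a binormal ROC curve at a given false-positive fraction."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


# Reference false-positive fraction quoted in the text (TPF.185 index).
FPF_REFERENCE = 0.185

# Hypothetical binormal parameters (a, b) for two display formats.
# The second intercept is chosen so that both curves have the same area.
curve_1 = (1.0, 1.0)
curve_2 = (sqrt(0.5) * sqrt(1.25), 0.5)

area_1 = binormal_area(*curve_1)
area_2 = binormal_area(*curve_2)
tpf_1 = binormal_tpf(*curve_1, FPF_REFERENCE)
tpf_2 = binormal_tpf(*curve_2, FPF_REFERENCE)

print(f"area index: {area_1:.3f} vs {area_2:.3f}")
print(f"TPF at FPF = {FPF_REFERENCE}: {tpf_1:.3f} vs {tpf_2:.3f}")
```

In this sketch the two curves are indistinguishable by area, yet the second has the higher true-positive fraction at the reference false-positive fraction and the lower one at high false-positive fractions, which is the situation in which a fixed-FPF index is more informative than the area index.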

References

1. Chakraborty DP, Breatnach ES, Yester MV, Soto B, Barnes GT, Fraser RG. Digital and conventional chest imaging: a modified ROC study of observer performance using simulated nodules. Radiology 1986; 158:35-39.

2. Foley WD, Wilson CR, Keyes CS, et al. The effect of varying spatial resolution on the detectability of diffuse pulmonary nodules: assessment with digitized conventional radiographs. Radiology 1983; 141:25-31.

3. Goodman LR, Foley WD, Wilson CR, Rimm AA, Lawson TL. Digital and conventional chest images: observer performance with film digital radiography system. Radiology 1986; 158:27-33.

4. Lams PM, Cocklin ML. Spatial resolution requirements for digital chest radiographs: an ROC study of observer performance in selected cases. Radiology 1986; 158:11-19.

5. MacMahon H, Vyborny CJ, Metz CE, Doi K, Sabeti V, Solomon SL. Digital radiography of subtle pulmonary abnormalities: an ROC study of the effect of pixel size on observer performance. Radiology 1986; 158:21-26.

6. Hayrapetian A, Aberle DR, Huang HK, et al. Comparison of 2048-line digital display formats and conventional radiographs: an ROC study. AJR 1989; 152:1113-1118.

7. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986; 21:720-733.

8. Rockette HE, Gur D, Cooperstein LA, et al. Effect of two rating formats in multi-disease ROC study of chest images. Invest Radiol 1990; 25:225-229.

9. Metz CE, Wang P-L, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Deconinck F, ed. Information processing in medical imaging. The Hague: Martinus Nijhoff, 1984; 431-445.

10. Miller I, Freund JE. Probability and statistics for engineers. Englewood Cliffs, NJ: Prentice-Hall, 1965; 166.

11. Widoff B, Aberle DR, Brown K, et al. Hard copy versus soft copy display of 2,000 digital chest images: ROC study with simulated lung nodules (abstr). Radiology 1989; 173(P):401.

12. Frank MS, Jost RG, Blaine GJ, Moore SM, Whitman RA, Hagge R. Interpretation of mobile chest radiographs from a high-resolution CRT display (abstr). Radiology 1989; 173(P):401.

13. Habicht JP. Assessing diagnostic technologies (abstr). Science 1980; 207:1414.

14. Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. CRC Rev Diagn Imaging 1989; 29:307-335.

15. Slasky BS, Gur D, Good WF, et al. Receiver operating characteristic analysis of chest image interpretation with conventional, laser-printed, and high-resolution workstation images. Radiology 1990; 174:775-780.

September

1990

Chest radiography: comparison of high-resolution digital displays with conventional and digital film.
