Original Research

Routine Quantitative Analysis of Brain and Cerebrospinal Fluid Spaces with MR Imaging¹

Ron Kikinis, MD • Martha E. Shenton, PhD • Guido Gerig, PhD • John Martin, MS • Mark Anderson, BS • David Metcalf, BS • Charles R. G. Guttmann, MD • Robert W. McCarley, MD • William Lorensen, MS • Harvey Cline, PhD • Ferenc A. Jolesz, MD

A computerized system for processing spin-echo magnetic resonance (MR) imaging data was implemented to estimate whole brain (gray and white matter) and cerebrospinal fluid volumes and to display three-dimensional surface reconstructions of specified tissue classes. The techniques were evaluated by assessing the radiometric variability of MR volume data and by comparing automated and manual procedures for measuring tissue volumes. Results showed (a) the homogeneity of the MR data and (b) that automated techniques were consistently superior to manual techniques. Both techniques, however, were affected by the complexity of the structure, with simpler structures (eg, the intracranial cavity) showing less variability and better spatial correlation of segmentation results between raters. Moreover, the automated techniques were completed for whole brain in a fraction of the time required to complete the equivalent segmentation manually. Additional evaluations included interrater reliability and an evaluation that included longitudinal measurement, in which one subject was imaged sequentially 24 times, with reliability computed from data collected by three raters over 1 year. Results showed good reliability for the automated segmentation procedures.

Index terms: Brain, MR, 10.1214 • Cerebrospinal fluid, MR, 10.1214 • Comparative studies • Image display • Image processing • Three-dimensional imaging • Volume measurement

JMRI 1992;2:619-629

Abbreviations: CSF = cerebrospinal fluid, ICC = intracranial cavity, ROI = region of interest, SD = standard deviation, 3D = three-dimensional.

MOST MAGNETIC RESONANCE (MR) imaging assessments of pathologic changes in the brain have relied on subjective visual interpretation of two-dimensional cross sections. Such interpretations are limited because (a) they do not allow accurate quantitative analysis, (b) boundary definitions are inexact, and (c) a pixel-by-pixel analysis is not feasible. The superior spatial and contrast resolutions of current MR images, however, make it possible to quantify pathologic changes in the brain, changes that are evident in many brain disorders. A prerequisite for quantifying and visualizing morphometric changes in whole brain, or in specific tissue, is to be able to reliably identify the relevant structures. We have recently developed a computerized method of segmentation (1) based on a multistep procedure. A crucial element of this procedure is the optimization of the MR imaging data set (2), including the application of a filter to reduce noise (3). A two-stage multistep segmentation procedure that includes volumetric analysis of specified tissues and/or structures is then applied. Finally, three-dimensional (3D) renderings from surface models are generated for the evaluation of morphometric features. Herein, we report the reliability data obtained with these procedures (including newly developed, semiautomated generation of a mask for identifying the intracranial cavity [ICC]) in a subset of cases in which we compare automated techniques with manual measurements for both interrater and intrarater reliability estimates.

¹ From the Departments of Radiology (R.K., J.M., M.A., D.M., C.R.G.G., F.A.J.) and Psychiatry (M.E.S., R.W.M.), Harvard Medical School, Brigham and Women's Hospital, 75 Francis St, Boston, MA 02115; the Communication Technology Laboratory, Image Science Division, ETH, Zurich, Switzerland (G.G.); and GE Corporate Research and Development Center, Schenectady, NY (W.L., H.C.). Received April 13, 1992; revision requested April 14; revision received and accepted September 4. Supported in part by grants from the Swiss National Foundation (R.K., C.R.G.G.); by a Research Scientist Development Award from NIMH (K01-MH0074605), a Milton grant, and a Scottish Rite grant (M.E.S.); by National Institutes of Health grants P01 CA41167, 5-K04-NS011083, and 2 P01 AC04953, and by a grant from NYNEX (F.A.J.); by the Department of Veterans Affairs Medical Research Service and NIMH grant 40,799 (R.W.M.); by the Theodore Vada Stanley Research Award (M.E.S., R.K.) and the Whitaker Foundation (R.K.); and by Swiss National Science Foundation grant 4018-11082 (C.R.G.G.). Address reprint requests to R.K. SMRI, 1992


MATERIALS AND METHODS

Patient Subjects
The data came from several sources, including (a) an MR imaging study of a patient with multiple sclerosis, (b) a neurosurgical patient selected for surgical planning, and (c) 15 healthy male control subjects evaluated prospectively in an ongoing study of schizophrenia. The average age of the control subjects was 38 years (range, 23-54 years). All were screened for drug and alcohol abuse and for psychiatric disorders; no subject was receiving medications with known effects on brain volume (eg, steroids) (4).

Data Acquisition Protocol
All brain images were acquired with the same 1.5-T Signa system (GE Medical Systems, Milwaukee). A double-echo spin-echo acquisition, covering the whole brain, was performed in the axial plane. The section thickness was 3.0 mm, and sections were acquired contiguously (no gap) by combining two interleaved sequences in the individual acquisitions. Half-Fourier sampling (0.5 excitations) at 54 section locations was done in 12 minutes with 192 phase-encoding steps, TEs of 30 and 80 msec, and a TR of 3,000 msec. The field of view was 24 cm. To reduce flow artifacts, we used a gradient-moment-nulling flow-compensation technique (2,5). See Image Acquisition in Figure 1.

Image Processing
The data were transferred through an Ethernet connection to our Sun workstations (Sun Microsystems, Mountain View, Calif), where images were processed. The techniques used were based on a multistep approach (1,6,7). Briefly, the gray-scale images were translated into label classes, in which each label represented an entity defined by the operator, such as gray matter, white matter, ventricular and/or subarachnoidal cerebrospinal fluid (CSF), and lesions. The operator defined the seed points for each tissue class, thus providing the initial information. It is important to emphasize, however, that the actual classifications and resulting label maps were done automatically (see below). These label maps, in turn, were easily accessible for computer analysis, which resulted in volume determinations for the different label classes and/or 3D reconstructions derived from them.

Noise reduction with an anisotropic diffusion filter (3).-A filter was applied to each set of images to reduce noise without blurring fine morphologic details (see Image Processing in Fig 1). This filter was based on the simulation of anisotropic diffusion of heat originally reported by Perona and Malik (8) and subsequently adapted for double-echo MR imaging by Gerig et al (3,6). Two user-specified parameters were necessary for this implementation: (a) the number of iterations and (b) the threshold level to distinguish noise and real signal (k value). These two parameters were determined empirically by repeatedly applying the filter at different settings to a set of images and by selecting the image with the optimal parameter combination, which, in the present study, was found to be three iterations and a k value of 8. These parameter values were then applied to all data sets. Figure 2a

JMRl

November/December 1992

[Figure 1 flowchart. Image Acquisition: magnet field strength; data acquisition protocol (pulse sequence, slice thickness, field of view). Image Processing: preprocessing filter; segmentation (supervised multivariate analysis, classification, grouping). Measurement boxes: 1D = linear, 2D = area. The remaining boxes are not recoverable.]

Figure 1. Flowchart gives an overview of the data acquisition and processing procedures used in the Surgical Planning Laboratory. Multivar. = multivariate, 1D = one-dimensional, 2D = two-dimensional.

and 2b show unfiltered double-echo gray-scale images, and Figure 2c and 2d show the filtered images for this image pair. Note the improved quality of the latter images.

Supervised segmentation into tissue classes.-After application of the filter, a segmentation algorithm, based on a multivariate analysis, was used to differentiate tissue classes. In using this algorithm, the operator begins by providing sample points selected from a corresponding set of gray-scale images (about 20 sample points per tissue class) that are then used to calculate a classificator with a nonparametric statistical algorithm (kNN [nearest-neighbor supervised classification]). This classificator basically represents a two-dimensional lookup table for efficiently assigning the most probable category to the double-valued measurements of each voxel. If the operator is not satisfied with the results, additional "training" points can be picked from either of the two gray-scale images or from the label maps, and the classificator can then be recalculated. Generally, this last step was not necessary for an experienced operator. Figure 3 shows the color-coded display of the different tissue classes.

Generation and application of a mask of the ICC.-By using our acquisition parameters, we found several structures outside the ICC that had signal intensity distributions similar to those of some structures within the ICC (eg, the orbits were classified as CSF). To address this problem, we generated a mask of the ICC by first changing all labels that represented gray matter, white matter, and CSF into one class and then labeling everything else as background. The area that represented the ICC was then eroded (6) to break connections that might exist between the ICC and extraneous structures (eg, the optic nerve that connects the ICC with the orbits). Supervised connectivity was then used to remove the extraneous structures, and a dilation was subsequently performed to reverse the erosion of the edited ICC. To remove the holes left by vessels, which were generally classified as background (when vessels were not selected as a tissue class), a modified connectivity algorithm was used.

Application of the mask to the segmentation results.-The mask generated in the previous step was applied to the segmented images, and all labels outside the mask were reset to the background value.

Figure 2. Original gray-scale images before (a, b) and after (c, d) filtering show an axial section at the level of the orbits in a patient with multiple sclerosis. First- (a, c) and second-echo (b, d) images (proton-density- and T2-weighted, respectively). Comparison of white matter on the unfiltered and filtered images demonstrates the removal of the salt-and-pepper texture and enhancement of the boundaries between tissues on the latter.

Supervised connectivity for additional classes.-

Where necessary, additional classes, such as ventricular versus subarachnoidal CSF, were identified interactively by applying the connectivity algorithm to specified voxels.

Volume 2 • Number 6 • JMRI

Figure 3. User interface for sampling. The images in the top half are the first- (left) and second-echo images. The result of the segmentation is shown in the lower-left panel, and the classificator of feature space is shown in the lower-right panel, with actual sample points that were used to calculate the map (blue = CSF, yellow = white matter lesions, pink = skin, and gray = gray matter).

Generation of 3D reconstructions and determination of the volume of the different label classes.-By using the dividing cubes algorithm, surface models of the different tissue classes were generated (9,10). These models were then interactively evaluated alone and in relation to other tissue classes. Volumes were obtained by adding up the number of single voxels in each label class. This number was multiplied by the volume in milliliters of each voxel to obtain the volume of each label class in milliliters (Fig 1). Figure 4 illustrates the final segmentation of brain into tissue classes.
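The voxel-counting step lends itself to a short illustration. The following is a minimal Python sketch, not the authors' implementation. The label codes, the toy label map, and the 256 x 256 matrix are assumptions made for the example; the matrix size is not given in the text. With the stated 24-cm field of view and 3.0-mm sections, that assumed matrix would give voxels of about 0.94 x 0.94 x 3.0 mm.

```python
from collections import Counter

# Hypothetical label codes; the paper does not specify numeric values.
BACKGROUND, GRAY, WHITE, CSF = 0, 1, 2, 3


def voxel_volume_ml(fov_mm, matrix, thickness_mm):
    """Volume of one voxel in milliliters (1 mL = 1,000 mm^3)."""
    pixel_mm = fov_mm / matrix
    return pixel_mm * pixel_mm * thickness_mm / 1000.0


def label_volumes_ml(label_map, voxel_ml):
    """Count voxels per label in a 3D nested list and scale to milliliters."""
    counts = Counter(v for section in label_map for row in section for v in row)
    return {label: n * voxel_ml for label, n in counts.items() if label != BACKGROUND}


# Assumed geometry: 240-mm field of view, 256 x 256 matrix, 3.0-mm sections.
VOXEL_ML = voxel_volume_ml(240.0, 256, 3.0)

# Toy label map: 2 sections of 2 x 3 voxels each.
toy = [
    [[GRAY, WHITE, CSF], [GRAY, WHITE, BACKGROUND]],
    [[GRAY, WHITE, WHITE], [BACKGROUND, CSF, CSF]],
]
volumes = label_volumes_ml(toy, VOXEL_ML)
```

Composite classes follow by addition, for example brain = gray matter + white matter, which is how the composite structures are described later in the text.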

Specific Analyses

Radiometric Variability of MR Volume Data
A classification based on absolute signal intensity values assumes that a given tissue class yields voxels with constant values. To test radiometric variability, we chose white matter areas as our reference class because these areas represent large regions throughout the data sets. At different locations, within sections and in different sections, we selected regions of interest (ROIs) representing white matter areas. The statistical parameters assessed, therefore, reflected a mixture of the variability in anatomy (white matter is not completely homogeneous), a human factor introduced by interactive selection of the ROIs, and radiometric distortion.

Reliability of Image Processing Measurements
To critically evaluate our automated segmentation approach, we (a) correlated the automated segmentation results with manual measurements performed by the same five experienced operators (R.K., M.E.S., and three raters who had spent more than 1 year segmenting brain images with the automated techniques used in our laboratory) in one section showing the largest body of the lateral ventricles, (b) assessed results of automated segmentation performed by three


operators in a single case (54 section levels, with two sections [first and second echo] per level), (c) examined overall reproducibility by imaging the same subject 24 times and segmenting the data set, and (d) analyzed the volume of whole brain (gray matter and white matter) and CSF in a small but carefully defined group of healthy, right-handed male control subjects (n = 15) whose MR images were collected prospectively from the general population in the Boston area. The methods used for the four evaluations are described in detail below.

Supervised automated segmentation compared with manual measurement in a single section.-To assess interrater accuracy and to reference our automated measurements to another available method, a section pair (first and second echo) was selected at the level of the cella media of the lateral ventricles in a healthy volunteer. The five experienced raters each then independently traced the outline of the ICC, the brain surface, the surface of the white matter, and the CSF on the same filtered image (T2-weighted image). For automated segmentation, the five raters selected training points on the same image for gray matter, white matter, and CSF, and then the segmentation algorithm was used to compute the surface area encompassed by each of the tissue classes. The ventricles were identified with a connectivity algorithm.

To further assess statistical and/or systematic error sources, the boundary length for each tissue class was calculated. On a binary raster image, a boundary is determined by a continuous series of "cracks" between adjacent pixels. To correct for the staircase effect of diagonal boundaries, steps were interpolated by connecting the centers of neighboring cracks (polygon-fit procedure). Because of the limited resolution of the raster images and the binary decision process used to assign a category to each pixel, we expected the segmentation error to increase with decreasing size and increasing complexity of the tissue class boundaries. The numbers in Figure 5b are assigned in order of increasing complexity of the boundary line, with 1 being the least complex. The complexity of an object can be expressed as a percentage by comparing the boundary length (P) of an object with its area (A) by the following: $C = (P/A) \times 100$. In a circular object with radius $r$, $C$ varies with $2/r$. This results in small values for large objects and large values for small objects. $C$ can be described as the relative boundary length per pixel. If an algorithm results in an inaccurate estimation of the boundary pixels, $C$ expresses the rate of change for the area calculation. Small objects or objects with a complex boundary are thus more likely to be affected by changes in the boundary pixels than are large, compact objects.

Figure 4. Final result of the segmentation procedure. (a) Section that has been segmented and separated from bone is shown. Gray matter is shown in gray, white matter in green, white matter lesions in yellow, and CSF in blue. (b) Three-dimensional reconstruction derived from all sections in the brain. White matter lesions around the ventricular system are shown in yellow, skin in brown, gray matter in gray, and CSF in violet.

Supervised automated segmentation in a single subject by three raters.-For further estimation of reliability, a single study consisting of 108 sections was analyzed by three raters. The segmentation was performed for brain, white matter, gray matter, and CSF, and the reliability for the volumes of each of these tissue classes as determined by the three raters was calculated.

Longitudinal analysis of the same brain.-Data from a female patient were initially obtained at weekly intervals over 8 weeks, then biweekly, and later monthly for a total of 24 examinations. This represents the most rigorous test of the overall reliability of our measurement system, because all components of

analysis and measurement were involved. This patient was studied over a period of 1 year, and thus we have data that reflect the stability of the MR imager and data that show the reliability of image processing performed by three different raters (the three raters evaluated nonoverlapping examinations, n = 8 for each).

Application of the method to a small, well-defined group.-Data from a prospective study of 15 healthy, right-handed male control subjects were used to illustrate the kind of information that can be obtained with these procedures. Volumes were determined for whole brain, gray matter, white matter, CSF, subarachnoid CSF, and ventricles. The ventricles were separated further from subarachnoid CSF by applying a connectivity algorithm to the classified images. Three-dimensional reconstructions of specified tissue were then rendered (see below).

Three-dimensional reconstructions.-In all 15 cases, 3D reconstructions of the brain surface were generated. In selected cases, additional 3D reconstructions were generated for all available tissues.
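The complexity measure defined earlier in this section, C = (P/A) x 100, can be checked numerically. The Python sketch below is illustrative only, and the function names are hypothetical. It confirms that C for a circle of radius r equals 200/r (that is, it varies with 2/r) and recomputes two manual-segmentation entries of Table 3 from their tabulated perimeter and area.

```python
import math


def complexity_percent(perimeter, area):
    """Relative boundary length per pixel, C = (P / A) x 100."""
    return perimeter / area * 100.0


def circle_complexity(r):
    """C for a circle of radius r pixels: (2*pi*r) / (pi*r^2) x 100 = 200 / r."""
    return complexity_percent(2.0 * math.pi * r, math.pi * r * r)


# Large, compact objects give small C; small objects give large C.
c_large = circle_complexity(80.0)  # about 2.5
c_small = circle_complexity(5.0)   # about 40

# Perimeter and area values taken from Table 3 (manual segmentation).
c_icc = complexity_percent(555.9, 19655.4)  # about 2.8 (ICC)
c_sas = complexity_percent(1870.2, 2212.0)  # about 84.5 (subarachnoid CSF)
```

A circle has the shortest boundary for a given area, so any real structure of the same area has a higher C; this is why the subarachnoid CSF, with its thin and convoluted shape, ranks as the most complex class.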

RESULTS

Radiometric Variability of the MR Volume Data
Our first goal was to determine the reliability and consistency of our imager. Accordingly, we measured whether the same tissue had the same signal intensity in different locations in one representative section and in different sections. This was done by selecting ROIs in white matter from different quadrants of the image. Thus, at different locations within sections and in different sections, we selected ROIs representing white matter exclusively. The multivariate statistical parameters within single ROIs and of the complete population of ROIs were compared. These values are shown in Tables 1 and 2. The results demonstrate the excellent homogeneity of the MR data. (Head phantom data, collected to determine the validity of our measurements, have previously shown that the measurement error for brain parenchyma is 4%-6% [11].)

[Figure 5b plot, titled "Complexity vs Overlap"; the complexity axis runs from 0 to 120.]

Figure 5. Interrater reliability. (a) Example of segmentation result used for the analysis in b and in Table 3. Subarachnoid CSF is shown in light blue, gray matter in brown, white matter in yellow, and ventricles in dark blue. (b) Increasing tissue complexity (see text) is plotted versus percent overlap for four of five raters. Five experienced raters determined the areas of the same structures with supervised multivariate analysis and manual measurements. There was better overlap for each of the six structures with the automated procedure than with manual segmentation. In addition, automated segmentation provided more complexity in the more complex structures, reflecting a more consistent handling of the partial-volume data. 1 = ICC, 2 = brain, 3 = ventricles, 4 = white matter, 5 = gray matter, and 6 = subarachnoid CSF.

Reliability of Image Processing Measurements
Supervised multivariate analysis compared with manual measurements.-For both the automated and manual measurements, the following areas were determined: white matter, gray matter, the ICC, the ventricular system, and subarachnoid CSF. For the automated procedures the five raters selected training points for the tissue classes (see Materials and Methods), and for the manual measurements the five raters used the cursor to draw a line that followed the boundary of each tissue class. Composite structures were determined by adding up the areas of their constituent parts (eg, brain = gray matter + white matter). Figure 5 shows the computed measurements for gray matter, white matter, and CSF based on the automated results from the five raters and the manual measurements done by the same raters. Because the absolute size of these structures varied greatly, the data were normalized by using the average of all 10 measurements from each structure and then calling this average the 100% value for the ICC, with which the individual values were then compared. As

Table 1
Radiometric Variability of MR Volume Data in One Imaging Section

                                     Mean Volume (mL)
Location             No. of Voxels   First Echo   Second Echo   σ1     σ2
Upper left           20              477.1        220.4         30.7   7.2
Upper right          20              487.5        201.9         20.5   7.1
Lower left           20              483.6        234.7         22.3   7.6
Lower right          20              510.7        230.7         23.3   9.5
Equally distributed  200             487.2        216.1         26.5   15.1

Note.-σ1 and σ2 are the standard deviations (SDs) (in milliliters) for the first and second echoes, respectively. Samples of white matter signal intensity were obtained in the various locations. There is an increase in the first-echo mean value in the lower-right quadrant of the image and a decrease in the second-echo mean value in the upper-right quadrant. Compared with the values of the equally distributed voxel population, the maximal changes are +23.5 (+4.8%) and -10.1 (-2.1%) for the first-echo values and +18.6 (+8.6%) and -14.2 (-6.6%) for the second-echo values.

expected, the scatter between the raters increased with increasing complexity of the structures (Fig 5 and overlap in Table 3). That is, deviations between methods (error) increased with increasing complexity. However, although there was clearly a methodical error, in particular in the more complex tissues,

Table 3
Complexity Analysis

Tissue Class              Area (pixels²)*   SD (%)*   Perimeter (pixels)†   Complexity (%)†   Overlap (≥4 raters) (%)
Manual segmentation
  ICC                     19,655.4          0.9       555.9                 2.8               99.3
  Brain                   15,914.8          2.2       1,874.2               11.6              95.9
  Ventricles              1,528.6           3.6       256.7                 16.8              93.9
  White matter            8,169.6           9.7       1,488.5               18.2              86.2
  Gray matter             7,745.2           7.7       3,180.3               41.1              80.4
  SAS                     2,212.0           18.6      1,870.2               84.5              63.4
Supervised segmentation
  ICC                     19,814.8          0.5       608.6                 3.1               99.6
  Brain                   17,369.4          0.8       1,611.8               9.3               99.4
  Ventricles              1,518.6           2.1       278.4                 18.3              99.5
  White matter            8,595.6           6.6       2,853.5               33.2              92.4
  Gray matter             8,773.8           6.5       4,252.1               48.5              92.6
  SAS                     926.8             15.5      1,083.6               116.9             81.8

Note.-The boundary was extracted as a series of cracks between background and object structure and approximated with a polygon fit. These data are represented graphically in Figure 5b. SAS = subarachnoid CSF.
* The areas are given as the mean and SD of five segmentations.
† The perimeters were measured in one case. Complexity is equal to the perimeter divided by the area, × 100.


Table 2
Radiometric Variability of MR Volume Data from Different Imaging Sections

                                      Mean Volume (mL)
Axial Section Pair   No. of Voxels   First Echo   Second Echo   σ1     σ2
1                    200             491.9        223.3         32.5   17.7
2                    200             496.3        249.9         19.6   10.9
3                    200             483.0        219.7         37.5   23.5
4                    200             479.5        213.0         30.5   20.6
5                    200             487.7        219.2         29.0   17.8
6                    200             487.2        216.1         26.5   15.1
7                    200             492.2        228.1         15.4   14.8
8                    200             484.0        226.2         21.0   14.7

Note.-σ1 and σ2 are the SDs (in milliliters) for the first and second echoes, respectively. This table and Table 1 allow comparison of mean values of different sample groups within white matter throughout the data set.

the variability within one method was relatively small (good interrater reliability). A measure of complexity was also used to compare the automated and manual ratings. As described in the Materials and Methods section, the boundaries of the segmented objects were extracted and approximated by the polygon-fit procedure. Table 3 lists the areas, boundary lengths, and complexity values. The results are different for manual and automated segmentations but nonetheless show the same trends: Complexity increases from the ICC, which is the least complex, to white matter, gray matter, and subarachnoid CSF, which is the most complex. This order of increasing complexity is also reflected by the SDs and the measurements for degree of overlap (ie, with increasing complexity, the SD becomes larger and the degree of overlap becomes smaller). The correlation of SD and overlap with C can be expressed numerically: The correlations between SD and complexity were r = .927 for manual and r = .985 for automated segmentation, and the correlations between overlap and complexity were r = -.979 for manual and r = -.976 for automated segmentation. These results illustrate that the reliability decreases for objects of complex shape. On the basis of these results, it is possible to define confidence boundaries for detecting statistically significant differences between tissues.

A closer examination of the pixel-by-pixel overlap between manual and automated segmentation, however, shows that although both methods resulted in a similar number of white matter voxels being classified (8,169 manual vs 8,595 automated), setting our criteria to a minimum of four raters obtaining identical classifications per pixel resulted in 86.2% overlap of pixels for manual and 92.4% for automated techniques (Table 3). Moreover, although selecting the training points for the segmentation was done in one section, this information can be used to segment the entire brain (all sections). In contrast, the manual measurements, which took approximately 2 hours to complete as opposed to 10 minutes for the computed measurements, were for only one section. Another 60-80 hours would have been needed to complete manual measurements for all sections.

Supervised automated segmentation of a whole brain data set by three raters.-Table 4 lists the results of analysis of the same whole brain data set by three raters. As for a single section pair (Table 3), brain and ICC could be measured with high reliability, whereas measurements of white matter, gray matter, white matter lesions, and CSF showed greater variability, though still quite acceptable reliability.

Longitudinal analysis of the same subject over several weeks.-Figure 6 plots ICC and brain volumes derived from measurements over time in a single female patient. We found that the volume of the ICC, which is anatomically bounded mostly by bone and should accordingly not change, was very stable (SD was 1.2% of the mean). Brain volume was found


Table 4
Interrater Variability for Three Raters

                        Volume (mL)
Tissue Class     Rater 1    Rater 2    Rater 3    Mean       Error Bounds (%)
First run
  ICC            1,495.8    1,486.9    1,474.3    1,485.7    +0.7, -0.8
  Brain          1,342.8    1,333.0    1,358.6    1,344.8    +1.0, -0.9
  CSF            129.7      132.5      101.6      121.3      +9.3, -16.2
  White matter   601.1      661.0      853.2      705.1      +21.0, -14.7
  Gray matter    741.7      672.0      505.4      639.7      +15.9, -21.0
  WML            11.7       10.7       7.1        9.8        +19.0, -28.0
Second run*
  ICC            1,448.0    1,435.2    1,441.3    1,441.5    +0.4, -0.4
  Brain          1,302.2    1,283.1    1,301.1    1,295.5    +0.5, -1.0
  CSF            145.7      152.1      140.2      146.0      +4.2, -4.0
  White matter   613.1      637.7      679.1      643.3      +5.6, -4.7
  Gray matter    679.4      630.8      614.4      641.5      +5.9, -4.2
  WML            9.7        14.6       7.7        10.7       +37.1, -28.2

Note.-WML = white matter lesion.
* Results obtained by same three raters, using same data set, 3 months after first run.

to fluctuate slightly more over the same time (SD was 6.34% of the mean). This correlates well with the higher complexity value for the structure of the brain. This consistency of measurements over time was obtained even though three different raters segmented the data sets.

Application of the method to a small, well-defined group.-Table 5 lists the volumetric results from the 15 healthy control subjects. The trend that we found in the single sections (see above) was repeated here. The scatter, as reflected in the SD, increased from 6% for the ICC to 28% for the subarachnoid CSF. Because the volumes were determined in different subjects, the natural scatter within the group added to the methodical error.

Three-dimensional reconstructions.-Our 3D reconstructions of the skin surface and brain were generated from the same segmentation results used to calculate the tissue volumes (Table 5). The surface anatomy is shown in Figure 7.
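The correlations of SD and overlap with complexity reported above for the single-section analysis can be re-derived from the values printed in Table 3. The following Python sketch is a hedged illustration rather than the authors' computation: the helper name `pearson_r` is ours, and the column assignment follows our reading of Table 3. It computes the Pearson correlation coefficient across the six tissue classes.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Table 3 values, ordered ICC, brain, ventricles, white matter, gray matter, SAS.
manual = {
    "sd": [0.9, 2.2, 3.6, 9.7, 7.7, 18.6],
    "complexity": [2.8, 11.6, 16.8, 18.2, 41.1, 84.5],
    "overlap": [99.3, 95.9, 93.9, 86.2, 80.4, 63.4],
}
automated = {
    "sd": [0.5, 0.8, 2.1, 6.6, 6.5, 15.5],
    "complexity": [3.1, 9.3, 18.3, 33.2, 48.5, 116.9],
    "overlap": [99.6, 99.4, 99.5, 92.4, 92.6, 81.8],
}

r_sd_manual = pearson_r(manual["complexity"], manual["sd"])
r_sd_auto = pearson_r(automated["complexity"], automated["sd"])
r_ov_manual = pearson_r(manual["complexity"], manual["overlap"])
r_ov_auto = pearson_r(automated["complexity"], automated["overlap"])
```

Under this reading of the table, the four coefficients agree to three decimal places with the values given in the text (.927, .985, -.979, and -.976). With only six tissue classes per method, the coefficients describe a trend and should be interpreted cautiously.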

DISCUSSION

In recent years, the quality of MR imaging data has improved considerably. This is because advanced features in the hardware and software of commercial imagers have provided more spatially homogeneous data and a better signal-to-noise ratio. Such improvements have allowed us to optimize MR imaging acquisition parameters and to apply multichannel image processing to MR data sets, neither of which would have been possible without such improvements. We have now successfully applied these newly developed image processing techniques to more than 365 routine clinical MR studies of the brain. Herein we have reported an analysis of the reliability and reproducibility of the method in a subset of these cases.

With few exceptions (6), earlier attempts have not explicitly used the concept of multiple steps but have relied on one or a few algorithms. These earlier attempts can be divided into two types: (a) single-pixel-based methods (eg, 12) and (b) neighborhood algorithms (eg, 13).


Single-pixel-based classification of MR images, such as by windowing or multivariate analysis, failed because of the inhomogeneity of the available data (14-16). Similarly, attempts to apply neighborhood algorithms to gray-scale data did not result in automated procedures. Different edge-detection schemes have been used, but thus far all have relied on substantial user interaction (17) and/or supervised edge tracing (13,15,18,19). Others have used interactive manual segmentation (20,21) or a combination of manual and probabilistic techniques (22,23). The drawbacks of manual segmentation are the amount of time required for the analysis and the introduction of subjectivity. Still others have relied on techniques that optimize contrast between only two tissue classes during data acquisition (24,25). Such a "binary" approach to segmentation is useful if only one tissue is of interest,

Figure 7. Example of segmentation-derived 3D reconstruction shows reconstructed skin from posterior oblique view and simulated craniotomy. The opening in the skin surface allows visualization of the central sulcus. Part of the gray matter (which is white and gray) was removed on the pre- and postcentral gyri to emphasize the white matter (which is yellow).

Table 5
Application of Computer Methods to a Small, Well-Defined Group of Healthy Volunteers

                     Volume (mL)
Tissue Class         Mean        SD
ICC                  1,562.10    104.95
Brain                1,440.81    214.01
White matter         681.59      112.33
Gray matter          759.22      101.68
Subarachnoid CSF     104.48      29.21
Ventricles           16.81       4.4
Total CSF            121.29      34.66

Note.-The 15 healthy, right-handed male volunteers were prospectively chosen.

as in CSF volume determination (24); however, it falls short in cases in which multiple tissue classes are of interest, for instance, when it is useful to differentiate among gray matter, white matter, and CSF.

Another problem that limits image processing techniques is separation of the ICC from extraneous tissue outside the ICC. This is because many soft tissues outside the ICC have signal intensity properties similar to those of tissues inside the ICC (eg, the contents of the eyeball have the same signal intensity properties as CSF). Some groups have used data only from the upper parts of the skull, where there are no direct connections with soft-tissue bridges (23). Others avoided this problem because they used manual correction in outlining the brain (13,15). We have generated a full 3D description of the ICC as an individual

step using automated techniques. Anatomically speaking, the boundary of the ICC is marked by the inner table of the skull bone (which has low signal intensity on all MR images) on one side and by CSF and brain tissue (which have intermediate, low. or high signal intensity, depending on the echo used) on the other side. Since there are only small openings in the boundary of the ICC (eg, the different foramina), we have developed a procedure that cuts through bridges that connect soft tissues inside and outside the ICC. Our approach offers the following: (a)The data are compatible with routine clinical evaluations because the images contain the full contrast range of clinical images; ( b )the imager and the data acquisition protocols provide spatially homogeneous data sets; (c) whole-brain data sets are acquired; (d ) multiple tissues can be extracted from a single acquisition -the method is not a binary segmentation (eg, CSF vs everything else); and ( e )except for the supervision required to select the initial pixels, the method is essentially automated, thus having the potential to be fast (relative to techniques based on a larger proportion of interactive work) and to have greater reproducibility and less variability. Although there are still instances when manually guided ROIs are necessary (eg, in evaluating the hippocampus and parahippocampal @us [ 2 5 ) ) ,the automated techniques offer a clear advantage. Moreover, this system is under full control of the operator because the operator can perform the training step iteratively until a satisfactory result is obtained. However, while the operator decides which sample points are picked for the different tissue labels, computerized tools are used to extract signal intensities derived from those sample points Volume2

Number6

JMRl

627

and to define classification rules based on the statistics. Using the techniques described herein, we have automated the identification of different anatomic structures in MR images of the brain. On the basis of our experience and the fact that much more powerful computers will become available and that the algorithms will be improved, we believe that application of our technique to generate 3D reconstructions and volume determinations of brain structures will become part of the routine evaluation of MR imaging data sets of the head. These new segmentation procedures can be used in two different ways: ( a )to determine the volumes of structures identified and ( b )to generate 3D reconstructions of identified structures. These two capabilities will likely increase our ability to more reliably and accurately diagnose such disorders as Alzheimer disease, multiple sclerosis, brain atrophy, and hydrocephalus. Follow-up studies in patients with diseases such a s tumor, brain edema, and multiple sclerosis will allow a more precise determination of the progression of disease on the basis of quantitative volumetric measures. in addition to the qualitative analysis by the radiologist. The routine evaluation of the brain in 3D representations also affords access to new methods of diagnosis of conditions such a s brain atrophy and hydrocephalus ( 7 )and for surgical planning (26.27).The latter would include, but not be limited to, identification of the central sulcus for the planning of neurosurgical procedures, the localization of tumor relative to the brain surface for the planning of optimum entry for tumor excision, and/or interactive identification of blood vessels to assist in selecting the optimum entry for neurosurgery (Fig 7).
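As an illustration only, the training step described above (operator-picked sample points, statistics extracted from their signal intensities, and classification rules derived from those statistics) followed by a volume determination might be sketched as below. This is a minimal sketch under stated assumptions and not the implementation used in this study: the Gaussian model per tissue class, the function names, the two-echo intensity vectors, and the voxel volume are all hypothetical choices made for the example, and the filtering and connectivity steps of the actual procedure are omitted.

```python
import numpy as np

def train_class_statistics(samples_by_class):
    """Summarize operator-picked sample intensities for each tissue label.

    samples_by_class maps a label to an array of shape (n_samples, n_echoes).
    Assumption: each class is modeled as a multivariate Gaussian.
    """
    stats = {}
    for label, samples in samples_by_class.items():
        samples = np.asarray(samples, dtype=float)
        mean = samples.mean(axis=0)
        cov = np.atleast_2d(np.cov(samples, rowvar=False))
        _, logdet = np.linalg.slogdet(cov)
        stats[label] = (mean, np.linalg.inv(cov), logdet)
    return stats

def classify_voxels(intensities, stats):
    """Assign every voxel to the class with the highest Gaussian log-likelihood.

    intensities has shape (n_voxels, n_echoes); returns an array of labels.
    """
    intensities = np.asarray(intensities, dtype=float)
    labels = list(stats)
    scores = np.empty((len(labels), intensities.shape[0]))
    for i, label in enumerate(labels):
        mean, inv_cov, logdet = stats[label]
        diff = intensities - mean
        # Squared Mahalanobis distance of each voxel to the class mean.
        mahal = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores[i] = -0.5 * (mahal + logdet)
    return np.array(labels)[scores.argmax(axis=0)]

def class_volumes_ml(voxel_labels, voxel_volume_mm3):
    """Volume per class in mL: voxel count times voxel volume (1 mL = 1,000 mm^3)."""
    names, counts = np.unique(voxel_labels, return_counts=True)
    return {str(n): c * voxel_volume_mm3 / 1000.0 for n, c in zip(names, counts)}
```

In such a sketch, repeating the training step simply means calling `train_class_statistics` again with a revised set of sample points until the resulting label map is judged satisfactory by the operator.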

CONCLUSIONS

Our main findings can be summarized as follows:

1. We were able to determine the ICC, brain (gray matter and white matter), and CSF volumes by using new automated MR image processing procedures. We were also able to create 3D reconstructions of these tissue components, which greatly enhances the appreciation of complex anatomy. The validity of these measurements was not the focus of this study; rather, their reliability was the focus. Since no standard of reference exists for determining the validity of MR volumetric measurements, we compared volumetric data from computed segmentation with postmortem volumes reported in the literature and found that our results were consistent with those data (25). We have also reported volumetric data calculated from head phantom images, which also showed consistent results (11). Thus, while our focus in the present study was to assess the reproducibility/reliability of our measurements, we also have indirect assessments of the validity of these measurements.

2. There was less variability among the five trained raters when they used the automated procedures than among the same raters using manual outlining for any given structure.

3. The variability in measurements changed as a function of the complexity of the structure being assessed. Simpler structures such as the ICC showed less measurement variability among the five raters than did more complex structures such as the subarachnoid CSF. This was true for both manual and automated techniques, although the variability was less for the automated measurements.

4. Automated segmentation of the whole brain by three raters showed high reliability for the ICC, brain, and CSF, with measurement variability a function of structural complexity.

5. A longitudinal analysis of the same subjects over several weeks showed that the volume of the ICC was stable (SD, 1.2%), as was brain volume (SD, 6.3%).

Acknowledgments: The authors gratefully acknowledge the technical and administrative support provided by Diane Doolin, BS, Marianna Jakab, MSEE, Adam Shostack, BS, Andre Robatino, MS, Brian Chiango, RT, and Maureen Ainslie, RT.

References

1. Cline HE, Lorensen WE, Kikinis R, Jolesz FA. Three-dimensional segmentation of MR images of the head using probability and connectivity. J Comput Assist Tomogr 1990; 14:1037-1045.
2. Jolesz FA, Schwartz RB, LeClerq GT, et al. Half Fourier spin echo imaging in routine clinical brain and cervical spine protocols (abstr). Magn Reson Imaging 1990; 8(suppl 1):62.
3. Gerig G, Kübler O, Kikinis R, Jolesz FA. Nonlinear anisotropic filtering of MRI data. IEEE Trans Med Imaging 1992; 11:221-232.
4. Shenton ME, Kikinis R, McCarley RW, et al. Application of automated MRI volumetric measurement techniques to the ventricular system in schizophrenics and normal controls. Schizophr Res 1991; 5:103-113.
5. Feinberg DA, Hale JD, Watts JC, et al. Halving MR imaging time by conjugation: demonstration at 3.5 kG. Radiology 1986; 161:527-531.
6. Gerig G, Kuoni W, Kikinis R, Kübler O. Medical imaging and computer vision: an integrated approach for diagnosis and planning. Presented at the DAGM Symposium on Computer Vision, Hamburg, Germany, October 2-4, 1989.
7. Kikinis R, Jolesz FA, Gerig G, et al. 3D morphometric and morphologic information derived from clinical brain MR images: NATO advanced workshop in Travemünde, June 1990. In: Höhne KH, Fuchs H, Pizer SM, eds. 3D imaging in medicine: algorithms, systems, applications. NATO ASI series F: computer systems sciences. Vol 60. Berlin: Springer-Verlag, 1990; 441-454.
8. Perona P, Malik J. Scale space and edge detection using anisotropic diffusion. In: Proceedings of IEEE workshop on computer vision. Miami, Fla: IEEE, 1987; 6-22.
9. Lorensen WE, Cline HE. Marching cubes: a high resolution 3D surface reconstruction algorithm. ACM Comput Graphics 1987; 21:163-169.
10. Cline HE, Lorensen WE, Ludke S, Crawford CR, Teeter BC. Two algorithms for the three-dimensional reconstruction of tomograms. Med Phys 1988; 15:320-327.
11. Cline HE, Lorensen WE, Souza SP, et al. 3D surface rendered MR images of the brain and its vasculature. J Comput Assist Tomogr 1991; 15:344-351.
12. Kohn MI, Tanna NK, Herman GT, et al. Analysis of brain and cerebrospinal fluid volumes with MR imaging. I. Methods, reliability, and validation. Radiology 1991; 178:115-122.
13. Filipek PA, Kennedy DN, Caviness VS, et al. Magnetic resonance imaging-based brain morphometry: development and application to normal subjects. Ann Neurol 1989; 25:61-67.
14. Vannier MW, Butterfield RL, Jordan D, et al. Multispectral analysis of magnetic resonance images. Radiology 1985; 154:221-224.
15. Levin DN, Pelizzari CA, Chen GTY, et al. Retrospective geometric correlation of MR, CT, and PET images. Radiology 1988; 169:817-823.
16. Udupa JK, Srihari SN, Herman GT. Pattern analysis and machine intelligence. IEEE Trans 1982; 4:41-50.
17. Höhne KH, Bomans M, Pommert A, et al. Rendering tomographic volume data: adequacy of methods of different modalities and organs. In: Höhne KH, Fuchs H, Pizer SM, eds. 3D imaging in medicine: algorithms, systems, applications. NATO ASI series F: computer systems sciences. Vol 60. Berlin: Springer-Verlag, 1990; 197-215.
18. Jack CR Jr, Gehring DG, Sharbrough FW, et al. Temporal lobe volume measurement from MR images: accuracy and left-right asymmetry in normal persons. J Comput Assist Tomogr 1988; 12:21-29.
19. Jack CR Jr. Brain and cerebrospinal fluid volume: measurement with MR imaging. Radiology 1991; 178:22-24.
20. Press GA, Amaral DG, Squire LR. Hippocampal abnormalities in amnesic patients revealed by high-resolution magnetic resonance imaging. Nature 1989; 341:54-57.
21. Squire LR, Amaral DG, Press GA. Magnetic resonance imaging of the hippocampal formation and mammillary nuclei distinguish medial temporal lobe and diencephalic amnesia. J Neurosci 1990; 10:3106-3117.
22. Schroth G, Naegele T, Klose U, Mann K, Petersen D. Reversible brain shrinkage in abstinent alcoholics, measured by MRI. Neuroradiology 1988; 30:385-392.
23. Rusinek H, de Leon MJ, George AE, et al. Alzheimer disease: measuring loss of cerebral grey matter with MR imaging. Radiology 1991; 178:109-114.
24. Condon B, Wyper D, Grant R, Patterson J, Hadley D, Teasdale G. Use of magnetic resonance imaging to measure intracranial cerebrospinal fluid volume. Lancet 1986; 1:1355-1357.
25. Shenton ME, Kikinis R, Jolesz FA, et al. Left temporal lobe abnormalities in schizophrenia and thought disorder. N Engl J Med 1992; 327:604-612.
26. Kikinis R, Jolesz FA, Lorensen WE, Cline HE, Stieg PE, Black PML. 3D reconstruction of skull base tumors from MRI data for neurosurgical planning (abstr). In: Book of abstracts: Society of Magnetic Resonance in Medicine 1991. Berkeley, Calif: Society of Magnetic Resonance in Medicine, 1991; 752.
27. Kikinis R, Altobelli DE, Cline HE, Lorensen WE, Mulliken J, Jolesz FA. Planning and simulation of cranio-maxillofacial surgery using 3-dimensional reconstructions. Presented at the XII International Congress of Head and Neck Radiology, Zurich, Switzerland, October 1991.

