Practical Radiation Oncology (2013) xx, xxx–xxx

www.practicalradonc.org

Original Report

Evaluation of threshold and gradient based 18F-fluoro-deoxy-2-glucose hybrid positron emission tomographic image segmentation methods for liver tumor delineation

Cem Altunbas PhD a,⁎, Christopher Howells PhD a, Michelle Proper MD b, Krishna Reddy MD, PhD a, Gregory Gan MD, PhD a, Peter DeWitt MS c, Brian Kavanagh MD a, Tracey Schefter MD a, Moyed Miften PhD a

a Department of Radiation Oncology, University of Colorado, School of Medicine, Aurora, Colorado
b Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
c Department of Biostatistics and Informatics, University of Colorado, School of Public Health, Aurora, Colorado

Received 21 May 2013; revised 15 July 2013; accepted 5 August 2013

Abstract

Purpose: Image segmentation methods were studied to delineate liver lesions in 18F-fluoro-2-deoxy-glucose positron emission tomographic (FDG-PET) images. The goal of this study was to identify a clinically practical, semiautomated FDG-PET avid volume segmentation method to improve the accuracy of liver tumor contouring for treatment planning in stereotactic body radiation therapy (SBRT).

Methods and materials: Pretreatment PET-CT image sets for 26 patients who received SBRT to 28 liver lesions were delineated using the following 3 methods: (1) percent threshold with respect to background corrected maximum standard uptake values (SUV; threshold values varied from 10% to 50% in 10% increments); (2) threshold 3 standard deviations above mean background SUV (3σ); and (3) a gradient-based method that detects the edge of the FDG-PET avid lesion (edge). For each lesion, semiautomatically generated contours were evaluated with respect to reference contours manually drawn by 3 radiation oncologists. Two similarity metrics, the Dice coefficient and the mean minimal distance (MMD), were employed to assess the volumetric overlap and the mean Euclidean distance between semiautomatically generated and observer-drawn contours.

Results: Mean Dice and MMD values for the 10%, 20%, and 30% threshold, 3σ, and edge methods varied from 0.69 to 0.73 and from 3.44 mm to 3.94 mm, respectively (ideal Dice and MMD values are 1 and 0 mm, respectively). A statistically significant difference was not observed among the 10%, 20%, and 30% threshold, 3σ, and edge methods, whereas the 40% and 50% threshold methods had inferior Dice and MMD values.

Conclusions: Three PET segmentation methods were identified as potential tools to accelerate liver lesion delineation. The edge method appears to be the most practical for clinical implementation as it does not require calculation of SUV statistics. However, the performance of all segmentation methods showed large lesion-to-lesion fluctuations. Therefore, such methods may be suitable for generating initial estimates of FDG-PET avid volumes rather than being surrogates for manual volume delineation.

Conflicts of interest: None.
⁎ Corresponding author. University of Colorado School of Medicine, Department of Radiation Oncology, Anschutz Cancer Pavilion, MS F706, 1665 Aurora Ct, PO Box 6510, Aurora, CO 80045. E-mail address: [email protected] (C. Altunbas).
1879-8500/$ – see front matter © 2013 American Society for Radiation Oncology. Published by Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.prro.2013.08.002

Introduction

Utilization of 18F-fluoro-2-deoxy-glucose positron emission tomographic (FDG-PET) images in radiation therapy treatment planning has been a topic of interest because tumor metabolic information can be incorporated directly into determination of the gross tumor volume (GTV). Because of the quantitative nature of PET images, the standardized uptake value (SUV) of FDG is often proposed as a basis for segmenting the tumor volume. However, segmentation of tumor volumes in PET imaging remains a controversial topic 1 and a robust method still has to be developed for correlating tumor boundaries with the FDG-PET avid volume in PET images.

To address this issue, a range of tumor segmentation methods have been reported in the literature, among which SUV threshold based algorithms are the most commonly investigated. Threshold levels are typically determined in phantom studies by correlating the boundaries of the FDG-PET avid region in PET images with the known dimensions of PET avid objects in the phantom. For tumor segmentation in lung and head and neck, various groups have utilized an absolute SUV threshold, 2 and a percent SUV threshold level with respect to maximum SUV (SUVmax). 3,4 Variations of threshold based algorithms have been developed to account for background SUV variations 5-7 or to determine the threshold as a function of mean SUV in the target. 8 Besides thresholding, SUV gradient based edge detection methods were developed to determine the FDG-PET avid lesion boundary. 9-13 Using a hybrid approach, volumes generated by multiple segmentation methods may be combined to obtain a “consensus” volume to improve the accuracy and the robustness of FDG-PET avid volume segmentation. 14

FDG-PET based lesion delineation also plays an important role in radiation treatment of liver cancer. Liver metastases have been treated with stereotactic body radiation therapy (SBRT), 15,16 and PET-CT imaging has been routinely used to supplement contrast-enhanced planning CT images to better identify the GTV location and boundary. Therefore, in this work, we investigated the feasibility of segmenting liver lesions on PET images and quantitatively estimating the FDG-PET avid volume to guide GTV delineation in liver SBRT. We selected 3 PET image segmentation methods that have been previously studied for the segmentation of lung tumors, and we evaluated their performance in the context of liver tumor segmentation. Our main goal was to assess the accuracy of segmentation methods with respect to manually delineated FDG-PET avid volumes in patient images; ie, semiautomatically generated volume(s) should be as similar as possible to the manually delineated volume(s) drawn by experienced radiation oncologists.

Methods and materials

We evaluated 28 FDG-PET avid lesions in 26 patients with metastatic liver tumors. All patients were treated with SBRT between 2004 and 2011, and diagnostic PET/CT scans were acquired prior to radiation treatment. Patients were instructed not to exercise for 24 hours before imaging and to fast for 4 hours prior to FDG injection. Glucose levels were checked to confirm values below 150 mg/dL in nondiabetic patients and below 200 mg/dL in diabetic patients. Approximately 18 mCi (range, 12-20 mCi) of radionuclide was injected about 1 hour prior to PET acquisition. All PET images were acquired on the same scanner (GE Discovery ST, GE Healthcare, Fairfield, CT). PET images were reconstructed with the ordered subsets expectation maximization algorithm, 17 and were corrected for attenuation. Patients were instructed to hold their breath during the CT acquisition stage. The SUV was calculated as decay corrected activity per unit mass, and it was normalized with respect to total activity injected per unit body weight 1:

$$\mathrm{SUV} = \frac{\text{Mean activity concentration (MBq/g)}}{\text{Injected activity (MBq)} / \text{Body weight (g)}} \times \frac{1}{\text{decay factor}}$$
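As an illustration of the SUV normalization above, the following minimal Python sketch evaluates the formula for a single region. It is not the scanner's or the authors' implementation. The function names are hypothetical, and it assumes that the "decay factor" is the fraction of the injected 18F activity remaining at scan time, which the text does not define explicitly.

```python
F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def f18_decay_factor(minutes_since_injection):
    """Fraction of injected F-18 activity remaining after the given delay.

    Assumption: this is what the formula's "decay factor" denotes.
    """
    return 2.0 ** (-minutes_since_injection / F18_HALF_LIFE_MIN)


def suv(mean_activity_mbq_per_g, injected_activity_mbq, body_weight_g,
        decay_factor=1.0):
    """Body-weight SUV: tissue concentration divided by injected activity
    per gram of body weight, multiplied by 1 / decay factor."""
    if injected_activity_mbq <= 0 or body_weight_g <= 0 or decay_factor <= 0:
        raise ValueError("activity, weight, and decay factor must be positive")
    injected_per_gram = injected_activity_mbq / body_weight_g
    return mean_activity_mbq_per_g / injected_per_gram / decay_factor


if __name__ == "__main__":
    # Hypothetical patient: 666 MBq (18 mCi) injected, 75 kg, scanned 60 min later.
    factor = f18_decay_factor(60.0)
    print(round(suv(0.0134, 666.0, 75_000.0, factor), 2))
```

Keeping the decay factor as an explicit argument makes it easy to substitute whatever correction the scanner software actually applies.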

Three radiation oncologists (observers) were instructed to manually contour the 28 FDG-PET avid lesions using an image contouring workstation (MIM 6.0, MIM Software Inc, Cleveland, OH). Only PET-CT image sets were made available to them during contouring sessions, and they were allowed to adjust window and level settings. Observers were instructed to contour each FDG-PET avid lesion using the information available in the PET and CT images.

Three semiautomated contouring methods were also used to contour all 28 lesions. We used the following 2 criteria to select the 3 segmentation methods for this study: (1) the method should be practical enough to be implemented in a clinical setting and should require minimal manual intervention to segment the lesion; and (2) the method should take into account the background SUV variations in normal liver tissue. Based on these criteria, we selected the following 3 methods: (1) percentage thresholding, in which the threshold level was determined with respect to background corrected SUVmax and was varied from 10% to 50% in 10% increments 6; (2) thresholding based on SUV statistics in normal liver (3σ method), 18 where the threshold level was set at the mean SUV plus 3 standard deviations of SUV in normal liver; and (3) SUV gradient based edge detection (edge method). Each semiautomatically segmented FDG-PET avid volume was compared with the observer-delineated reference volumes using volumetric and distance-wise similarity metrics.

Threshold based methods required computation of the mean background SUV, SUVbkgd, in normal liver, and its calculation was based on PET response criteria in solid tumors (PERCIST) 18; SUVbkgd was determined in each PET image by averaging SUV in a 3-cm diameter spherical region of interest (ROI) placed in the uninvolved section of the right hepatic lobe.

The first semiautomated contouring method was based on percentages of the maximum lesion SUV (SUVmax) with respect to the mean background SUV; ie,

$$\mathrm{SUV}_{\%\mathrm{thresh}} = \%\mathrm{thresh} \times \left(\mathrm{SUV}_{\max} - \mathrm{SUV}_{\mathrm{bkgd}}\right) + \mathrm{SUV}_{\mathrm{bkgd}}.$$

Here %thresh ∈ {10%, 20%, 30%, 40%, 50%}, and 5 avid volumes were generated corresponding to these thresholds. The second semiautomated contouring method (3σ) instead used a threshold set 3 standard deviations of the background SUV, σbkgd, above the mean background SUV; ie,

$$\mathrm{SUV}_{3\sigma} = \mathrm{SUV}_{\mathrm{bkgd}} + 3\sigma_{\mathrm{bkgd}}.$$

For each of these 2 methods, SUV thresholding was performed in a sufficiently large ROI encompassing the lesion. The exact dimensions of the ROI did not affect the segmentation of the lesion.

The third method (edge) utilized the edge detection tool provided in the contouring software. 13 It uses a gradient-based edge detection algorithm and requires manual selection of an ROI; to initiate segmentation, the user indicates the search area using a marking tool consisting of orthogonal rays in 3 dimensions, which guides the algorithm to determine the edge of the FDG-PET avid volume.
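The two threshold rules described above (percent threshold and 3σ) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workstation's implementation. The function names and the dictionary-based ROI are hypothetical, the sample standard deviation is used for σbkgd, and voxels at or above the threshold are kept; the text specifies neither choice. The gradient-based edge method is proprietary to the contouring software and is not sketched here.

```python
from statistics import mean, stdev


def background_stats(background_suvs):
    """Mean and sample SD of SUVs sampled in the normal-liver ROI."""
    return mean(background_suvs), stdev(background_suvs)


def percent_threshold(suv_max, suv_bkgd, fraction):
    """fraction * (SUVmax - SUVbkgd) + SUVbkgd, with fraction in 0.10 to 0.50."""
    return fraction * (suv_max - suv_bkgd) + suv_bkgd


def three_sigma_threshold(suv_bkgd, sigma_bkgd):
    """Mean background SUV plus 3 standard deviations of background SUV."""
    return suv_bkgd + 3.0 * sigma_bkgd


def segment(roi_suvs, threshold):
    """Return the voxel indices whose SUV is at or above the threshold.

    roi_suvs maps a voxel index tuple to its SUV inside the lesion ROI.
    """
    return {voxel for voxel, value in roi_suvs.items() if value >= threshold}


if __name__ == "__main__":
    # Hypothetical SUV samples, for illustration only.
    bkgd_mean, bkgd_sd = background_stats([2.0, 2.2, 1.8, 2.1, 1.9])
    roi = {(0, 0, 0): 1.9, (1, 0, 0): 2.6, (2, 0, 0): 5.0, (3, 0, 0): 8.0}
    suv_max = max(roi.values())
    print(sorted(segment(roi, percent_threshold(suv_max, bkgd_mean, 0.30))))
    print(sorted(segment(roi, three_sigma_threshold(bkgd_mean, bkgd_sd))))
```

Because both rules reduce to a single scalar threshold, the same `segment` helper serves either method; only the way the threshold is derived differs.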
All semiautomated segmentations were performed in the contouring workstation by a physicist experienced in using the segmentation tools.

Two 3-dimensional similarity metrics were employed to compare semiautomatically generated and manually delineated volumes. The first one is the Dice similarity metric, 19 which determines the fractional overlap of 2 delineated volumes, V1 and V2,

$$D = \frac{2\,|V_1 \cap V_2|}{|V_1| + |V_2|},$$

where the Dice value for perfect volumetric overlap would be 1. We also evaluated distance-wise discrepancies between the boundaries of 2 volumes using the mean minimal distance (MMD) metric, which was inspired by the “Hausdorff” distance metric. 20 The MMD represents the average of the minimum Euclidean distances between the points that constitute 2 contour sets. To calculate MMD, let {si} be the set of points with coordinates (xi, yi, zi) representing contour Si. Then, the 2-norm distance between 2 points, s1 ∈ S1 and s2 ∈ S2, is

$$d(s_1, s_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}.$$

The MMD is then defined as the mean of all minimum distances between the points of contour S1 and contour S2;

$$\mathrm{MMD} = \operatorname{mean}\left(\left\{\min_{s_1 \in S_1} d(s_1, s_2)\right\} \cup \left\{\min_{s_2 \in S_2} d(s_1, s_2)\right\}\right),$$

where the 2 sets contain the minimum distances from the points of one contour to the other contour, taken in both directions (from S1 to S2 and from S2 to S1). If 2 contours are identical, as in an ideal case, the MMD value would be 0 mm. Minimum distances were calculated for all points that constitute contour pairs, and to reduce discretization errors all contours in PET images were upsampled to a uniform spacing of 0.9 mm resolution.

Figure 1 shows examples of the computation of the Dice and MMD metrics, illustrating their usefulness as well as their differences. Each of the 3 images contains 1 observer-drawn contour (green) and 1 semiautomatically generated contour (black). In Fig 1A, the contour pair has a Dice metric of 0.59, which is considered a poor value, but a relatively good MMD of 3.2 mm. In Fig 1B, the contour pair has a relatively good Dice coefficient of 0.73 and a poor MMD of 5.6 mm. Both contours have relatively large volumetric overlap, but their shapes exhibit large variations, as indicated by the MMD metric. Figure 1C shows an example of a contour pair with very good volumetric overlap and minimal differences in size and shape; the Dice coefficient is 0.9 and the MMD is 1.9 mm.

All semiautomatically segmented volumes were generated in the MIM contouring workstation. To reduce discretization errors, PET images and contours were upsampled to a 0.889 × 0.889 mm² pixel size in the transverse plane and to a 3 mm slice thickness, and the calculation of Dice and MMD values was performed using in-house developed MATLAB scripts (MathWorks, Natick, MA). For every lesion, each semiautomatically segmented volume was evaluated with respect to the 3 manually delineated volumes in pairwise combinations. Dice and MMD values were averaged over the 3 contour pairs, and the result is referred to as the average observer (AO) value. The mean Dice or MMD for a given segmentation method was computed by averaging the AO values over all lesions.
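The Dice, MMD, and average observer (AO) computations described above can be sketched in Python as follows. This is not the authors' MATLAB code. The function names are hypothetical, volumes are assumed to be sets of voxel index tuples, contours are assumed to be collections of (x, y, z) points in millimetres, and the 2 sets of minimum distances are pooled before averaging. The brute-force nearest-point search is quadratic in the number of points and is meant only to make the definition concrete.

```python
import math


def dice(v1, v2):
    """Dice overlap of two volumes given as sets of voxel index tuples."""
    if not v1 and not v2:
        raise ValueError("both volumes are empty")
    return 2.0 * len(v1 & v2) / (len(v1) + len(v2))


def _min_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def mmd(s1, s2):
    """Mean minimal distance: mean of the pooled minimum distances
    from S1 to S2 and from S2 to S1 (points given in mm)."""
    if not s1 or not s2:
        raise ValueError("both contours must contain at least one point")
    distances = _min_distances(s1, s2) + _min_distances(s2, s1)
    return sum(distances) / len(distances)


def average_observer(metric, auto, observers):
    """AO value: the metric averaged over all observer references."""
    return sum(metric(auto, obs) for obs in observers) / len(observers)


if __name__ == "__main__":
    # Two 2 x 2 squares offset by one unit along x (toy data, not patient data).
    a = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)}
    b = {(1, 0, 0), (2, 0, 0), (1, 1, 0), (2, 1, 0)}
    print(dice(a, b), mmd(a, b), average_observer(dice, a, [a, b]))
```

In practice, voxel indices would first be scaled by the voxel spacing so that MMD is expressed in millimetres, and a spatial index such as a k-d tree would replace the brute-force search for contours with many points.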
To determine statistically significant differences among segmentation methods, the Tukey honest significant difference (HSD) test 21 was utilized. For the Tukey HSD test, each segmentation method was compared with every other segmentation method in pairwise combinations to determine 95% pairwise confidence intervals for differences in mean Dice or MMD values. The results of the Tukey HSD test allow one to stratify semiautomated segmentation methods into statistically similar groups.

In addition to comparisons of semiautomatically segmented volumes with respect to observer-drawn contours, interobserver consistency in lesion delineation was also evaluated. For each lesion, the 3 observer-drawn contours were compared in pairwise combinations, and Dice and MMD metrics were calculated.
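For reference, the textbook form of a Tukey HSD confidence interval for a balanced one-way layout is given below. This is the standard formula rather than a statement of the exact model the authors fitted, which the text does not specify. The symbols $k$ (number of methods), $n$ (observations per method), $N = kn$, and $\mathrm{MSE}$ (within-group mean square from the one-way analysis of variance) are introduced here for illustration.

$$\left(\bar{y}_i - \bar{y}_j\right) \pm q_{\alpha;\,k,\,N-k}\,\sqrt{\frac{\mathrm{MSE}}{n}}$$

Here $\bar{y}_i$ and $\bar{y}_j$ are the mean Dice (or MMD) values of methods $i$ and $j$, and $q_{\alpha;\,k,\,N-k}$ is the upper-$\alpha$ critical value of the studentized range distribution. A pair of methods differs significantly at level $\alpha$ when this interval excludes zero. If the 7 methods and 28 lesions of this study were treated as such a layout, then $k = 7$, $n = 28$, and $N - k = 189$.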

Results

For all 28 lesions, the mean and standard deviation of liver background SUV, and SUVmax, were determined and are summarized in Table 1. The PET-avid volume for each lesion was calculated by averaging the volumes delineated by the 3 observers. Examples of manually drawn and semiautomatically generated contours are shown in Fig 2A and B, respectively. For each segmentation method and each lesion, Dice (Fig 3A) and MMD (Fig 3B) values were calculated with respect to AO. To improve the clarity of comparison, only the 30% threshold method is displayed from the range of percentage threshold methods we analyzed. The lesion numbers are arranged in order of ascending volume. In Fig 4A and B, mean Dice and MMD values and their standard deviations are displayed for each segmentation method with respect to each observer. The maximum interobserver variations in mean Dice and MMD values were 0.05 and 0.46 mm, respectively.

Means and standard deviations of Dice and MMD values are given in Table 2. The 10%, 20%, and 30% threshold, 3σ, and edge methods have very similar mean Dice values, in the range of 0.69 to 0.73. The 40% and 50% threshold methods have significantly lower mean Dice values of 0.62 and 0.50, respectively. For MMD, a similar trend is seen. Mean MMD values ranged from 3.4 mm to 3.7 mm for the 20% and 30% threshold, 3σ, and edge methods, whereas the 10%, 40%, and 50% threshold methods have mean MMDs in the range of 3.9 mm to 5.1 mm.



Figure 1 Comparing observer-drawn contour (green) and semiautomated method (black) in terms of Dice and MMD similarity metrics. (A) Contour pairs have a Dice coefficient of 0.59, which is considered a poor value, but a “good” mean minimal distance (MMD) of 3.2 mm; while in another patient, (B), contour pairs have a relatively good Dice coefficient of 0.73 but a poor MMD of 5.6 mm; while in still another patient, (C), contour pairs have a very good Dice coefficient of 0.90 and MMD of 1.9 mm. (For color version, see online at www.practicalradonc.org).


Table 1 Volume and standard uptake value (SUV) characteristics of 28 liver lesions used in the study; mean and standard deviation (SD) of background SUV in liver were calculated per PERCIST criteria

| Characteristics | Mean | SD | Range (minimum-maximum) |
| --- | --- | --- | --- |
| PET-avid volume (cc) | 32.9 | 38.2 | 2.1-133.5 |
| SUVmax | 7.63 | 2.97 | 3.1-14.6 |
| Mean background SUV in liver | 2.2 | 0.36 | 1.61-2.97 |
| SD of background SUV | 0.2 | 0.08 | 0.11-0.40 |

PET, positron emission tomography; PERCIST, PET response criteria in solid tumors.
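As a rough worked example of where the thresholds fall, the cohort means in Table 1 can be substituted into the percent threshold and 3σ definitions from the Methods. These means do not correspond to any single lesion, so the numbers below are illustrative only.

$$\begin{aligned}
\mathrm{SUV}_{10\%} &= 0.10 \times (7.63 - 2.2) + 2.2 = 2.743\\
\mathrm{SUV}_{20\%} &= 0.20 \times (7.63 - 2.2) + 2.2 = 3.286\\
\mathrm{SUV}_{30\%} &= 0.30 \times (7.63 - 2.2) + 2.2 = 3.829\\
\mathrm{SUV}_{40\%} &= 0.40 \times (7.63 - 2.2) + 2.2 = 4.372\\
\mathrm{SUV}_{50\%} &= 0.50 \times (7.63 - 2.2) + 2.2 = 4.915\\
\mathrm{SUV}_{3\sigma} &= 2.2 + 3 \times 0.2 = 2.8
\end{aligned}$$

For these average values, the 3σ threshold falls between the 10% and 20% thresholds.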

The Tukey HSD confidence intervals for the differences in mean Dice and MMD values among pairs of segmentation methods are shown in Figs 5 and 6, respectively. For any given pair, if the 95% confidence interval does not include zero, the Tukey HSD test indicates a statistically significant difference in means, and the interval is marked in blue. The differences in mean Dice values among the 10%, 20%, and 30% threshold, 3σ, and edge methods are not statistically significant, whereas the mean Dice values of the 40% and 50% threshold methods are significantly different (P < .05) from those of the rest of the methods (Fig 5). Differences in mean MMDs are less pronounced among segmentation methods. For the 10%, 20%, 30%, and 40% threshold, 3σ, and edge methods, the 95% confidence intervals for pairwise differences in mean MMD include zero (Fig 6). Only the mean MMD of the 50% threshold method is significantly different (P < .05) from those of the rest of the segmentation methods.

Similarity metrics for comparisons of observer-drawn contours are given in Table 3. The analysis was based on pairwise comparisons of the 3 observers' contours over 28 lesions. Mean Dice values are in the range of 0.77 to 0.81, and mean MMD values are between 2.22 mm and 2.64 mm. Dice values for individual pairs ranged from 0.38 to 0.95, and MMD values varied from 1.07 mm to 5.77 mm.

Discussion

With respect to observer-drawn contours, the 10%, 20%, and 30% threshold, 3σ, and edge methods (group A) have similar accuracy in FDG-PET avid volume segmentation. The 20% and 30% threshold methods tend to have higher mean Dice values, lower mean MMD values, and lower standard deviations, which make them favorable in terms of segmentation performance. From a user friendliness perspective, the edge method is favorable because volumes can be generated rather quickly, and calculation of SUV statistics is not needed. The 40% and 50% threshold methods (group B) have significantly inferior performance with respect to group A methods, and they are not recommended for segmentation of liver lesions. An acceptable range for Dice and MMD metrics was not established; however, a semiautomatically generated volume with a Dice value of 0.7 and an MMD value of 3.5 mm was deemed to be qualitatively acceptable in comparison with the manually drawn volume. Although the performance of group A methods was acceptable on average, their performance showed large lesion-to-lesion variations (see the standard deviation and the range in Table 2). Such inconsistent segmentation performance was responsible in part for the lack of statistically significant differences among group A methods, as indicated by the wide confidence intervals in the Tukey HSD tests (Figs 5 and 6). These results qualitatively indicate that group A methods may not be a surrogate for manual lesion delineation.

Figure 2 In (A), 3 observer-drawn contours are shown in an axial slice of a lesion. In (B), 3 semiautomatically generated contours are shown in the same slice, including those from the 20% threshold (blue), 3σ (black), and edge (yellow) methods. (For color version, see online at www.practicalradonc.org).

Figure 3 Lesion-specific Dice and mean minimal distance (MMD) values with respect to average observer (AO) are shown in (A) and (B), respectively, for the 30% threshold, 3σ, and edge methods. Each semiautomatically generated volume was compared with respect to 3 observer-drawn volumes, and similarity metric values were calculated and averaged. Lesion numbers (1 to 28) were ordered as a function of increasing lesion volume (ascending order).

Because observer-drawn contours were used as the reference to assess the performance of segmentation methods, variations in manual delineation from observer to observer may obscure the differences among group A methods. To evaluate potential observer bias in our results, we stratified the values of similarity metrics of segmentation methods based on specific observers (Fig 4). For a given segmentation method, observer-specific mean Dice and MMD values were comparable among the 3 observers, which implies that Dice and MMD values of segmentation methods were not grossly biased by a single observer's manually drawn reference volumes. However, this comparison is not a direct assessment of observer dependent variations in lesion delineation. To better evaluate interobserver variations, observer-drawn volumes were compared with respect to each other on a lesion by lesion basis. Dice and MMD values of observer-drawn volume pairs were on the order of 0.8 and 2.4 mm, respectively, indicating the existence of differences in observer-drawn volumes (Table 3). Dice and MMD

values of observer-drawn contour pairs (Table 3) showed less variation than those of semiautomatically segmented and observer-drawn contour pairs (Table 2). The range of similarity variations among observer-drawn contour pairs was not negligible; among 28 lesions, Dice and MMD values for observer-drawn contour pairs varied from 0.38 to 0.95, and from 1.07 mm to 5.77 mm, respectively. Such interobserver variations in manual volume delineation may be responsible in part for the lack of statistically significant differences in group A methods.

Figure 4 Mean and standard deviation of Dice (A) and mean minimal distance (MMD) (B) values (averaged over 28 lesions) for each segmentation method with respect to specific observers (observers 1, 2, and 3).

Another question is whether semiautomatically delineated FDG-PET avid volumes could be translated into a GTV. Our preliminary analysis suggests that an FDG-PET avid lesion by itself might not necessarily reflect the boundaries of the GTV in liver. The 4-dimensional CT scans and contrast-enhanced CT scans obtained at CT simulation are routinely used to account for organ motion and to delineate the tumor GTV for liver SBRT. In contrast, FDG-PET images are obtained without any consideration for organ motion. Therefore, because of issues such as image registration, organ motion, and patient position, overlaying the PET image data set on the CT data set obtained during simulation will be suboptimal at best. Hence, the relationship between GTV and FDG-PET avid volume may not be trivial, and determination of GTV largely depends on how image data are interpreted by the physician.

Given the challenges associated with determining the physical extent of tumors in PET images, we envision that the segmented FDG-PET avid volume would serve as an initial estimate of tumor burden rather than the GTV itself. The benefits of this are 2-fold. First, by generating an initial tumor volume estimate, the manual GTV delineation process may be accelerated; second, interobserver variations in GTV delineation may also be reduced. 22

Table 2 Dice and mean minimal distance (MMD) values of segmentation methods averaged over 28 lesions

| Method | Dice, mean ± SD | Dice, range (min-max) | MMD (mm), mean ± SD | MMD (mm), range (min-max) |
| --- | --- | --- | --- | --- |
| 10% | 0.70 ± 0.18 | 0.13-0.96 | 3.94 ± 2.50 | 1.15-14.99 |
| 20% | 0.73 ± 0.15 | 0.13-0.91 | 3.46 ± 1.75 | 1.51-10.74 |
| 30% | 0.70 ± 0.12 | 0.34-0.85 | 3.44 ± 1.41 | 1.52-7.18 |
| 40% | 0.62 ± 0.13 | 0.25-0.81 | 4.11 ± 1.97 | 1.55-10.27 |
| 50% | 0.50 ± 0.16 | 0.14-0.79 | 5.05 ± 2.51 | 1.77-12.98 |
| 3σ | 0.72 ± 0.15 | 0.16-0.96 | 3.64 ± 1.73 | 1.09-9.63 |
| Edge | 0.69 ± 0.13 | 0.34-0.88 | 3.67 ± 1.64 | 1.42-9.68 |

For every lesion, each semiautomatically segmented volume was paired with 3 observer-drawn volumes. Similarity metrics were calculated for each pair and were averaged. min-max, minimum to maximum; SD, standard deviation.

Table 3 Pair-wise similarity comparisons of observer-drawn volumes

| Pair | Dice, mean ± SD | Dice, range (min-max) | MMD (mm), mean ± SD | MMD (mm), range (min-max) |
| --- | --- | --- | --- | --- |
| O1 vs O2 | 0.77 ± 0.15 | 0.38-0.93 | 2.64 ± 1.10 | 1.38-5.54 |
| O1 vs O3 | 0.81 ± 0.08 | 0.66-0.94 | 2.43 ± 1.04 | 1.33-5.77 |
| O2 vs O3 | 0.81 ± 0.14 | 0.44-0.95 | 2.22 ± 0.87 | 1.07-4.76 |

Reported values are mean, standard deviation (SD), and range of Dice and mean minimal distance (MMD) values based on 28 lesions.

Figure 5 The Tukey honest significant difference test with 95% family-wise confidence intervals for pairs of segmentation methods. The plot shows the differences in mean Dice values. Blue indicates a true difference in means. (For color version, see online at www.practicalradonc.org).

Conclusions

Two SUV threshold-based and 1 edge detection-based image segmentation methods were identified to delineate liver tumors from FDG-PET images. These methods can generate volumes similar to manually delineated volumes; however, the agreement between the 2 volumes may exhibit large lesion-to-lesion variations. Semiautomated segmentation can be employed to generate an initial estimate of the FDG-PET avid volume, which may accelerate manual GTV delineation in radiation treatment planning for liver metastases and may potentially reduce interobserver variability. We recommend using the edge detection method because it does not require manual estimation of SUV statistics in normal liver and will be a more practical alternative to threshold based methods in a clinical setting.

Figure 6 The Tukey honest significant difference test with 95% family-wise confidence intervals for pairs of segmentation methods. The plot shows the differences in mean minimal distance (MMD) values. Blue indicates a true difference in means. (For color version, see online at www.practicalradonc.org).

References

1. Zaidi H, El Naqa I. PET-guided delineation of radiation therapy treatment volumes: a survey of image segmentation techniques. Eur J Nucl Med Mol Imaging. 2010;37:2165-2187.
2. Patz Jr EF, Lowe VJ, Hoffman JM, Paine SS, Harris LK, Goodman PC. Persistent or recurrent bronchogenic carcinoma: detection with PET and 2-[F-18]-2-deoxy-D-glucose. Radiology. 1994;191:379-382.
3. Erdi YE, Mawlawi O, Larson SM, et al. Segmentation of lung lesion volume by adaptive positron emission tomography image thresholding. Cancer. 1997;80(12 Suppl):2505-2509.
4. Paulino AC, Koshy M, Howell R, Schuster D, Davis LW. Comparison of CT- and FDG-PET-defined gross tumor volume in intensity-modulated radiotherapy for head-and-neck cancer. Int J Radiat Oncol Biol Phys. 2005;61:1385-1392.
5. Schaefer A, Kremp S, Hellwig D, Rübe C, Kirsch CM, Nestle U. A contrast-oriented algorithm for FDG-PET-based delineation of tumour volumes for the radiotherapy of lung cancer: derivation from phantom measurements and validation in patient data. Eur J Nucl Med Mol Imaging. 2008;35:1989-1999.
6. Drever L, Robinson DM, McEwan A, Roa W. A local contrast based approach to threshold segmentation for PET target volume delineation. Med Phys. 2006;33:1583-1594.
7. Daisne JF, Sibomana M, Bol A, Doumont T, Lonneux M, Grégoire V. Tri-dimensional automatic segmentation of PET volumes based on measured source-to-background ratios: influence of reconstruction algorithms. Radiother Oncol. 2003;69:247-250.
8. Black QC, Grills IS, Kestin LL, et al. Defining a radiotherapy target with positron emission tomography. Int J Radiat Oncol Biol Phys. 2004;60:1272-1282.
9. Drever LA, Roa W, McEwan A, Robinson D. Comparison of three image segmentation techniques for target volume delineation in positron emission tomography. J Appl Clin Med Phys. 2007;8:93-109.
10. Geets X, Lee JA, Bol A, Lonneux M, Grégoire V. A gradient-based method for segmenting FDG-PET images: methodology and validation. Eur J Nucl Med Mol Imaging. 2007;34:1427-1438.
11. Wanet M, Lee JA, Weynand B, et al. Gradient-based delineation of the primary GTV on FDG-PET in non-small cell lung cancer: a comparison with threshold-based approaches, CT and surgical specimens. Radiother Oncol. 2011;98:117-125.
12. Werner-Wasik M, Nelson AD, Choi W, et al. What is the best way to contour lung tumors on PET scans? Multiobserver validation of a gradient-based method using a NSCLC digital PET phantom. Int J Radiat Oncol Biol Phys. 2012;82:1164-1171.
13. Nelson AD, Werner-Wasik M, et al. PET tumor segmentation: multiobserver validation of a gradient-based method using a NSCLC PET phantom. Int J Radiat Oncol Biol Phys. 2009;75(3 Suppl):S627.
14. McGurk RJ, Bowsher J, Lee JA, Das SK. Combining multiple FDG-PET radiotherapy target segmentation methods to reduce the effect of variable performance of individual segmentation methods. Med Phys. 2013;40:042501.
15. Schefter TE, Kavanagh BD, Timmerman RD, Cardenes HR, Baron A, Gaspar LE. A phase I trial of stereotactic body radiation therapy (SBRT) for liver metastases. Int J Radiat Oncol Biol Phys. 2005;62:1371-1378.
16. Kavanagh BD, Schefter TE, Cardenes HR, et al. Interim analysis of a prospective phase I/II trial of SBRT for liver metastases. Acta Oncol. 2006;45:848-855.
17. Ross S, Stearns C, Sharp IR. White Paper. Waukesha, WI: GE Healthcare. 2010.
18. Wahl RL, Jacene H, Kasamon Y, Lodge MA. From RECIST to PERCIST: evolving considerations for PET response criteria in solid tumors. J Nucl Med. 2009;50(Suppl 1):122S-150S.
19. Zou KH, Warfield SK, Bharatha A, et al. Statistical validation of image segmentation quality based on a spatial overlap index. Acad Radiol. 2004;11:178-189.
20. Birkfellner W. Applied medical image processing: a basic course. Florence, KY: Taylor & Francis Group. 2010.
21. Rosner B. Fundamentals of biostatistics 6th ed. Pacific, CA: Thomson Brooks/Cole. 2006.
22. van Baardwijk A, Bosmans G, Boersma L, et al. PET-CT-based auto-contouring in non-small-cell lung cancer correlates with pathology and reduces interobserver variability in the delineation of the primary tumor and involved nodal volumes. Int J Radiat Oncol Biol Phys. 2007;68:771-778.
