Research article Received: 29 October 2014,

Revised: 18 March 2015,

Accepted: 23 April 2015,

Published online in Wiley Online Library: 26 May 2015

(wileyonlinelibrary.com) DOI: 10.1002/nbm.3329

Combined unsupervised–supervised classification of multiparametric PET/MRI data: application to prostate cancer

Sergios Gatidis(a), Markus Scharpf(b), Petros Martirosian(a), Ilja Bezrukov(c,d), Thomas Küstner(a), Jörg Hennenlotter(e), Stephan Kruck(e), Sascha Kaufmann(a), Christina Schraml(a), Christian la Fougère(f), Nina F. Schwenzer(a)* and Holger Schmidt(a)

Multiparametric medical imaging data can be large and are often complex. Machine learning algorithms can assist in image interpretation when reliable training data exist. In most cases, however, knowledge about ground truth (e.g. histology) and thus training data is limited, which makes application of machine learning algorithms difficult. The purpose of this study was to design and implement a learning algorithm for classification of multidimensional medical imaging data that is robust and accurate even with limited prior knowledge and that allows for generalization and application to unseen data. Local prostate cancer was chosen as a model for application and validation. Sixteen patients underwent combined simultaneous [11C]-choline positron emission tomography (PET)/MRI. The following imaging parameters were acquired: T2 signal intensities, apparent diffusion coefficients, parameters Ktrans and Kep from dynamic contrast-enhanced MRI, and PET standardized uptake values (SUVs). A spatially constrained fuzzy c-means algorithm (sFCM) was applied to the single datasets and the resulting labeled data were used for training of a support vector machine (SVM) classifier. Accuracy and false positive and false negative rates of the proposed algorithm were determined in comparison with manual tumor delineation. For five of the 16 patients, these rates were also determined in comparison with the histopathological standard of reference. The combined sFCM/SVM algorithm proposed in this study yielded reliable classification results that were consistent with the histopathological reference standard and comparable to those of manual tumor delineation. sFCM/SVM generally performed better than unsupervised sFCM alone. We observed an improvement in accuracy with increasing number of imaging parameters used for clustering and SVM training. In particular, including PET SUVs as an additional parameter markedly improved classification results. A variety of applications are conceivable, especially for imaging of tissues without easily available histopathological correlation. Copyright © 2015 John Wiley & Sons, Ltd.

Keywords: PET/MRI; multiparametric imaging; prostate cancer; machine learning

* Correspondence to: Nina F. Schwenzer, Department of Radiology, Diagnostic and Interventional Radiology, Eberhard Karls University Tübingen, Germany. E-mail: [email protected]
a S. Gatidis, P. Martirosian, T. Küstner, S. Kaufmann, C. Schraml, N. F. Schwenzer, H. Schmidt: Department of Radiology, Diagnostic and Interventional Radiology, Eberhard Karls University Tübingen, Germany
b M. Scharpf: Department of Pathology and Neuropathology, General Pathology, Eberhard Karls University Tübingen, Germany
c I. Bezrukov: Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
d I. Bezrukov: Department of Radiology, Preclinical Imaging and Radiopharmacy, Laboratory for Preclinical Imaging and Imaging Technology of the Werner Siemens Foundation, Eberhard Karls University Tübingen, Germany
e J. Hennenlotter, S. Kruck: Department of Urology, Eberhard Karls University Tübingen, Germany
f C. la Fougère: Department of Radiology, Nuclear Medicine, Eberhard Karls University Tübingen, Germany

Abbreviations used: PET, positron emission tomography; ADC, apparent diffusion coefficient; SUV, standardized uptake value; SVM, support vector machine; sFCM, spatially constrained fuzzy c-means algorithm; CT, computed tomography; DCE MRI, dynamic contrast-enhanced MRI; DWI, diffusion-weighted imaging; ROI, region of interest.

NMR Biomed. 2015; 28: 914–922

INTRODUCTION
With the introduction of positron emission tomography (PET)/computed tomography (CT) (1) and recently PET/MRI (2) hybrid imaging systems it has become feasible to combine different modalities obtaining morphological and metabolic data. The acquired spatially and temporally correlated multiparametric data provide additional information, which may result in higher diagnostic accuracy (3).


CLASSIFICATION OF MULTIPARAMETRIC PET/MRI IN PROSTATE CANCER

However, the complexity of the acquired data poses a challenge. It is often difficult to take into account all measured parameters for diagnostic decisions, as parameters may overlap or appear contradictory in purely visual analysis. Significant parts of available information may therefore remain unexploited. One way to overcome these challenges is the application of automated classification algorithms. The use of such tools allows for an objective and reproducible analysis of multidimensional datasets.

Classification algorithms can be broadly divided into supervised and unsupervised methods. In supervised classification, a training dataset with available labels for each data point is used to train a classifier. Labels can for example originate from voxel-wise histological correlation, where regions of suspected cancer and benign regions were determined. The trained classifier can then be used to assign labels to previously unseen data. Due to their reliable performance, supervised algorithms are widely used in many fields. For the analysis of medical imaging data, support vector machine (SVM) classifiers (4), but also artificial neural networks (5) and regression analysis (6) have been previously proposed.

Unsupervised classification, on the other hand, is performed without or with very limited prior knowledge of the data. Unsupervised classifiers analyze the structure of data to infer labels. Training datasets are not necessary. A major limitation is that the inferred labels do not necessarily represent any biological meaning. Numerous examples exist for the application of unsupervised methods in medical imaging, including fuzzy c-means clustering (7), expectation maximization for mixed distributions (8) or the ISODATA algorithm (9).

A well-characterized application of multiparametric imaging is imaging of prostate cancer.
Numerous studies have shown that addition of diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) MRI to conventional T2-weighted sequences can improve the accuracy of local prostate cancer detection. Despite the fact that the value of choline-PET in imaging of local prostate cancer is limited, due to low specificity especially in the central gland and limited spatial resolution, it was shown that the combination of multiparametric MRI and choline-PET can help to identify significant primary prostate cancer (10,11). As stated above, supervised methods depend on the availability of labeled training datasets. In medical imaging, labels often originate from spatially resolved histological examinations. This is feasible for the prostate after prostatectomy. For most organs, however, spatially resolved histology is not easily available. In these cases histological training data cannot be obtained, making the direct application of supervised learning algorithms difficult. The objective of this work was to design and implement a robust and accurate classification algorithm for classification of multidimensional medical imaging data with very limited available prior knowledge that allows for generalization and application on unseen data. Due to its suitability for spatially correlated histological validation, local prostate cancer was chosen as a model for the application of the algorithm.

Patients gave their written informed consent. The study was approved by the local ethics committee. Patient characteristics are summarized in Table 1.

PET/MRI study
PET/MRI was performed on a 3 T combined clinical scanner (Biograph mMR, Siemens, Erlangen, Germany) (12). PET data of the pelvis were acquired for 15 min, starting 65 ± 19 min after intravenous injection of 622 ± 45 MBq [11C]-choline, and were reconstructed using an iterative 3D ordered-subset expectation-maximization algorithm with three iterations, 21 subsets, and a 3 mm Gaussian filter. PET attenuation correction was performed using a T1-weighted 3D-encoded spoiled gradient-echo sequence with double echo for Dixon-based fat–water separation and subsequent attenuation-map estimation. The following MR measurements were performed: a transversal T2-weighted turbo-spin-echo sequence (T2) and a diffusion-weighted spin-echo echo-planar imaging sequence (DWI), as well as a DCE study (0.1 ml/kg body weight Gadovist, Bayer-Schering

Table 1. Patient characteristics

Age [years]  Weight [kg]  PSA [ng/ml]  Gleason score  Previous therapy
67           76.8         6.1          5              h/o TURP, h/o local radiation
79           90.5         19.0         8              no previous therapy
80           91.0         7.8          7              h/o hormonal therapy
64           90.0         62.0         7              no previous therapy
81           77.5         48.0         n/a            no previous therapy
61           91.7         4.3          n/a            no previous therapy
77           78.5         10.0         6              no previous therapy
74           79.9         93.0         8              no previous therapy
60           86.2         1.3          8              h/o TURP
70           71.1         8.5          6              no previous therapy
57           90.1         184.0        8              no previous therapy
70           75.0         10.5         9              no previous therapy
57           110.0        6.7          7              no previous therapy
62           76.7         16.9         7b             no previous therapy
73           94.6         7.5          8              no previous therapy
46           117          40           9              no previous therapy

MATERIALS AND METHODS Patients


wileyonlinelibrary.com/journal/nbm


Between October 2011 and June 2013, 16 patients (67 ± 10 years) with biopsy-proven prostate cancer were prospectively enrolled.

h/o, history of. TURP, trans-urethral resection of the prostate. n/a, not available.

S. GATIDIS ET AL.

Pharma, Leverkusen, Germany) using a T1-weighted 3D-encoded gradient-echo sequence. Sequence parameters are shown in Table 2. Parametric maps of the volume transfer constant (Ktrans) and the efflux rate constant (Kep) were calculated according to the standard Tofts model (13) using in-house software implemented in MATLAB (version R2012b, MathWorks, Natick, MA, USA). Apparent diffusion coefficients (ADCs) were derived from the acquired DWI b-value images (b = 50 and 800 s/mm2) by standard voxel-wise monoexponential fitting of the signal decay using the vendor-provided software.

Image registration and interpolation
PET images, T2-weighted images, and ADC, Ktrans and Kep maps were superposed according to their voxel positions within the PET/MRI coordinate system. Spatial resolution was matched by linear interpolation to 1 mm3. Fusion accuracy was improved manually in cases of obvious misalignment due to patient movement (Vinci, version 4.03, Max Planck Institute for Neurological Research, Cologne, Germany).

Prostate segmentation and unsupervised clustering
Prior to further analysis, the prostate was segmented manually on the T2-weighted images to restrict unsupervised clustering to voxels lying within prostate tissue. To this end, T2-weighted images were displayed and the prostate borders were manually delineated slice by slice by an experienced radiologist (S.G.) using an in-house software tool implemented in MATLAB (version R2012b, MathWorks, Natick, MA, USA). From the resulting 2D masks, a 3D mask of the prostate was assembled. The 3D datasets, in which each voxel carries a five-dimensional feature vector (T2 signal intensity, ADC, Ktrans, Kep and PET standardized uptake value (SUV)), were scaled to mean 0 and standard deviation 1. Datasets were clustered using a spatially constrained fuzzy c-means algorithm (sFCM), previously described in Reference 14.
Unsupervised algorithms are built to find voxels that have similar imaging properties and to separate them from voxels with different properties, thus dividing imaging data into different clusters. Such procedures only depend on voxel properties and do not need additional information such as already preclustered data (training data). The sFCM algorithm used in this study is an unsupervised clustering method that classifies data points according to their position in the feature space. It is based on the widely used fuzzy c-means algorithm (15), and additionally takes into account spatial information, yielding spatially homogeneous clusters. The sFCM algorithm was implemented in MATLAB using the following algorithm parameters as suggested in the original publication (14): number of clusters, two; fuzziness, 1.5. The spatial constraint was weighted equally to the fuzzy constraint (parameters p and q set to 1).
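The clustering step can be illustrated with a minimal sketch. It is not the MATLAB implementation used in this study, and it assumes one common formulation of spatially constrained fuzzy c-means, in which each voxel's fuzzy membership u is reweighted by a spatial function h (the summed membership of its neighbours) as u^p · h^q and then renormalized; whether this matches Reference 14 in every detail is an assumption. For brevity the voxels form a 1D strip rather than a 3D volume, and the function name `sfcm`, the neighbourhood `radius` and the deterministic initialization are hypothetical choices. The defaults mirror the parameters stated above (two clusters, fuzziness 1.5, p = q = 1).

```python
import math


def sfcm(features, n_clusters=2, m=1.5, p=1, q=1, radius=1, n_iter=50):
    """Spatially weighted fuzzy c-means on a 1D strip of voxels (illustrative).

    features[j] is the feature vector (tuple of floats) of voxel j; voxels
    j - radius .. j + radius are treated as its spatial neighbourhood.
    """
    n, dim = len(features), len(features[0])
    # Deterministic initialization (assumption): pick centres at evenly
    # spaced ranks of the voxels sorted by their first feature.
    order = sorted(range(n), key=lambda j: features[j][0])
    centres = [list(features[order[(2 * i + 1) * n // (2 * n_clusters)]])
               for i in range(n_clusters)]
    u = []
    for _ in range(n_iter):
        # 1) Standard fuzzy c-means memberships from distances to the centres.
        u = []
        for x in features:
            d = [max(math.dist(x, c), 1e-12) for c in centres]
            u.append([1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0))
                                for k in range(n_clusters))
                      for i in range(n_clusters)])
        # 2) Spatial function: summed membership within each neighbourhood.
        h = []
        for j in range(n):
            lo, hi = max(0, j - radius), min(n, j + radius + 1)
            h.append([sum(u[k][i] for k in range(lo, hi))
                      for i in range(n_clusters)])
        # 3) Reweight memberships by the spatial function and renormalize.
        for j in range(n):
            w = [u[j][i] ** p * h[j][i] ** q for i in range(n_clusters)]
            total = sum(w)
            u[j] = [wi / total for wi in w]
        # 4) Update the cluster centres with the reweighted memberships.
        for i in range(n_clusters):
            weights = [u[j][i] ** m for j in range(n)]
            total = sum(weights)
            centres[i] = [sum(weights[j] * features[j][k] for j in range(n)) / total
                          for k in range(dim)]
    # Hard labels: the cluster with the highest membership for each voxel.
    labels = [max(range(n_clusters), key=lambda i: u[j][i]) for j in range(n)]
    return labels, centres
```

With p = q = 1 the fuzzy and spatial terms contribute with equal exponents, which corresponds to the equal weighting described above; setting q = 0 reduces the sketch to ordinary fuzzy c-means.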

Generation of an SVM prediction model
Supervised methods such as the SVM algorithm used in this study learn how to separate tumor from non-tumor voxels by finding differences in the imaging properties of tumor and non-tumor tissue. For this learning procedure, these algorithms depend on a priori information (e.g. from histological data) with predefined areas of tumor and non-tumor (training data). After a training phase, supervised algorithms can also identify tumor voxels in new datasets (test data).

In their original form, SVMs are kernel-based supervised learning algorithms for binary decision-making (4). A labeled dataset, denoted the training set, is used to train an SVM classifier. In our case, labels “tumor” and “non-tumor” were obtained for the training set by applying the unsupervised sFCM algorithm to the data as described above. During training, a hyperplane is found that separates training data with respect to given constraints. Classification of new data points is then performed according to their position relative to this hyperplane. In contrast to unsupervised algorithms, manual segmentation of the prostate is not necessary for the application of the SVM classifier.

For SVM training, data of parameters containing absolutely quantifiable values (ADC, PET, Kep, Ktrans) were scaled to mean 0 and standard deviation 1 over all datasets (i.e. using the same scaling factors for all datasets). Since T2 signal intensities are relative values, this parameter was scaled separately on each dataset. SVM classification was performed using the dedicated library LIBSVM (16). We used a linear kernel SVM and optimized the cost parameter during the training phase by simple exhaustive search. A linear kernel was chosen due to the lack of prior data suggesting the use of a non-linear kernel, and due to its lower susceptibility to overfitting compared with non-linear kernels. To avoid overfitting we performed 10-fold cross-validation on the training set.
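The two scaling schemes described above can be sketched as follows. This is an illustration only, not the code used in the study: the data layout (one dictionary of voxel values per dataset), the function names and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev

# Parameters with absolutely quantifiable values share one set of scaling
# factors across all datasets; T2 is scaled within each dataset.
QUANTITATIVE = ("ADC", "PET", "Kep", "Ktrans")
RELATIVE = ("T2",)


def zscore(values, mu, sigma):
    """Scale values to zero mean and unit standard deviation."""
    return [(v - mu) / sigma for v in values]


def scale_datasets(datasets):
    """Scale a list of datasets, each a dict: parameter name -> voxel values.

    Assumption: the population standard deviation is used; the study does
    not state which estimator was applied.
    """
    scaled = [dict() for _ in datasets]
    for name in QUANTITATIVE:
        # Pool the voxels of all datasets to obtain common scaling factors.
        pooled = [v for d in datasets for v in d[name]]
        mu, sigma = mean(pooled), pstdev(pooled)
        for out, d in zip(scaled, datasets):
            out[name] = zscore(d[name], mu, sigma)
    for name in RELATIVE:
        # Relative signal intensities: separate factors for every dataset.
        for out, d in zip(scaled, datasets):
            out[name] = zscore(d[name], mean(d[name]), pstdev(d[name]))
    return scaled
```

Pooled scaling preserves differences in absolute values between patients, whereas per-dataset scaling removes any global intensity offset between examinations, which is the stated motivation for treating T2 separately.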
The trained classifier was then used to predict tumor regions within test sets. When applying the SVM predictor to a dataset we used the leave-one-out approach, meaning that the SVM classifier was trained using all datasets except the one to which it was to be applied. Thus, 15 training datasets and one test dataset were always used, irrespective of the reference standard (manual tumor delineation or histology). In addition, probability maps were derived

Table 2. MR sequence parameters

                           Dixon            T2               DWI          DCE
TE (echo time) [ms]        1.23/2.46        100              60           4.4
TR (repetition time) [ms]  3.6              6500             6000         1.3
Bandwidth [Hz/px]          965              200              1860         501
Matrix size                79 × 192         320 × 320        108 × 192    184 × 256
Resolution [mm3]           4.1 × 2.6 × 2.6  0.63 × 0.63 × 3  2.6 × 2 × 5  1.13 × 1.13 × 2.5
Excitation angle [°]       10               90               90           15
Temporal resolution [s]    -                -                -            12.5
b-values [s/mm2]           -                -                50, 800      -

from the SVM classifier using the respective option of LIBSVM. The single steps of the proposed algorithm are summarized in Fig. 1.

Manual tumor delineation
In all 16 datasets, areas suspicious for the presence of prostatic cancer were delineated independently by two radiologists (N.F.S. and C.S., with 9 and 8 years of experience). Manual tumor delineation was performed on a manufacturer-provided work station (syngo TrueD, Siemens, Erlangen, Germany). Regions of interest (ROIs) were drawn slice-wise on T2-weighted images taking into account all acquired parameters, and voxels within ROIs were defined as “tumor”. The thus-segmented regions were used for validation of the proposed algorithm in all datasets (reference standard). To this end, accuracy metrics (see below) were averaged between the two readers.

Histological examination
Five of the 16 patients underwent radical prostatectomy 1–14 days after PET/MRI. The excised prostate glands were fixed in formalin, embedded in paraffin and cut into 3 mm slices perpendicular to the urethral axis, agreeing with the orientation and slice thickness of the T2-weighted sequence. Tumor regions were outlined by a pathologist (M.S., 10 years of experience) and classified into high-grade (Gleason ≥ 7b) and low- or intermediate-grade (Gleason ≤ 7) regions. Additional findings (e.g. inflammation) were also reported. Histological slices were scanned slice by slice, obtaining images of all histology slices corresponding to single MRI slices. To correct for tissue deformation, corresponding histological images and T2-weighted MR images were automatically and non-rigidly coregistered based on discrete cosine transforms (9) using a dedicated toolbox (SPM, Version 8, FIL Methods Group, University College London). Histologically defined tumor regions outlined on the histological images were then transferred to the corresponding T2-weighted images. Voxels within these regions were defined as “tumor”. The thus-segmented images were used for validation of the proposed algorithm in the five respective datasets (reference standard).

Data analysis
Data are expressed as mean ± standard deviation. Correlation coefficients of voxel parameters were calculated using Pearson or Spearman coefficients as appropriate.

Voxel-wise accuracy of classification results was calculated relative to the available standard, relative either to manual tumor delineation (all 16 datasets) or to histology (five datasets). To this end, the classification results (tumor or non-tumor) produced by the algorithms for each voxel were compared with the respective voxels in the reference standard. Overall accuracy was defined as percentage of correctly classified voxels. Rates of false positive voxels were defined as percentage of voxels classified as tumor where the respective standard was negative; rates of false negative voxels were defined as percentage of voxels classified as non-tumor where the respective standard was positive. In addition, algorithm performance was also assessed qualitatively on a lesion base. The influence of the number of parameters on the classification results was assessed by comparing the described performance metrics using parameter sets of different sizes (one to five parameters). For each set size, all possible combinations of parameters were used as input for the algorithm, and mean accuracy as well as errors were calculated. All 31 possible combinations of the five parameters were evaluated. The influence of each single parameter on classification was assessed by comparing mean accuracy and errors from all combinations of the remaining four parameters with mean accuracy and errors from all combinations that contain the respective parameter. To test for statistically significant differences in voxel-wise accuracies averaged over all tested datasets between the two different classification methods or between different combinations of imaging parameters, we used the Friedman test with Bonferroni correction (using SPSS statistics, IBM, Version 22); p-values less than 0.05 were considered significant. In the following, sFCM denotes classification with sFCM only, and sFCM/SVM classification using the sFCM-based SVM classifier.
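The voxel-wise metrics and the enumeration of parameter sets described above can be sketched as follows. This is a minimal illustration, not the analysis code of the study. Two points are assumptions: the false positive rate is computed relative to all reference-negative voxels and the false negative rate relative to all reference-positive voxels (the text does not state the denominators explicitly), and the function names are hypothetical.

```python
from itertools import combinations

PARAMETERS = ("T2", "ADC", "Ktrans", "Kep", "PET")


def voxel_metrics(predicted, reference):
    """Compare per-voxel labels (True = tumor) with a reference standard."""
    pairs = list(zip(predicted, reference))
    correct = sum(p == r for p, r in pairs)
    negatives = sum(not r for _, r in pairs)
    positives = sum(r for _, r in pairs)
    false_pos = sum(p and not r for p, r in pairs)
    false_neg = sum((not p) and r for p, r in pairs)
    return {
        "accuracy": 100.0 * correct / len(pairs),
        # Assumed denominators: reference-negative / reference-positive voxels.
        "false_positive_rate": 100.0 * false_pos / negatives if negatives else 0.0,
        "false_negative_rate": 100.0 * false_neg / positives if positives else 0.0,
    }


def parameter_sets(parameters=PARAMETERS):
    """All non-empty combinations of the imaging parameters."""
    return [c for k in range(1, len(parameters) + 1)
            for c in combinations(parameters, k)]


def with_without(accuracy_by_set, parameter):
    """Mean accuracy over the sets that contain a parameter and over those that do not."""
    with_p = [a for s, a in accuracy_by_set.items() if parameter in s]
    without = [a for s, a in accuracy_by_set.items() if parameter not in s]
    return sum(with_p) / len(with_p), sum(without) / len(without)
```

For five parameters, `parameter_sets()` returns the 31 combinations mentioned above; each parameter is contained in 16 of them, and the remaining 15 are the combinations of the other four parameters.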

RESULTS

Correlation of imaging parameters
Correlation analysis showed good to moderate voxel-wise correlations between the following pairs of parameters: Kep and Ktrans (r = 0.70 ± 0.13), Kep and PET (r = 0.52 ± 0.09), ADC and PET (r = 0.45 ± 0.25), T2 and ADC (r = 0.43 ± 0.18) and Ktrans and PET (r = 0.38 ± 0.14). Correlation coefficients of all pairs are shown in Table 3.


Figure 1. Overview of the proposed algorithm. The unsupervised sFCM algorithm is used for generation of a labeled training set for supervised SVM training. Imaging data are then classified using the trained SVM classifier in order to identify tumor regions.

Table 3. Voxel-wise correlation of imaging parameters of all datasets. Mean correlation coefficients ± standard deviations are reported

         ADC          Kep          Ktrans       PET          T2
ADC      1
Kep      0.31 ± 0.18  1
Ktrans   0.24 ± 0.13  0.70 ± 0.13  1
PET      0.45 ± 0.25  0.52 ± 0.09  0.38 ± 0.14  1
T2       0.43 ± 0.18  0.01 ± 0.13  0.01 ± 0.13  0.12 ± 0.17  1

Classification accuracy compared with manual tumor delineation
We observed marked variations in classification accuracy compared with manual tumor delineation depending on the chosen algorithm (sFCM or sFCM/SVM) and parameter set. sFCM/SVM consistently performed better than sFCM only. The three best-performing combinations of parameters were (ADC, Kep, PET), (ADC, Ktrans, PET) and (ADC, T2, Kep, Ktrans, PET) for both algorithms, with mean accuracies (%) of 84, 84 and 83 for sFCM/SVM and 78, 78 and 79 for sFCM, respectively. Regardless of the underlying algorithm, overall accuracy increased with increasing number of parameters used for classification (Figure 2(A)). Regarding single parameters, we observed increasing mean classification accuracy when including ADC, Kep, Ktrans or PET in the parameter set (sFCM: +2.5, +4.9, +3.7 and +6.6%; sFCM/SVM: +4.9, +4.1, +2.3 and +10.2%, respectively). However, the addition of T2 as a further parameter led to a decrease in mean classification accuracy for both algorithms (%: −8.8 and −6.8 for sFCM and sFCM/SVM, respectively) (Figure 3(A)).

Classification accuracy compared with histological ground truth
Figure 4 shows mean voxel-wise accuracy and false positive and false negative rates of manual tumor delineation as well as sFCM and sFCM/SVM using all five parameters compared with histological ground truth. Whereas sFCM showed relatively poor performance, sFCM/SVM reached accuracy rates similar to those for manual tumor delineation (68% for sFCM, 89% for sFCM/SVM and 87% for manual delineation). The three best-performing combinations of parameters for sFCM/SVM were (ADC, Kep, PET), (ADC, Kep, Ktrans, PET) and (ADC, Kep, Ktrans, PET, T2), with mean accuracies of 88, 89 and 89%. For sFCM, the highest accuracy was observed using the parameter sets (Ktrans, PET), (Kep, Ktrans, PET) and (ADC, Kep, Ktrans, PET), with accuracies of 84, 83 and 83%, respectively. Again, mean accuracy markedly improved with increasing number of parameters for sFCM/SVM (Figure 2(B)). Regarding single imaging parameters, addition of PET to the parameter set led to the largest increase in accuracy (%: +6.9 and +17.9 for sFCM and sFCM/SVM, respectively), whereas addition of T2 again led to a decrease in mean classification accuracy (%: −17.3 and −13.7 for sFCM and sFCM/SVM, respectively) (Figure 3(B)).

Qualitative analysis in comparison with histology
Figures 5–7 show imaging data, histological slices, manual tumor delineation and the results of the sFCM and sFCM/SVM algorithms of three different datasets. All high-grade lesions were detected using the sFCM/SVM algorithm with the full parameter set. Lesions missed by this algorithm were small low-grade lesions (mean diameter 3.9 ± 2.4 mm). The sFCM/SVM algorithm led to false positive classification in two cases, both within the central gland; in these cases, moderate inflammation was reported in the respective areas by the pathologist (Figure 5). Overall, sFCM/SVM showed results that were consistent with the histological ground truth. In contrast, when using the sFCM algorithm with the full parameter set we observed overall incorrect classification in one case compared with histology (Figure 5) and a high rate of false positive voxels in two cases (Figure 6).

Figure 2. Classification accuracy of sFCM and sFCM/SVM. Dependence of overall accuracy, false positive and false negative rates on the size of the parameter set compared with manual tumor delineation for 16 patient datasets (A) and compared with histological tumor delineation for five patient datasets (B). * Statistically significant difference between the curves.

Figure 3. Dependence of classification accuracy on single parameters without (w/o) and with (w) the presence of the respective parameter for 16 patient datasets compared with manual tumor delineation (A) and for five patient datasets compared with histological tumor delineation (B). * Statistically significant differences.

Figure 4. Accuracy of sFCM and sFCM/SVM using all five parameters compared with the histological standard for five patient datasets.

DISCUSSION

Automated analysis of multiparametric imaging data is a promising approach for simplifying diagnostic work-up and increasing diagnostic accuracy. The potential of classification algorithms has been shown for oncological imaging of the prostate (17–19), breast (20), lungs (8) and brain (21). Supervised as well as unsupervised approaches were used, with supervised algorithms showing better results (17). In these studies, SVM classifiers are the most widely used methods, with high accuracy of classification and the advantage of generalizability compared with unsupervised algorithms. However, as stated above, supervised classifiers depend on training data containing spatially resolved information about the histological gold standard, which is not available in many cases. In this study we introduced and validated an approach for the use of SVM classifiers without the need of histological correlation of the training set. Instead, we used unsupervised classification in the form of the sFCM algorithm to create artificial labels for the training dataset. The rationale behind this

NMR Biomed. 2015; 28: 914–922

Copyright © 2015 John Wiley & Sons, Ltd.

wileyonlinelibrary.com/journal/nbm

919

Figure 5. Imaging data, histological results, manual tumor delineation and clustering results of sFCM and sFCM/SVM using all five imaging parameters. High-grade tumors are histologically outlined in red, low-grade tumors in blue and an area of inflammatory changes by a dotted black line. Regions classified as tumor by the algorithms are overlaid in red; regions classified as normal tissue are overlaid in blue. sFCM clustering led to wrong classification. An example of a low-grade lesion that was not detected by sFCM/SVM is marked by a black arrow. An area of inflammatory changes that was classified as tumor by sFCM/SVM is marked by red arrows.

S. GATIDIS ET AL.

Figure 6. Imaging data, histological results, manual tumor delineation and clustering results of sFCM and SVM/sFCM using all five imaging parameters. High-grade tumors are histologically outlined in red, low-grade tumors in blue. Regions classified as tumor by the algorithms are overlaid in red; regions classified as normal tissue are overlaid in blue.

Figure 7. Imaging data, histological results, manual tumor delineation and clustering results of sFCM and SVM/sFCM using all five imaging parameters. The high-grade tumor is histologically outlined in red. Regions classified as tumor by the algorithms are overlaid in red; regions classified as normal tissue are overlaid in blue.

920

approach is the idea that, although unsupervised algorithms may be inaccurate in single datasets, their application on multiple datasets to create artificial labels and subsequent training of an SVM classifier using these labels may still lead to robust classification. This approach has been proposed before and applied to synthetic data using simple k-means clustering for the step of unsupervised classification (22). Indeed, in this study we observed a significant improvement in overall classification accuracy using the combined sFCM/SVM method compared with sFCM only. Furthermore, we observed

wileyonlinelibrary.com/journal/nbm

a robust behavior of the sFCM/SVM algorithm in the sense that classification results showed high qualitative similarity to the chosen standard without occurrence of obvious and complete misclassification of datasets. Although not directly comparable due to differences in methodology, the accuracy obtained by the algorithm proposed in this study is in a similar range to those in other studies using supervised algorithms (17,23). Unsupervised sFCM on the other hand showed complete failure of classification in single cases. This failure was presumably caused by the existence of more than two distinct regions (other

Copyright © 2015 John Wiley & Sons, Ltd.

NMR Biomed. 2015; 28: 914–922

CLASSIFICATION OF MULTIPARAMETRIC PET/MRI IN PROSTATE CANCER

NMR Biomed. 2015; 28: 914–922

These aspects have to be further addressed, which is ongoing work in our department.

CONCLUSION In this study we proposed a method for classification of multidimensional data using a combination of the unsupervised sFCM algorithm and an SVM classifier. We observed a marked improvement in classification accuracy and robustness compared with ordinary unsupervised classification. Taken together, the described classification method is a promising approach for automated analysis of multiparametric imaging data where a training set is not available and knowledge about underlying ground truth is limited.

Acknowledgements We thank our technicians Gerd Zeger and Carsten Groeper for making the PET/MRI measurements.

REFERENCES

1. Beyer T, Townsend DW, Brun T, Kinahan PE, Charron M, Roddy R, Jerin J, Young J, Byars L, Nutt R. A combined PET/CT scanner for clinical oncology. J. Nucl. Med. 2000; 41(8): 1369–1379.
2. Schlemmer HP, Pichler BJ, Schmand M, Burbar Z, Michel C, Ladebeck R, Jattke K, Townsend D, Nahmias C, Jacob PK, Heiss WD, Claussen CD. Simultaneous MR/PET imaging of the human brain: feasibility study. Radiology 2008; 248(3): 1028–1035.
3. Antoch G, Saoudi N, Kuehl H, Dahmen G, Mueller SP, Beyer T, Bockisch A, Debatin JF, Freudenberg LS. Accuracy of whole-body dual-modality fluorine-18-2-fluoro-2-deoxy-D-glucose positron emission tomography and computed tomography (FDG-PET/CT) for tumor staging in solid tumors: comparison with CT and PET. J. Clin. Oncol. 2004; 22(21): 4357–4368.
4. Cortes C, Vapnik V. Support-vector networks. Mach. Learn. 1995; 20(3): 273–297.
5. Jiang J, Trundle P, Ren J. Medical image analysis with artificial neural networks. Comput. Med. Imaging Graph. 2010; 34(8): 617–631.
6. Langer DL, van der Kwast TH, Evans AJ, Trachtenberg J, Wilson BC, Haider MA. Prostate cancer detection with multi-parametric MRI: logistic regression analysis of quantitative T2, diffusion-weighted imaging, and dynamic contrast-enhanced MRI. J. Magn. Reson. Imaging 2009; 30(2): 327–334.
7. Masulli F, Schenone A. A fuzzy clustering based segmentation system as support to diagnosis in medical imaging. Artif. Intell. Med. 1999; 16(2): 129–147.
8. Schmidt H, Brendle C, Schraml C, Martirosian P, Bezrukov I, Hetzel J, Muller M, Sauter A, Claussen CD, Pfannenberg C, Schwenzer NF. Correlation of simultaneously acquired diffusion-weighted imaging and 2-deoxy-[18F] fluoro-2-D-glucose positron emission tomography of pulmonary lesions in a dedicated whole-body magnetic resonance/positron emission tomography system. Invest. Radiol. 2013; 48(5): 247–255.
9. Ashburner J, Friston KJ. Nonlinear spatial normalization using basis functions. Hum. Brain Mapp. 1999; 7(4): 254–266.
10. Reske SN, Blumstein NM, Neumaier B, Gottfried HW, Finsterbusch F, Kocot D, Moller P, Glatting G, Perner S. Imaging prostate cancer with 11C-choline PET/CT. J. Nucl. Med. 2006; 47(8): 1249–1254.
11. Park H, Wood D, Hussain H, Meyer CR, Shah RB, Johnson TD, Chenevert T, Piert M. Introducing parametric fusion PET/MRI of primary prostate cancer. J. Nucl. Med. 2012; 53(4): 546–551.
12. Delso G, Furst S, Jakoby B, Ladebeck R, Ganter C, Nekolla SG, Schwaiger M, Ziegler SI. Performance measurements of the Siemens mMR integrated whole-body PET/MR scanner. J. Nucl. Med. 2011; 52(12): 1914–1922.
13. Tofts PS. Modeling tracer kinetics in dynamic Gd-DTPA MR imaging. J. Magn. Reson. Imaging 1997; 7(1): 91–101.

Copyright © 2015 John Wiley & Sons, Ltd.

wileyonlinelibrary.com/journal/nbm


than tumor and non-tumor regions) within the prostate. Thus, large parts of healthy tissue were classified as tumorous. This example illustrates how unsupervised classification with sFCM depends on knowledge about the number of clusters.

A major challenge when applying automated analysis algorithms is the choice of suitable imaging parameters for classification. Our results suggest that supplementing multiparametric MRI data with [11C]-choline PET can markedly improve the accuracy of automated tumor detection. The addition of DWI and DCE MRI also led to improved classification, although to a lesser extent. The T2 parameter, however, had a negative impact on accuracy, despite the fact that clinical routine diagnostics rely strongly on T2 images for tumor localization. This can be explained by the fact that T2 signal intensities are relative values. Different distributions of signal intensities in different patients may thus have led to differently scaled values, resulting in false classification. A possible way to overcome this problem is quantification of T2 relaxation times (17). However, despite the problems caused by the addition of the T2 parameter, the parameter combination using the full available parameter set (T2, ADC, Ktrans, Kep, PET) was among the best-performing combinations for sFCM/SVM, implying that the effect of single parameters carrying unnecessary or misleading information can be compensated for by including multiple useful parameters.

In this study, we chose to use two clusters for sFCM and sFCM/SVM, aiming to segment datasets into regions of normal tissue and tumor regions. Prior knowledge justifying this approach was histological proof of prostate cancer in all patients. However, more than two physiologic and pathologic conditions of the prostate or other organ systems may exist (e.g. inflammation, fibrosis), and the number of clusters should be adjusted to the specific clinical question.
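The dependence on the chosen number of clusters can be illustrated with a minimal sketch. It uses plain fuzzy c-means, not the spatially constrained variant (sFCM) applied in this study, and everything in it is an assumption made for illustration: the synthetic two-feature data with three well-separated groups, the deterministic farthest-point initialisation, and the function names `memberships`, `fuzzy_c_means` and `hard_labels`.

```python
import math
import random


def memberships(points, centers, m):
    """Standard FCM membership update; each row sums to 1."""
    rows = []
    for p in points:
        # A small floor on the distance avoids division by zero.
        inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Plain (non-spatial) fuzzy c-means on a list of feature tuples."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        # Each center becomes the membership-weighted mean of all points.
        centers = []
        for k in range(n_clusters):
            w = [row[k] ** m for row in u]
            centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / sum(w)
                for j in range(len(points[0]))
            ))
    return centers, memberships(points, centers, m)


def hard_labels(u):
    """Assign each point to the cluster with the highest membership."""
    return [row.index(max(row)) for row in u]


# Synthetic data: three groups of 20 points each (hypothetical feature values).
rng = random.Random(0)
group_means = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [
    (rng.gauss(mx, 0.3), rng.gauss(my, 0.3))
    for mx, my in group_means
    for _ in range(20)
]

for c in (2, 3):
    _, u = fuzzy_c_means(points, c)
    labels = hard_labels(u)
    # Number of distinct labels found inside each true group, then overall.
    print(c, [len(set(labels[i:i + 20])) for i in (0, 20, 40)], sorted(set(labels)))
```

With two clusters the three groups cannot all receive their own label, so at least one group is merged with or split between the others; with three clusters each group can be recovered separately. This mirrors, in a toy setting, why the cluster count has to match the clinical question.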
In two cases we observed misclassification of inflammatory changes as tumor using the sFCM/SVM algorithm. It is known that inflammation can cause imaging patterns similar to prostate cancer (10). Defining additional clusters may help differentiate benign from malignant conditions and allow for the discrimination of low- and high-grade tumors using multiparametric data. Thus, the combination of multimodal, multiparametric imaging with machine learning approaches may allow for more precise characterization of significant prostate cancer and thus enable risk-adapted therapy.

Furthermore, tumors and healthy tissues may behave differently depending on anatomical site. In this study we observed most misclassifications within the central zone of the prostate, whereas classification was accurate in the peripheral zone. This might indicate that different models should be used in the future for anatomically different regions.

The proposed learning procedure has the potential to be used for further applications in medical imaging but also in other fields. In principle, it can replace simple unsupervised classification in cases where multiple datasets are available. In medical imaging, a possible application is tumor detection in regions where spatially resolved histopathologic correlation is difficult, e.g. in the central nervous system.

This study has limitations. Histological correlation was only available for five datasets. Although manual tumor delineation was used as a substitute for the remaining datasets, the performance of the proposed algorithm and its robustness can only be fully appreciated with a larger number of histologically correlated datasets. This would also allow for optimization of algorithm parameters. Furthermore, the proposed algorithm can be extended, including advanced methods such as multiclass SVMs.
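The general idea of replacing per-dataset unsupervised classification with a classifier trained on pooled cluster labels can be sketched as follows. This is a structural illustration only, with loudly stated stand-ins: a one-dimensional two-means clustering replaces sFCM, a nearest-centroid rule replaces the SVM, the single feature is a hypothetical uptake-like value, and the convention that the higher-valued cluster is "tumor" is an assumption used to align labels across datasets.

```python
import random


def two_means_1d(values, n_iter=20):
    """Split one dataset into a low (0) and a high (1) cluster (stand-in for sFCM)."""
    lo, hi = min(values), max(values)
    labels = []
    for _ in range(n_iter):
        labels = [int(abs(v - hi) < abs(v - lo)) for v in values]
        lo = sum(v for v, lab in zip(values, labels) if lab == 0) / labels.count(0)
        hi = sum(v for v, lab in zip(values, labels) if lab == 1) / labels.count(1)
    return labels


def fit_centroids(values, labels):
    """Nearest-centroid training: one mean per class (stand-in for an SVM)."""
    return [
        sum(v for v, lab in zip(values, labels) if lab == k) / labels.count(k)
        for k in (0, 1)
    ]


def predict(centroids, values):
    """Label each value by its nearer class centroid."""
    return [int(abs(v - centroids[1]) < abs(v - centroids[0])) for v in values]


rng = random.Random(0)
# Three hypothetical training datasets, each with low- and high-valued voxels.
training_sets = [
    [rng.gauss(1.0, 0.1) for _ in range(40)] + [rng.gauss(3.0, 0.1) for _ in range(10)]
    for _ in range(3)
]

# Step 1: cluster every dataset on its own. Step 2: pool the resulting labels.
pooled_values, pooled_labels = [], []
for values in training_sets:
    pooled_values += values
    pooled_labels += two_means_1d(values)

# Step 3: train one supervised model on the pooled, cluster-labeled data.
centroids = fit_centroids(pooled_values, pooled_labels)

# Step 4: apply it to an unseen dataset that contains only low-valued voxels.
unseen = [rng.gauss(1.0, 0.1) for _ in range(30)]
print("clustering alone:", sum(two_means_1d(unseen)), "of 30 voxels flagged")
print("pooled classifier:", sum(predict(centroids, unseen)), "of 30 voxels flagged")
```

In this toy setting, clustering the unseen dataset on its own always produces two groups, so some voxels are flagged even though none belong to the high-valued class, whereas the pooled classifier carries over what was learned from the other datasets.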

14. Chuang KS, Tzeng HL, Chen S, Wu J, Chen TJ. Fuzzy c-means clustering with spatial information for image segmentation. Comput. Med. Imaging Graph. 2006; 30(1): 9–15.
15. Bezdek JC, Hall LO, Clarke LP. Review of MR image segmentation techniques using pattern recognition. Med. Phys. 1993; 20(4): 1033–1048.
16. Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011; 2(3): 1–27.
17. Ozer S, Langer DL, Liu X, Haider MA, van der Kwast TH, Evans AJ, Yang Y, Wernick MN, Yetik IS. Supervised and unsupervised methods for prostate cancer segmentation with multispectral MRI. Med. Phys. 2010; 37(4): 1873–1883.
18. Peng Y, Jiang Y, Yang C, Brown JB, Antic T, Sethi I, Schmid-Tannwald C, Giger ML, Eggener SE, Oto A. Quantitative analysis of multiparametric prostate MR images: differentiation between prostate cancer and normal tissue and correlation with Gleason score – a computer-aided diagnosis development study. Radiology 2013; 267(3): 787–796.
19. Shah V, Turkbey B, Mani H, Pang Y, Pohida T, Merino MJ, Pinto PA, Choyke PL, Bernardo M. Decision support system for localizing prostate cancer based on multiparametric magnetic resonance imaging. Med. Phys. 2012; 39(7): 4093–4103.
20. Jacobs MA, Barker PB, Bluemke DA, Maranto C, Arnold C, Herskovits EH, Bhujwalla Z. Benign and malignant breast lesions: diagnosis with multiparametric MR imaging. Radiology 2003; 229(1): 225–232.
21. Hu X, Wong KK, Young GS, Guo L, Wong ST. Support vector machine multiparametric MRI identification of pseudoprogression from tumor recurrence in patients with resected glioblastoma. J. Magn. Reson. Imaging 2011; 33(2): 296–305.
22. Li M, Cheng Y, Zhao H. Unlabeled data classification via support vector machines and k-means clustering. In Proceedings. International Conference on Computer Graphics, Imaging and Visualization (2004). 2004. IEEE. 183–186. DOI: 10.1109/CGIV.2004.1323982
23. Liu X, Langer DL, Haider MA, Yang Y, Wernick MN, Yetik IS. Prostate cancer segmentation with simultaneous estimation of Markov random field parameters and class. IEEE Trans. Med. Imaging 2009; 28(6): 906–915.


NMR Biomed. 2015; 28: 914–922
