ORIGINAL ARTICLE

Magnetic Resonance Imaging of Intraductal Papillomas: Typical Findings and Differential Diagnosis

Matthias Dietzel, MD,*† Clemens Kaiser, MD/MBA,†‡ and Pascal A. T. Baltzer, MD†§

Objective: Even upon core biopsy, accurate classification of benign intraductal papillomas (IPs) can be difficult. Accordingly, IPs are still frequently surgically resected. Therefore, accurate classification of IP by magnetic resonance mammography (MRM) would potentially optimize patient management. However, the few investigations assessing MRM of IP included small patient collectives, and overall accuracy is still unknown. We performed this investigation to analyze the morphologic and dynamic MRM profiles of IP in more detail and to identify the overall accuracy of MRM for differential diagnosis of IP versus malignant breast lesions.

Methods: Consecutive patients scheduled for MRM (standardized scanning protocols: dynamic T1-weighted gradient echo before/after Gd-DTPA [gadolinium diethylenetriamine pentaacetate; 0.1 mmol/kg body weight]; T2-weighted turbo spin echo) with subsequent surgicopathologic verification were enrolled. For the detailed assessment of morphologic and dynamic profiles, 2 experienced radiologists (>500 MRM examinations; blinded to surgicopathologic verification) performed prospective evaluation of MRM, in consensus, applying 17 predefined MRM descriptors. From this database, all patients showing IP (n = 83) or malignant breast lesions (n = 648) were further evaluated statistically: univariate analyses (association of single descriptors with IP/breast cancer: contingency table statistics) and multivariate analyses were performed to identify accurate descriptor combinations (CHAID [CHi-squared Automatic Interaction Detection]) and overall accuracy of MRM for differential diagnosis of IP versus malignant breast lesions (logistic regression; receiver operating characteristics [ROC], area under the ROC curve).
Results: Of the 17 predefined MRM descriptors, 14 (82.4%) were significantly associated with IP (P < 0.05). The accuracy of single descriptors (odds ratio [OR], ≤10.6) could be further increased by descriptor combinations (double combination: OR ≤12.7; triple combination: OR ≤15.0). With area under the ROC curve = 0.90, there was a high overall accuracy of MRM for the differential diagnosis of IP versus malignant breast lesions.

Conclusions: A detailed assessment of MRM allows precise characterization of benign IPs and accurate differentiation from malignant breast lesions.

Key Words: breast, neoplasms, magnetic resonance imaging, papilloma, differential diagnosis

(J Comput Assist Tomogr 2015;39: 176–184)

From the *Department of Neuroradiology, University of Erlangen-Nuremberg, Erlangen; †Institute of Diagnostic and Interventional Radiology I, Friedrich-Schiller-University Jena, Jena; and ‡Institute of Clinical Radiology and Nuclear Medicine, University Medical Center Mannheim, Mannheim, Germany; and §Department of Biomedical Imaging and Image-guided Therapy, Medical University Vienna, Vienna, Austria. Received for publication June 24, 2014; accepted October 20, 2014. M.D. and C.K. contributed equally to this study. Reprints: Matthias Dietzel, MD, Department of Neuroradiology, University of Erlangen-Nuremberg, Schwabachanlage 6, D-91054, Germany (e‐mail: [email protected]). There are no disclosures of any author related to this work. Copyright © 2015 Wolters Kluwer Health, Inc. All rights reserved.

www.jcat.org

Among all benign breast tumors, intraductal papilloma (IP) is probably the most challenging and controversial entity.1,2 It can show various clinical symptoms, accurate radiological diagnosis is currently limited, and even histopathologic examination can show false results.2–5 Even in the newer literature, there is no consensus in sight. Some authors state that “mandatory excision of IP is unnecessarily (…) cautious,”6 whereas others recommend that “surgical excision is still warranted” in breast lesions yielding papilloma at magnetic resonance (MR) imaging–guided, vacuum-assisted biopsy.7 Indeed, therapeutic management of IP remains rather aggressive, as complete surgical resection is still frequently performed to decrease the risk of false-negative diagnosis.3,8

The definition of IP is based on histopathologic criteria.4 Its hallmarks are arborescent proliferations (“papilla” [Latin, bud]) into the ductal lumen. The IP is built up of 3 cell layers: the “epithelial cells” form the luminal aspect, and the “myoepithelial cells” form the intermediate aspect of the tumor. Finally, the tumor is circumscribed by the peripheral layer, forming the “basement membrane.”4 Because of its unique location within the lactiferous duct, IP can be associated with nipple discharge. Furthermore, depending on size and location, some IPs might be palpable. However, clinical symptoms are not reliable and depend, in addition to other factors, on the size and the location of the tumor.2,4

Intraductal papilloma can show malignant foci more frequently than other benign breast tumors, for example, fibroadenomas.4 Such malignant transformation might be very small and can be missed if the tumor is not completely sampled histologically.3,8 Accordingly, data on the diagnostic performance of minimally invasive tissue sampling, for example, core biopsy, show limited accuracy, and false-negative results can occur.3,7,8 Because of this uncertainty, complete surgical removal is still the standard of care in many centers. Altogether, this leads to a remarkable situation: a lesion that requires surgical removal even though it has a definite classification as “benign.” This situation is certainly not optimal, and it is becoming more relevant in daily practice.
Because of the increasing application of mammography, the number of lesions that require biopsy and reveal a diagnosis of IP has risen significantly.9 Through noninvasive diagnostic methods, medical imaging might help to solve this dilemma. However, both ultrasound and mammography have limited potential for the assessment of IP.5 Furthermore, IP can be occult in mammography, particularly if located in the central aspect of the parenchyma. In addition, lesion characteristics can be indeterminate on both ultrasound and mammography. Accordingly, Lam et al5 identified the low accuracy of mammography and ultrasound either as standalone techniques or in combination (sensitivity: 69%/56%/61% [mammography/ultrasound/mammography and ultrasound combined]; specificity: 25%/90%/33%). This is why these authors concluded that mammographic and ultrasound features do not show sufficient potential to accurately differentiate IP from malignant lesions.

Magnetic resonance imaging of the breast (MR mammography [MRM]) is an alternative imaging method for the detection of IP. Magnetic resonance mammography shows excellent soft tissue contrast and characterizes tissue vascularization.10 Of all imaging methods, MRM has the highest sensitivity for the detection of invasive breast cancer. As it enables detailed tissue characterization, MRM can be applied for the differential diagnosis of breast lesions (malignant or benign) and even for subtyping cancers.11

J Comput Assist Tomogr • Volume 39, Number 2, March/April 2015


Accordingly, MRM appears to be a promising method for the assessment of IP. However, data on this issue are sparse. The published studies investigated small patient collectives using basic MRM characteristics without systematically comparing with control groups.12–18 Accordingly, the accuracy of MRM for the differential diagnosis of IPs remains unknown. This study aims to further investigate the MRM characteristics of IP in a considerably larger collective. First, a detailed catalog of MRM descriptors was applied to characterize IP. Second, such features were correlated with a malignant control group in a univariate and multivariate fashion. Ultimately, this allowed an estimation of the overall accuracy of MRM for the differential diagnosis of IPs.

MATERIALS AND METHODS

Study Collective

The study population was recruited from a long-term database that included all consecutive MRM examinations performed over a period of 12 years. Indications for MRM were unclear results in previous breast examinations (eg, breast ultrasound, x-ray, mammography) and suspected malignancy prior to surgical intervention. All patients gave written, informed consent to the examination. This investigation was approved by our local ethical committee. Subsets of this database have been studied previously in a different context (for example, see Dietzel et al11). Data collection was planned before the index test and reference standards were established. Of this collective, all patients who fulfilled the following criteria were prospectively included into a database: (1) absence of breast biopsy/intervention up to 12 months before MRM, (2) absence of chemo/radiation therapy up to 12 months before MRM, and (3) subsequent surgicopathologic verification (excisional biopsy) after MRM. Items 1 and 2 were necessary to avoid posttherapeutic bias.

Reference Standard

Histopathologic verification was performed by experienced, board-certified breast pathologists at the Department of Pathology (University Hospital Jena, Germany). Surgical excisional biopsy was defined as the standard of reference (SOR). This was chosen because minimally invasive techniques (core-needle biopsy, vacuum-assisted biopsy, etc) still have a significant risk for tissue undersampling. This is particularly problematic in the case of papillomas, because additional malignant foci can be present in otherwise completely benign and unsuspicious tumors, leading to a false-negative diagnosis.3,8 Accordingly, the study collective was dichotomized as follows. All IPs without any evidence of malignant foci were defined as IP and formed the primary study collective. The malignant control group (M) consisted of all invasive or in situ cancers in the database.

Technical Specification of MRM

Magnetic resonance mammography was acquired at the Institute of Interventional and Diagnostic Radiology (University Hospital Jena). Every patient was examined with a standardized protocol using clinical MR scanners at 1.5 T and dedicated bilateral breast coils. Patients were examined in the prone position, and the standard scan orientation was axial. We started with spoiled dynamic T1-weighted (T1w) gradient echo sequences (temporal resolution = 60 seconds). Initially, a precontrast scan was acquired. Then, contrast agent (Gd-DTPA [gadolinium diethylenetriamine pentaacetate], Magnevist; Bayer HealthCare, Berlin, Germany) was injected intravenously as a rapid bolus (3 mL/s) followed by a saline flush (30 mL) and a subsequent delay of 30 seconds. Then, 7 postcontrast scans were acquired consecutively. Postprocessing provided subtractions of precontrast from postcontrast dynamic images. Technical parameters for dynamic T1w gradient echo sequences were 100 to 110 milliseconds (repetition time), 5 milliseconds (effective echo time), 80 degrees (flip angle), 3 to 4 mm (slice thickness), 350 mm (field of view), and 256 to 384 pixels (matrix).

In addition, 1 T2-weighted (T2w) turbo spin echo (TSE) sequence was acquired in the same orientation, slice position, and field of view. Technical parameters for T2w scans were 4000 to 8900 milliseconds (repetition time), 200 to 300 milliseconds (effective echo time), 90 degrees (flip angle), 3 to 4 mm (slice thickness), and 256 to 512 pixels (matrix). All sequences were acquired without fat saturation.
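The subtraction step and the standard dynamic categories (washout, plateau, and so on) can be illustrated with a minimal Python sketch. The function names, the use of the first two postcontrast scans as the "early" phase, and the ±10% tolerance are assumptions made for illustration only; the article does not state the thresholds the readers applied.

```python
from typing import List, Sequence


def subtraction_image(pre: Sequence[float], post: Sequence[float]) -> List[float]:
    """Voxel-wise subtraction: postcontrast signal minus precontrast signal."""
    if len(pre) != len(post):
        raise ValueError("pre- and postcontrast images must have the same size")
    return [b - a for a, b in zip(pre, post)]


def relative_enhancement(pre: float, post: float) -> float:
    """Signal increase over the precontrast baseline, in percent."""
    if pre <= 0:
        raise ValueError("precontrast signal must be positive")
    return 100.0 * (post - pre) / pre


def curve_type(posts: Sequence[float], tolerance_pct: float = 10.0) -> str:
    """Classify the delayed phase of a lesion's signal-time curve.

    Assumption (not from the article): the early reference is the peak of
    the first two postcontrast scans, and a change of more than
    tolerance_pct by the last scan separates the three curve types.
    """
    if len(posts) < 3:
        raise ValueError("need at least 3 postcontrast measurements")
    early = max(posts[:2])
    late = posts[-1]
    change = 100.0 * (late - early) / early
    if change > tolerance_pct:
        return "persistent"
    if change < -tolerance_pct:
        return "washout"
    return "plateau"
```

With 7 postcontrast scans at 60-second temporal resolution, `posts` would hold 7 region-of-interest signal values per lesion.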

Interpretation of the MRMs

The database was analyzed by 2 experienced radiologists, in consensus (>500 examinations). Readers were aware of the clinical history of the patients but were blinded to the SOR. They were specially trained in the assessment of MRM and completed a training course to optimize the evaluation of qualitative descriptors and to decrease interobserver/intraobserver variability. Magnetic resonance mammography training was provided by Werner A. Kaiser, a pioneer of the technique with more than 20 years of clinical experience.

Reading conditions were standardized in every examination with regard to the hanging protocol and window setting (levels of center, width, magnification). This allows a synchronous interpretation and comparison of all MR sequences within 1 patient (precontrast vs postcontrast vs subtractions vs T2). In our experience, this approach is essential for an accurate and reliable MRM assessment.

In every lesion, a standardized checklist of qualitative MRM descriptors was assessed. Initially, lesion size was evaluated as the largest diameter on contrast-enhanced T1w scans. Then, standard dynamic (washout, plateau, etc) and morphologic (margin, internal structure) features were assessed. To refine the analysis of tissue composition, detailed descriptors were applied in addition, as described in the following list:

Skin Thickening. Skin thickening was assessed on T1w images before the injection of the contrast agent. If smooth skin was delineated with a diameter of less than 5 mm, the feature “skin thickening” was rated as “absent.” This feature would also not be rated positive if the patient had a history of ipsilateral radiation therapy or breast surgery during the last 12 months. Skin thickening is typically seen in highly aggressive breast cancers.11,19

Intact Nipple Line. With T2w images or precontrast T1w images, it is usually possible to delineate a hypointense line dorsal to the nipple. During the dynamic scans, this line usually shows a minor enhancement. Any disruption of the nipple line is considered to be suggestive of invasive cancers.11,19

Prominent Vessels. Tumor growth typically leads to neoangiogenesis. Macroscopically, this process is visualized either by vessels leading directly to the tumor (“adjacent vessel sign”) or by a diffuse hypervascularity of the affected breast (“ipsilateral”). Both features are indicators of invasive tumor growth, but can also be observed in benign lesions.20,21

Necrosis Sign. This feature is diagnosed if a hyperintense center can be delineated within an otherwise hypointense lesion. The necrosis sign is assessed on T2w scans exclusively.22


Hook Sign. This describes a spiculated dendrite arising from the lesion's center that shows direct connection to the pectoral muscle. The hook sign is assessed on T2w scans and should not be rated as positive if there is a history of previous breast surgery.23

Root Sign. If a singular irregularity, similar to a spicule, can be delineated within an otherwise smooth-bordered lesion, the root sign is present. It can be visualized on precontrast T1w, as well as on T2w scans. The root sign is a typical feature of malignant tumor growth.24

Edema. If an abnormal T2w hyperintensity is noted adjacent to the tumor, “perifocal edema” is diagnosed. If the edema involves a larger area of the affected breast, then “diffuse ipsilateral edema” is present. To avoid false-positive ratings, edema should not be rated positive if there is a history of ipsilateral radiation therapy or breast surgery during the last 12 months.25

Blooming Sign. The blooming sign is defined as a lesion with smooth margins 1 minute after contrast application that becomes increasingly blurry during the following scans. If absent, it is a typical feature of benign lesions.26

Figure 1 summarizes schematic drawings of all detailed descriptors. Further details are beyond the scope of this article and can be found in the cited literature.11,19–26

Statistical Analysis

Univariate Analysis

The prevalence of each descriptor was documented in the database on a categorical scale. Based on the SOR (IP vs M), contingency tables were constructed (2-sided χ2 tests; α ≤ 5%). To control for possible α accumulation, Bonferroni correction was applied. Finally, estimates of diagnostic accuracy were calculated for single descriptors (sensitivity, specificity, positive/negative likelihood ratio [L+/L−], and odds ratio [OR]).
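As a rough illustration of the univariate step, the following Python sketch derives the listed accuracy estimates and a Bonferroni-adjusted χ2 P value from a single 2 × 2 table. The function name and the cell counts in the usage note are hypothetical; the sketch assumes a Pearson χ2 test without continuity correction, which may differ from the exact procedure used in the study.

```python
import math


def descriptor_accuracy(tp: int, fn: int, fp: int, tn: int, n_tests: int = 1) -> dict:
    """Accuracy estimates for one binary descriptor as an indicator of IP.

    tp: IP with descriptor, fn: IP without descriptor,
    fp: M with descriptor,  tn: M without descriptor.
    All four cells must be nonzero, otherwise the ratios are undefined.
    """
    n = tp + fn + fp + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1.0 - spec)
    lr_neg = (1.0 - sens) / spec
    odds_ratio = (tp * tn) / (fn * fp)
    # Pearson chi-squared statistic for a 2 x 2 table (1 degree of freedom).
    chi2 = n * (tp * tn - fn * fp) ** 2 / (
        (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    )
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Bonferroni correction: multiply by the number of tests, cap at 1.
    p_adjusted = min(1.0, p_value * n_tests)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "odds_ratio": odds_ratio,
        "chi2": chi2,
        "p_adjusted": p_adjusted,
    }
```

For example, `descriptor_accuracy(43, 40, 80, 568, n_tests=17)` uses made-up counts for 83 IP and 648 malignant lesions; they are not taken from the study's tables.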

Multivariate Analysis

A decision tree was designed (CHAID: CHi-squared Automatic Interaction Detection). This allows intuitive multidimensional assessment of complex data.19 The CHAID tree was used to identify a useful combination of descriptors for the assessment of IP. The Bonferroni method was again used to adjust significance values. The minimum number of cases in the “Parent” and “Child” nodes was set to 5 (Fig. 2).

In the next step, logistic regression was performed to assess the overall accuracy of MRM for the detection and differentiation of IP. Categorical covariates were coded with indicator contrasts, using the last category as the reference.
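The split-selection idea behind the CHAID tree described above can be sketched in a few lines of Python. This is a simplified illustration, not the software procedure used in the study: it assumes binary descriptors only, so the category-merging step of CHAID (and the Bonferroni multiplier tied to it) is omitted, and the names `best_split`, `is_ip`, and `min_child` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def chi2_p(a: int, b: int, c: int, d: int) -> float:
    """P value of the Pearson chi-squared test for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # an empty margin carries no information
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2.0))


def best_split(
    cases: List[Dict[str, bool]],
    descriptors: Sequence[str],
    outcome: str = "is_ip",
    min_child: int = 5,
) -> Optional[Tuple[str, float]]:
    """Return the descriptor with the smallest P value, or None if no split is allowed."""
    best: Optional[Tuple[str, float]] = None
    for name in descriptors:
        present = [row for row in cases if row[name]]
        absent = [row for row in cases if not row[name]]
        # Enforce the minimum child-node size before testing the split.
        if len(present) < min_child or len(absent) < min_child:
            continue
        ip_present = sum(1 for row in present if row[outcome])
        ip_absent = sum(1 for row in absent if row[outcome])
        p = chi2_p(
            ip_present, len(present) - ip_present,
            ip_absent, len(absent) - ip_absent,
        )
        if best is None or p < best[1]:
            best = (name, p)
    return best
```

A full tree would apply `best_split` recursively to each child node until no descriptor reaches significance or the node-size limit is hit.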

FIGURE 1. Summary of detailed descriptors for the analysis of IP. A shows a healthy right breast in the axial orientation. Note the absence of a lesion and the smooth skin. A T1w and T2w hypointense line dorsal to the mammilla can also be delineated (“intact nipple line”). On the contralateral left side, there is marked “skin thickening.” In B, an oval mass is present within the right breast. It shows one singular irregularity, similar to a spicule, within an otherwise smooth-bordered lesion. This finding is consistent with a positive “root sign.” On the contralateral left side, there is a small lesion dorsal to the mammilla. It shows a “destroyed nipple line,” a feature suggestive of a malignant growth pattern.11,19–26 C summarizes the assessment of vessels. Classic findings include diffuse hypervascularity of the affected breast (right: “prominent ipsilateral vessels”) or vessels leading directly to the tumor (left: “adjacent vessel sign”). D, If a T2w hypointense lesion shows a T2w hyperintense center, the “necrosis sign” is positive (right breast). If a lesion presents with a spiculated dendrite, arising from the lesion's center and showing direct connection to the pectoral muscle, the “hook sign” is present (left breast). E illustrates the assessment of edema patterns. Classic findings include diffuse T2w hyperintensity within the affected breast (right: “diffuse ipsilateral edema”). With only a T2w hyperintensity in the area surrounding the lesion, “perifocal edema” is present (left breast). F summarizes the assessment of the “blooming sign.” The latter is a mixed dynamic and morphologic criterion. The blooming sign is negative if either the lesion’s margins stay constantly sharp during the whole dynamic phase [F.1], or “unsharp margins” are already present at the first postcontrast scan [F.3]. The blooming sign is positive if a lesion shows sharp margins 1 minute after contrast application, becoming increasingly blurry during the dynamic scan [F.2].


Backward feature selection was performed to identify significant and independent covariates and to limit overfitting (criteria for entry/removal: P < 0.05/>0.10). The quality of the model was assessed by receiver operating characteristics, AUC (area under the receiver operating characteristic curve, 95% confidence intervals [CIs]), and Nagelkerke R2.
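The two model-quality measures named above can be computed from a fitted model's predicted probabilities as in the following Python sketch. It is a generic illustration under stated assumptions: the function names are hypothetical, the AUC is obtained by the pairwise (Mann-Whitney) formulation, and the predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import math
from typing import Sequence


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve as the share of correctly ranked pairs.

    Each (positive, negative) pair counts 1 if the positive case scores
    higher, 0.5 for a tie, and 0 otherwise.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def log_likelihood(y: Sequence[int], prob: Sequence[float]) -> float:
    """Bernoulli log-likelihood of outcomes y under predicted probabilities."""
    return sum(math.log(p) if yi else math.log(1.0 - p) for yi, p in zip(y, prob))


def nagelkerke_r2(y: Sequence[int], prob: Sequence[float]) -> float:
    """Nagelkerke R2: Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    base_rate = sum(y) / n
    ll_null = log_likelihood(y, [base_rate] * n)  # intercept-only model
    ll_model = log_likelihood(y, prob)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

In practice both values, together with the confidence interval of the AUC, would come from the statistics package used for the regression; the sketch only shows what the numbers represent.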

RESULTS

Characterization of Patients and Subgroups

Seven hundred thirty-one patients were included in the analysis. Eighty-three were diagnosed with IP (mean age, 56.7 years [range, 26–78 years]). There were 648 patients diagnosed with breast cancer (mean age, 59.0 years [range, 25–91 years]). Invasive ductal cancers were present in 347 (53.5%) and invasive lobular cancers in 108 cases (16.7%). In situ cancers were present in 84 patients (13%). Further subtypes of malignant breast lesions, for example, invasive tubular, medullary, and mucinous cancers, were present in 109 patients (16.8%). Both IP (63.8%) and M (63.0%) typically showed a size between 5 and 20 mm (difference not statistically significant).

Univariate Analysis

Fourteen of 17 descriptors showed a significant association with IP (internal structure: P = 0.01; all other features: P < 0.001). The prevalence of individual descriptors in both subgroups is summarized in Table 1. Odds ratios showed a range between 2.2 and 10.6. High accuracy was identified for homogenous internal enhancement (OR = 7.2; L+ = 4; sensitivity = 50.6%), smooth margin (OR = 7.7; L+ = 4.2; sensitivity = 51.8%), internal septation (OR = 6.8; L+ = 6.2; sensitivity = 9.6%), or high signal intensity on T2w (OR = 6.4; L+ = 5.5; sensitivity = 16.9%). Table 2 summarizes the estimates of diagnostic accuracy for individual descriptors.

Multivariate Analysis

Decision Tree

The CHAID growth method identified 6 MRM descriptors as most significant for stratifying breast lesions into IP and M. Of note, the majority of such descriptors (4/6: 66.7%) belonged to the detailed descriptors. Among all descriptors, the “root sign” was identified as the most significant factor in stratifying the lesion (node 1; P < 0.001). If present, the likelihood of IP decreased by 70%. If “washout” was also present and “homogenous enhancement” was absent, the diagnosis of IP could almost certainly be ruled out (specificity = 98.8%; OR = 74.7; L+ = 39.6). If the “root sign” was absent, the likelihood of IP increased by 143%. If the “blooming sign” was also absent, the likelihood of IP further increased significantly (P < 0.001; specificity = 84.6%; OR = 12.7; L+ = 4.5). If a lesion also showed smooth margins, diagnostic accuracy could be further increased (P < 0.001; specificity = 95.4%; OR = 15.0; L+ = 9.1). Figure 3 summarizes the results of the CHAID tree.
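To give a feel for what a positive likelihood ratio of this size means, the sketch below converts a pretest probability into a posttest probability via odds. It is an illustration only: the function name is hypothetical, and using the study's own case mix (83 IP among 731 lesions) as the pretest probability is an assumption that will not hold for other patient populations.

```python
def post_test_probability(pretest: float, likelihood_ratio: float) -> float:
    """Update a pretest probability with a likelihood ratio using odds."""
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest / (1.0 - pretest)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumed pretest probability: share of IP in the study collective.
pretest_ip = 83 / 731

# L+ values reported for the double and triple descriptor combinations.
double_combination = post_test_probability(pretest_ip, 4.5)
triple_combination = post_test_probability(pretest_ip, 9.1)
```

Under this assumption, the triple combination raises the probability of IP from about 11% to slightly more than 50%, which shows why even a high L+ does not by itself exclude malignancy.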

Logistic Regression

If the MRM descriptors were combined into a multivariate model, significant potential was identified for the differential diagnosis of IP (P < 0.001). Applying backward feature selection, an overall accuracy of AUC = 0.90 was verified (CI = 0.87–0.93;

FIGURE 2. CHi-squared Automatic Interaction Detection tree chart for the step-by-step differential diagnosis of benign IPs versus malignant breast lesions (M). The initial study population (node 0, n = 731; IP: n = 83; M, n = 648) is split into child nodes (nodes 1–12) by independent variables (MRM descriptors) showing the highest discriminatory power on the basis of χ2 statistics. After 2 ramifications, the study collective was split into 6 terminal nodes (nodes 7-12), where no further differentiation could be achieved. Bonferroni-corrected P value was 0.002 (washout?), 0.027 (vessels?), and
