Original article

Multicenter evaluation of renography with an automated physical phantom

Anssi O. Nykänen (a), Pentti J. Rautio (b), Jussi V. Aarnio (a) and Jari O. Heikkinen (a)

Purpose The diversity of the dynamic radionuclide renal imaging (renography) study protocols sets challenges for the overall study quality, therefore raising a need for national quality control. The aim of this study was to encourage the standardization of renography in Finland and to evaluate the development after a previous study performed in 1997.

Methods The new Heikkinen phantom was imaged in each of the 20 participating nuclear medicine laboratories. The results were interpreted in the manner of a regular patient study, and reconstructions and printouts were made according to the clinical routines of each laboratory. Four quantitative parameters were calculated and compared between laboratories. The reports were also assessed in a blind test.

Results The average error in Tmax values ranged from −5 to +7% (−29 to +18% in 1997), in T1/2 from 0 to +35% (−43 to +66%), in RCA20 from −20 to +28% (−50 to +82%) and in relative uptake from −3 to +5%. The difference from average in relative uptake ranged from −4 to +5% (−21 to +36%).

Conclusion The results showed that the errors in Tmax and relative uptake were generally within quite acceptable margins, and the variation in quantitative parameters between laboratories was shown to be smaller than 14 years earlier. The reason might be the use of new software packages as well as increased efforts to improve the quality of the studies. Nucl Med Commun 35:977–984 © 2014 Wolters Kluwer Health | Lippincott Williams & Wilkins.

Nuclear Medicine Communications 2014, 35:977–984

Keywords: audit, multicenter, phantom, quality control, renography

(a) Department of Medical Physics, Etelä-Savo Hospital District, Mikkeli Central Hospital, Mikkeli and (b) Department of Clinical Physiology, North Karelia Central Hospital, Joensuu, Finland

Correspondence to Jari O. Heikkinen, PhD, Department of Nuclear Medicine, Etelä-Savo Hospital District, Mikkeli Central Hospital, FIN-50100 Mikkeli, Finland Tel: + 358 44 351 2452; e-mail: [email protected]

Introduction

The availability of different acquisition systems, analysis software and multiple quantitative parameters makes for diverse ways of performing dynamic radionuclide renal imaging (renography). The study protocols also vary between facilities [1,2], which raises the need for regular interlaboratory quality control to assure the homogeneity of diagnoses and quality of the entire imaging chain.

The quality of many individual parts of the imaging procedure is easy to control. Gamma cameras and other equipment can be monitored using their own set of quality control procedures, and image quantification can be evaluated using patient data [1,3] or software and hybrid phantoms [4,5]. These methods, however, have their limitations. When using patient images for quality control, the true values of studied parameters cannot be known, and thus the accuracy of the measurements cannot be evaluated. Software and hybrid phantoms avoid this particular problem; however, measurements taken in this manner exclude the evaluation of data collection procedures and equipment. As for the quality control protocols for imaging equipment, they often differ between manufacturers and are primarily designed to measure the technical performance of the gamma camera. Thus, these protocols do not necessarily reflect what is adequate for renography.

With contributions from the following hospitals: Joensuu, Kuopio, Vaasa, Hämeenlinna, Seinäjoki, Mikkeli, Kokkola, Lahti, Turku, Helsinki, Jorvi, Jyväskylä, Savonlinna, Lappeenranta, Kotka, Tampere, Pori, Kemi, Kajaani, Rovaniemi.

0143-3636 © 2014 Wolters Kluwer Health | Lippincott Williams & Wilkins

Received 14 December 2013 Revised 19 March 2014 Accepted 13 May 2014

DOI: 10.1097/MNM.0000000000000153

Copyright © Lippincott Williams & Wilkins. Unauthorized reproduction of this article is prohibited.

Nuclear Medicine Communications 2014, Vol 35 No 9

The quality of renography can be controlled by regularly evaluating the whole imaging chain from referral to diagnosis with a dynamic physical phantom [6–10], with which identical patient-like renography data can be acquired by different facilities. Using this evaluation method, the true values of the measured parameters can be acquired. Therefore, the whole imaging process can be evaluated and different sources of error identified [2]. Audits of this kind should be made regularly in addition to the normal quality control protocols to assure the homogeneity of imaging and image quality between laboratories.

Heikkinen has developed two versions of a physiological dynamic renal phantom [6,7], the latter being almost completely automated. These phantoms are based on moving steel and lead attenuators that regulate the radiation reaching the camera from constant gamma ray sources. Other similar phantom constructions, based on the flow of the radioactive liquid and permeable membranes, have been reported [8–10].

The previous Finnish renal phantom audit [2] in 1997 showed that nuclear renal imaging in Finland is somewhat heterogeneous, and thus needs regular quality assurance. The large variations noted in the quantitative parameters in the previous audit were believed to be caused by varying protocols. This second audit was executed to evaluate possible development in Finnish nuclear renal imaging since the first audit. In this study, a new automated phantom [6] was used. Being automated, the phantom had almost no need for a human operator during measurements, and it therefore reduced the risk for human error and provided more consistent images compared with the previous version.

The results of this audit showed that there has been considerable unification in analysis software throughout Finland, and the consistency of the parameter calculation has improved. The consistency and accuracy of most quantitative parameters and the results thereof have also increased. Acquisition protocols, however, were shown to still be very heterogeneous.

Methods

The phantom

The phantom used in this study was developed by Heikkinen [6]. It consisted of containers filled with technetium-99m solution that simulated the kidneys, bladder, heart, liver and soft tissues. The functions of the organs were produced with various attenuators (Fig. 1). The kidney function was simulated by removing and adding steel plates between the radioactive kidney containers and the gamma camera. Blood activity was simulated using a rotating lead plate with varying attenuation between the heart container and the gamma camera. The filling of the bladder was simulated by a lead plate, which gradually revealed the bladder container to the camera. To create patient-like background activity in the images, the phantom also had technetium-99m containers simulating the soft tissues and the liver.

Fig. 1. The phantom in operation. (1) Container of the left kidney, (2) steel attenuator plates simulating kidney function, (3) bladder container, (4) rotating lead attenuator plate simulating heart activity and (5) lead plate simulating body uptake.

Three simulations were programmed in the phantom. The first simulated normal kidney functionality, the second obstruction and the third dilation of the pelvis. Theoretical time–activity curves of the kidneys in each simulation are shown in Fig. 2. Figure 3 shows an image series of the third simulation.

The containers of the phantom were filled with the same activities (Table 1) for each acquisition, and varying kidney containers were used for different simulations. The liver activity was reduced manually during simulation 1 to simulate physiological liver activity. The amount of radioactive liquid was reduced at a rate of 2 ml/min from 80 to 40 ml during the first 20 min of the simulation. In the other two simulations, the background around the kidneys remained constant. In the third simulation, an additional pelvis accumulation was simulated using a balloon catheter under the left kidney. The balloon was filled manually during the acquisition at a rate of 0.2 ml/min from 4 to 24 min. All other functions of the phantom were automated.

Study design

All Finnish nuclear medicine departments were invited to participate in the audit. In each department, the gamma camera that was most commonly used for renography was used in the study. An option for multiple cameras per department was available. Only one data set was studied in the audit; however, independent interlaboratory quality control using the phantom data was encouraged.

The audit was executed by imaging three phantom simulations in each participating laboratory. Each laboratory used its own standard imaging protocols. The participants calculated quantitative parameters and provided medical diagnoses of the medical states simulated by the phantom. The raw data, calculated parameters and diagnoses were collected and studied by the authors of this study. The studied parameters were time to reach maximum activity (Tmax), time to reach half activity from maximum activity (T1/2), percentage of maximum kidney activity remaining at 20 min (RCA20) and relative uptake. In addition, information on the acquisition system and the analysis software used was requested. The camera systems and analysis software packages used are listed in Table 2 along with the software used in the previous audit. Individual feedback was also provided to the participants so that they could compare their results with the anonymous results of other participants and the true values of the parameters.

True values

The true value of the relative function was determined by calculating the weighted integral of the theoretical renal curves of the kidney containers over a time interval of 1–2 min. Renal curves were acquired from the programmed timing of the attenuator plates. The activities of the kidney containers were measured without the attenuators. Other true values were determined from the timing of the attenuators.

Fig. 2. Theoretical renal curves of the simulations (three panels, simulations 1 to 3; left and right kidney activity plotted against time). The curve does not show the activity of the balloon, which simulates the pelvis of the left kidney in simulation 3.

Consistency measurements

The consistency of the data provided by the phantom was measured by the authors before the audit. Five data acquisitions were made by the same person using an identical phantom and gamma camera setup. The phantom was prepared with the same activity concentrations as in the audit, and special care was taken in phantom placement and camera operation to produce data that were as consistent as possible. These data were then analyzed using the same method that was later used in the centralized analysis.

Analysis of the results

The data were assessed by calculating the parameters for each data set. An analysis method designed to reduce human-based variation was used in the centralized analysis. The results from the analysis performed by the authors of this study and the results from the participants were compared using Student’s t-test for two correlated samples to determine the variation caused by the different analysis protocols and the person performing the analysis. The errors of different methods for calculating relative uptake (Patlak and integral) were also compared when results for both methods were provided. Student’s t-test for two correlated samples was used in this comparison as well. Finally, all results were compared with the results of consistency measurements of the phantom to separate normal phantom-based variation from true image-based or analysis-based variation.

A nuclear medicine physician who was familiar with the phantom and the simulations evaluated the diagnostic reports provided by the hospitals in a blind test. Each report was given points from −5 to 39 (Table 3). The results of the evaluation of the reports were distributed to the participants as feedback, along with instructions on how to improve the quality of reporting.
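The curve-based parameters compared in this audit (Tmax, T1/2 and RCA20) can be derived from a kidney time–activity curve roughly as sketched below. This is an illustration only, not the centralized analysis method or any vendor software: the function names, the choice to measure T1/2 from the peak, and the use of linear interpolation are assumptions. The sample curve is synthetic and was chosen so that the output matches the true values listed for simulation 1 in Table 4.

```python
def _interp(times_s, counts, t):
    """Linearly interpolate the sampled curve at time t (seconds)."""
    if not times_s[0] <= t <= times_s[-1]:
        raise ValueError("t lies outside the sampled interval")
    for i in range(1, len(times_s)):
        if t <= times_s[i]:
            t0, t1 = times_s[i - 1], times_s[i]
            frac = (t - t0) / (t1 - t0)
            return counts[i - 1] + frac * (counts[i] - counts[i - 1])


def renogram_parameters(times_s, counts):
    """Return (tmax_min, t_half_min, rca20_percent) for one kidney curve.

    times_s: sample times in seconds, strictly increasing.
    counts:  background-corrected kidney counts at those times.
    t_half_min is None if the curve never falls to half of its maximum.
    """
    if len(times_s) != len(counts) or len(times_s) < 2:
        raise ValueError("need two equally long sequences with >= 2 samples")

    i_max = max(range(len(counts)), key=lambda i: counts[i])
    peak = counts[i_max]
    if peak <= 0:
        raise ValueError("curve has no positive maximum")
    tmax_min = times_s[i_max] / 60.0

    # T1/2 (assumed to be measured from the peak): first crossing of half
    # the peak value, located by linear interpolation between samples.
    t_half_min = None
    for i in range(i_max + 1, len(counts)):
        if counts[i] <= peak / 2.0:
            t0, t1 = times_s[i - 1], times_s[i]
            c0, c1 = counts[i - 1], counts[i]
            t_cross = t0 + (c0 - peak / 2.0) * (t1 - t0) / (c0 - c1)
            t_half_min = (t_cross - times_s[i_max]) / 60.0
            break

    # RCA20: activity at 20 min as a percentage of the maximum activity.
    rca20 = 100.0 * _interp(times_s, counts, 20 * 60.0) / peak
    return tmax_min, t_half_min, rca20


# Synthetic curve (hypothetical numbers, not audit data).
times = [0, 60, 120, 180, 390, 1200, 1800]
curve = [0, 300, 600, 900, 450, 189, 100]
print(renogram_parameters(times, curve))  # (3.0, 3.5, 21.0)
```

For a kidney whose activity never decreases, as in the left kidney of simulations 2 and 3, this sketch returns None for T1/2, which mirrors why those values could not be calculated.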

Results

A total of 21 gamma cameras in 20 hospitals were evaluated for the study. Tmax values were provided from 20 cameras, T1/2 from 11, RCA20 from 14 and relative function from all 21 cameras. Three of the 20 hospitals used only the Patlak plot for determining the relative function, five used the integral method and 12 used both the Patlak plot and the integral method. In 11 cases, the parameters were calculated by technicians and in 10 cases by physicists. A diagnostic report was provided from 20 acquisitions, one of which provided two.

True values

The calculated true values for the simulations are shown in Table 4. Some values for the left kidney in the second and third simulations could not be calculated because of abnormal kidney function (the activity never decreases) and the unknown count rate of the balloon that simulated the renal pelvis in the third simulation.

Table 1. Volumes and activities of phantom containers

| Container | Volume (ml) | Activity (MBq) |
|---|---|---|
| Background | 590 | 11.4 |
| Liver | 80 | 1.82 |
| Heart | 11 | 1.36 |
| Bladder | 25 | 18.2 |
| Kidney, left 1 | 180 | 22.7 |
| Kidney, right 1 | 180 | 30.0 |
| Kidney, left 2 | 180 | 59.1 |
| Kidney, right 2 | 180 | 30.0 |
| Kidney, left 3 | 180 | 14.8 |
| Kidney, right 3 | 180 | 30.0 |
| Pelvis | 2 | 1.36 |

Table 4. Calculated true values for numerical parameters

| Simulation | Kidney | Tmax (min) | T1/2 (min) | RCA20 (%) | Relative uptake (%) |
|---|---|---|---|---|---|
| 1 | Left | 3.0 | 3.5 | 21 | 46.7 |
| 1 | Right | 3.0 | 3.5 | 21 | 53.3 |
| 2 | Left | – | – | – | 28.9 |
| 2 | Right | 3.2 | 5.3 | 23 | 71.1 |
| 3 | Left | – | – | – | 43.1 |
| 3 | Right | 3.8 | 2.9 | 23 | 57.9 |

RCA20, maximum kidney activity remaining at 20 min.

Fig. 3. Example image series of simulation 3. The balloon simulating the left renal pelvis is shown as increasing activity in the middle of the images.

Quantitative parameters

Figure 4 shows the average errors in the simulations for each hospital and parameter for the 2011 and 1997 audits. The signs of the errors have been removed in the figure. The average error in Tmax values in the current study ranged from −5 to +7% (−29 to +18% in the 1997 audit), in T1/2 from 0 to +35% (−43 to +66% in 1997), in RCA20 from −20 to +28% (−50 to +82% in 1997) and in relative uptake from −3 to +5%. The errors of relative uptake in the 1997 audit are related to the mean of the results in that study as no true value is reported.

The difference between the mean value of the results and the true value for each parameter and their respective true values and SDs is shown in Table 5. Tmax, T1/2 and RCA20 values for the left kidney in the second and third simulations are not presented, because the activity curve for those kidneys increases during the whole simulation. For the relative uptake only one value is given for each simulation. The difference between the errors of centralized analysis and the analysis executed in hospitals was statistically significant in the integral method for relative uptake (P < 0.01), whereas the difference was insignificant in other parameters.

Table 2. Acquisition systems and analysis software of hospitals

| Hospital | Gamma camera | Analysis software 2011 | Analysis software 1997 |
|---|---|---|---|
| 1 | GE Infinia Hawkeye | Hermes | Hermes |
| 2 | GE Infinia Hawkeye | Hermes | – |
| 3 | GE Infinia Hawkeye | Hermes | Adac |
| 4 | Philips IRIX | Hermes | Hermes |
| 5 | Siemens E-Cam Duet | Hermes | Hermes |
| 6 | GE Infinia Hawkeye | Xeleris | Toshiba |
| 7 | Siemens E-Cam | Hermes | Hermes |
| 8 | Siemens E-Cam | Hermes | Hermes |
| 9 | Siemens Symbia | Hermes | Gamma-11 |
| 10 | Siemens E-Cam | E-Soft | Gamma-11 |
| 11 | GE Infinia Hawkeye | Xeleris | Toshiba |
| 12 | Siemens E-Cam | Hermes | – |
| 13 | Philips Adac skylight | Hermes | Toshiba |
| 14 | GE Infinia Hawkeye | Hermes | Toshiba |
| 15 | Siemens E-Cam | Hermes | – |
| 16 | Philips adac Forte | Hermes | – |
| 17 | Siemens E-Cam | Hermes | Elscint |
| 18 | GE Infinia Hawkeye | Hermes | Siemens |
| 19 | Siemens Symbia | Hermes | – |
| 20 | GE Infinia Hawkeye | Hermes | Hermes |
| 21 | GE Infinia Hawkeye | Hermes | Hermes |

Table 3. Point categories for assessing the diagnostic reports

| Category | Points |
|---|---|
| Radiopharmaceutical and given dose | 0–2 |
| Diuretic and given dose | 0–2 |
| Patient position | 0–1 |
| Reporting of parameters | −2 to 4 |
| Anatomy | 0–6 |
| Assessment | 0–4 |
| Postvoid image | 0–1 |
| Structural logicality | 0–4 |
| Logicality of conclusions | 0–4 |
| Correctness of conclusions | 0–2 |
| Clear language | 0–4 |
| Layout, division into paragraphs, etc. | 0–2 |
| Others | −3 to 3 |
| Total | −5 to 39 |

Relative uptake calculation methods

In the first simulation, the Patlak method provided more accurate results for relative uptake than the integral method (P < 0.01), whereas in the other two simulations the integral method provided results closer to the true value (both P < 0.01). This phenomenon was observed in both hospital results and centralized results. Coefficients of variation for the audit results are shown in Fig. 5. Values are shown for hospital results, centralized results and for five consistency measurements acquired before and after the audit.

Background subtraction

Lateral background correction regions of interest (ROIs) were used in 14 of 21 analyses and inferolateral in five analyses. The last two data sets (hospitals 5 and 19) were analyzed in the same department and did not use background correction at all. The ROIs used varied greatly in size and angle. None were positioned directly inferior to the kidney, but some facilities used small inferolateral ROIs to avoid drawing the ROI over the liver (Fig. 6a), whereas others used ROIs drawn 180° around the kidney from the top to the bottom (Fig. 6b).
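Area-normalised background subtraction followed by the integral method for relative uptake can be sketched as follows. This is a simplified illustration under stated assumptions, not the protocol of any participating laboratory: the function names, ROI pixel counts and count values are hypothetical, the frames are assumed to be of equal duration, and a plain (unweighted) sum over the 1–2 min window stands in for the integral.

```python
def subtract_background(kidney_counts, bg_counts, kidney_pixels, bg_pixels):
    """Subtract background counts scaled to the kidney ROI area, per frame."""
    if bg_pixels <= 0:
        raise ValueError("background ROI must contain at least one pixel")
    scale = kidney_pixels / bg_pixels
    return [k - scale * b for k, b in zip(kidney_counts, bg_counts)]


def relative_uptake(left, right, frame_starts_s, window_s=(60.0, 120.0)):
    """Return (left %, right %) from counts summed over the time window.

    left, right: background-corrected counts per frame.
    frame_starts_s: start time of each frame in seconds.
    window_s: integration window; 1-2 min is assumed here.
    """
    lo, hi = window_s
    idx = [i for i, t in enumerate(frame_starts_s) if lo <= t < hi]
    if not idx:
        raise ValueError("no frames fall inside the integration window")
    left_sum = sum(left[i] for i in idx)
    right_sum = sum(right[i] for i in idx)
    total = left_sum + right_sum
    if total <= 0:
        raise ValueError("no net kidney counts in the window")
    return 100.0 * left_sum / total, 100.0 * right_sum / total


# Hypothetical 20 s frames; none of these numbers are audit data.
starts = [0, 20, 40, 60, 80, 100, 120]
background = [25] * 7                          # counts in a 50-pixel ROI
left_raw = [100, 300, 400, 500, 600, 700, 750]
right_raw = [150, 400, 600, 800, 900, 1100, 1150]

left_net = subtract_background(left_raw, background, 200, 50)
right_net = subtract_background(right_raw, background, 200, 50)
print(relative_uptake(left_net, right_net, starts))  # (37.5, 62.5)
```

Because the background term is scaled by ROI area, the size and placement of the background ROI feed directly into the net counts, which is one way the ROI differences described above could influence relative uptake.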

Diagnostic reports

The points given for the diagnostic reports are presented in Fig. 7. The mean results were 17.0 (range 4.5–24.75) for simulation 1, 23.0 (range 8–35) for simulation 2 and 20.3 (range 10–27.25) for simulation 3. The maximum number of points for each simulation was 39.
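As a consistency check on the scoring scheme, the category ranges in Table 3 can be summed to confirm the overall range of −5 to 39 points. The sketch below only restates the values from Table 3; the names `POINT_RANGES` and `total_range` are introduced here for illustration.

```python
# (minimum, maximum) points per category, as listed in Table 3.
POINT_RANGES = {
    "Radiopharmaceutical and given dose": (0, 2),
    "Diuretic and given dose": (0, 2),
    "Patient position": (0, 1),
    "Reporting of parameters": (-2, 4),
    "Anatomy": (0, 6),
    "Assessment": (0, 4),
    "Postvoid image": (0, 1),
    "Structural logicality": (0, 4),
    "Logicality of conclusions": (0, 4),
    "Correctness of conclusions": (0, 2),
    "Clear language": (0, 4),
    "Layout, division into paragraphs, etc.": (0, 2),
    "Others": (-3, 3),
}


def total_range(ranges):
    """Sum the per-category minima and maxima."""
    lowest = sum(lo for lo, _ in ranges.values())
    highest = sum(hi for _, hi in ranges.values())
    return lowest, highest


print(total_range(POINT_RANGES))  # (-5, 39)
```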

Discussion

The aim of this study was to evaluate the consistency and accuracy of gamma camera renography in Finland. Dynamic renal phantom data were acquired and analyzed in participating hospitals, and the true values of the parameters were known. The results were then compared with the previous renal audit in Finland [2]. The results showed an increase in accuracy and consistency since the previous audit in most parameters, although the consistency of T1/2 and RCA20 still varied and T1/2 results were in most cases larger than the true value. As the imaging systems have improved since 1997, it was difficult to assess the effect of the feedback from the last audit. The improvement in results could be due to new equipment, improved protocols or both. To make comparison easier, audits should be carried out more regularly.

Fig. 4. Average errors of Tmax, T1/2, RCA20 and relative uptake of each hospital (four bar-chart panels; x-axis: hospital, y-axis: average error in %). The grey bars represent the errors in the current study and the black bars are the errors of corresponding hospitals in the previous study. RCA20, maximum kidney activity remaining at 20 min.

Table 5. Differences between true values and mean values of results and SDs of results

| Parameter | Kidney | True value | Hospital results: difference | Hospital results: SD | Centralized results: difference | Centralized results: SD |
|---|---|---|---|---|---|---|
| Tmax (min) | Left 1 | 3.2 | 0.0 | 0.1 | 0.1 | 0.1 |
| Tmax (min) | Right 1 | 3.2 | 0.1 | 0.1 | 0.0 | 0.1 |
| Tmax (min) | Left 2 | – | – | – | – | – |
| Tmax (min) | Right 2 | 4.0 | 0.1 | 0.1 | 0.1 | 0.1 |
| Tmax (min) | Left 3 | – | – | – | – | – |
| Tmax (min) | Right 3 | 4.4 | 0.0 | 0.1 | 0.0 | 0.1 |
| T1/2 (min) | Left 1 | 3.5 | 0.1 | 0.4 | 0.1 | 0.3 |
| T1/2 (min) | Right 1 | 3.5 | 0.7 | 0.7 | 0.6 | 0.5 |
| T1/2 (min) | Left 2 | – | – | – | – | – |
| T1/2 (min) | Right 2 | 4.8 | 0.7 | 0.6 | 0.5 | 0.7 |
| T1/2 (min) | Left 3 | – | – | – | – | – |
| T1/2 (min) | Right 3 | 2.3 | 0.3 | 0.7 | 0.2 | 0.8 |
| RCA20 (%) | Left 1 | 21 | 1.7 | 2.3 | 0.7 | 2.3 |
| RCA20 (%) | Right 1 | 21 | 3.2 | 2.5 | 3.3 | 2.6 |
| RCA20 (%) | Left 2 | – | – | – | – | – |
| RCA20 (%) | Right 2 | 23 | 1.9 | 6.6 | 2.0 | 4.9 |
| RCA20 (%) | Left 3 | – | – | – | – | – |
| RCA20 (%) | Right 3 | 23 | 3.3 | 6.2 | 2.9 | 6.8 |
| Relative uptake, Patlak (%) | Left 1 | 46.7 | 3.6 | 3.0 | 3.6 | 3.0 |
| Relative uptake, Patlak (%) | Left 2 | 28.9 | 6.0 | 4.0 | 4.7 | 3.2 |
| Relative uptake, Patlak (%) | Left 3 | 42.1 | 8.6 | 3.9 | 9.6 | 4.3 |
| Relative uptake, integral (%) | Left 1 | 46.7 | 7.3 | 2.7 | 5.6 | 2.0 |
| Relative uptake, integral (%) | Left 2 | 28.9 | 2.5 | 4.3 | 0.2 | 1.7 |
| Relative uptake, integral (%) | Left 3 | 42.1 | 7.1 | 3.5 | 5.3 | 2.6 |

RCA20, maximum kidney activity remaining at 20 min.

Fig. 5. Coefficient of variation (CV, %) values of five parameters (Tmax, T1/2, RCA20, relative uptake by the integral method and relative uptake by the Patlak method) for hospital results, centralized results and consistency measurements. RCA20, maximum kidney activity remaining at 20 min.

Fig. 6. (a) Inferolateral background subtraction ROI drawn avoiding the liver. (b) Long lateral ROI drawn over 180° around the kidney. ROI, region of interest.

Some of the parameters were not calculated in all laboratories involved in the study because the laboratories were asked to use their normal procedures. Their image sets and calculated parameters were thus provided as in a study performed with real patients. This was done to better assess the variation in the final and overall results. If the laboratories had been provided with a list of required parameters, it could have given us more data, but it could also have affected the analysis protocols. The laboratories might have deviated from their standard protocol, for example by using physicists instead of technicians to calculate the parameters or by using newly customized printouts.

This approach was also adopted when determining the true value of relative uptake. We concluded that the relative uptakes were most accurately described by the weighted integral from 1 to 2 min. We could have calculated different true values for different hospitals according to the integral timing of their protocol, to better assess the quality of their image processing. Instead, we chose to set the true value constant to better compare the difference in final results of the hospitals.

The 7% error in Tmax results corresponds to a 13 s error in the measured parameter. This error is quite small and could be related to different procedures in which the time between injecting the tracer and starting the acquisition differed by several seconds. In relative uptake, the error of 5% corresponds to 1.5–2.5 percentage points of error in the calculated value, which should be considered normal variation [11]. However, the errors of 35% in T1/2 and of 28% in RCA20, corresponding to 70–100 s and 6.5 percentage points, respectively, could be significant to the diagnosis.

The centralized results for relative uptake calculated using the integral method were on average 1.9 percentage points closer to the true value than the results calculated in hospitals (P
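The comparison between hospital and centralized results rests on Student's t-test for two correlated samples. A minimal sketch of the test statistic, using only the Python standard library, is given below. The function name `paired_t_statistic` and the sample values are hypothetical and are not audit data; the P value would be obtained by comparing the statistic with a t distribution with n − 1 degrees of freedom, which is not done here.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for two correlated samples.

    sample_a[i] and sample_b[i] must refer to the same data set, for
    example the hospital error and the centralized error for one camera.
    """
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two equally long samples with >= 2 pairs")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    if sd_diff == 0:
        raise ValueError("all differences are equal; t is undefined")
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical absolute errors (percentage points) for four data sets.
hospital_errors = [7.0, 3.0, 8.0, 6.0]
centralized_errors = [5.0, 2.0, 5.0, 4.0]
t_value, dof = paired_t_statistic(hospital_errors, centralized_errors)
print(round(t_value, 3), dof)  # 4.899 3
```

Pairing the samples by data set removes the between-camera variation from the comparison, so only the differences introduced by the analysis step contribute to the statistic.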
