A Systematic Review of Reporting in Randomized Controlled Trials in Dermatologic Surgery: Jadad Scores, Power Analysis, and Sample Size Determination

Murad Alam, MD, MSCI,*†‡ Mutahir Rauf, BA,* Sana Ali, BS,* Michael Nodzenski, BA,* and Kira Minkis, MD, PhD*

BACKGROUND Dermatologic surgery is a fruitful research area that has spawned numerous randomized control trials (RCTs).

OBJECTIVE To assess the quality of reporting of randomization, blinding, sample size, and power analysis in RCTs published in the journal Dermatologic Surgery.

MATERIALS AND METHODS Randomized control trials published in Dermatologic Surgery between 1995 and 2012 were assessed regarding the quality of trial reporting. Data extraction was performed independently by 2 data extractors.

RESULTS Dramatic increases in the numbers of RCTs in dermatologic surgery were noted in successive 5-year periods, from 39 in 1995 to 1999 to 66 in 2000 to 2004 and 131 in 2005 to 2009. The median Jadad score was 1 for articles from 1995 to 1999 and 2 for articles since 2000. The median number of subjects per study was 20 during 1995 to 1999, 25.5 from 2000 to 2004, and over 30 since 2005. Power analysis with sample size determination was reported in no articles during 1995 to 1999 and in 13% to 17% of articles since 2005. Alpha level was specified for 37% of RCTs from 1995 to 1999 and 64% to 70% since 2005.

CONCLUSION During the last 20 years, the number of RCTs in Dermatologic Surgery has grown rapidly, almost doubling every 5 years. The number of subjects per study has also increased, and the quality of reporting has significantly improved.

The authors have indicated no significant interest with commercial supporters.

Although many types of studies of procedural interventions are valuable, randomized controlled trials (RCTs) are believed to be unusually effective at eliminating unknown confounders. Although effect sizes may be relatively smaller in RCTs, their magnitude is often considered more credible because of the rigorous methodology. In dermatologic surgery, RCTs are valuable for identifying effective treatments, safer techniques, and interventions that are more comfortable or better tolerated by patients. Randomized control trials can be placebo-controlled, or they can compare 2 or more active interventions. The process of comparing alternative interventions for the same indication, subsumed within so-called comparative effectiveness research, is believed to be particularly helpful in maximizing the quality of outcomes while minimizing resource utilization and adverse events.

Although there has been a proliferation of RCTs in the field of dermatologic surgery in recent years, these RCTs have not been systematically studied with regard to quality.1–3 In particular, the degree to which RCTs in dermatologic surgery adhere to standard reporting recommendations for RCTs has not been assessed. The purpose of this study is to use standard measures to assess the quality of

Departments of *Dermatology, †Otolaryngology, and ‡Surgery, Feinberg School of Medicine, Northwestern University, Chicago, Illinois


© 2014 by the American Society for Dermatologic Surgery, Inc. Published by Lippincott Williams & Wilkins ISSN: 1076-0512 Dermatol Surg 2014;40:1299–1305 DOI: 10.1097/DSS.0000000000000166


Copyright © American Society for Dermatologic Surgery. Unauthorized reproduction of this article is prohibited.

RCTS IN DERMATOLOGIC SURGERY

dermatologic surgery RCTs and trace the evolution of these trials over time.

Methods

Search Criteria and Article Selection

A PubMed search conducted on March 15, 2013 limited articles returned to “randomized control trials” on “Humans” published between “1995 and 2012” in the journal “Dermatologic Surgery.” A manual review by 2 independent data extractors was used to confirm that studies from the automated search were correctly classified as RCTs.

Study Procedures (Jadad Score)

The remaining studies were then evaluated for specific reporting elements considered useful in randomized control trials. Specifically, so-called Jadad scores were computed for each study based on the Jadad scale, a validated three-question measure designed to rate the methodologic quality of RCTs.4 The Jadad scale assigns a value of 0 to 5 to RCTs, conferring up to 2 points to studies that mention randomization (1 point) and do it appropriately (another point); up to 2 points to studies that mention blinding and perform it appropriately; and an additional point to studies that provide reasons for withdrawals. Although some have suggested that the Jadad score is a simplistic measure that does not characterize all elements of trial quality, it was used in this study for at least 3 reasons: (1) it is perhaps the most common measure of trial quality across medical and surgical disciplines and subspecialties, with almost 400 publications using this measure indexed on PubMed as of 2014; (2) it is efficient, and hence practical, to apply Jadad when evaluating a very large number of studies, as was the case in the current investigation; and (3) it offers the prospect of objectivity, as opposed to more subjective methods, which may have other benefits but are more subject to rater bias.

Study Procedures (Power Analysis and Sample Size)

In addition to computing the Jadad score, we determined the extent to which the following key statistical planning elements were included in the Methods section: power analysis with sample size calculation; effect size defined for power calculation; post hoc power analysis; and specification of alpha level. The number of subjects per study was also recorded.

Data Extraction Protocol

Data extraction was performed independently by 2 data extractors. Disagreements were resolved by a process of forced agreement between the 2 initial raters, with the result further approved by a board-certified dermatologist not involved in initial data extraction.


In some cases, there was a lack of clarity regarding methodology in the published written report of a given study. For instance, information about randomization protocols or computed sample size may not have been included in one portion of the Methods section, but rather may have been dispersed across several parts of the article. Or, in another common case, information pertinent to power or randomization procedures was communicated using nonstandard language and was not entirely clear. In cases such as these, the 2 raters expended reasonable efforts to piece together what procedures had actually been performed in the trial, and the raters worked together to clarify apparent inconsistencies and arrive at an interpretation that they both considered valid. The logic motivating this occasionally cumbersome process of reconstruction and clarification derived from the observation, often cited in the clinical trials literature, that if the written report of a trial does a poor job of explaining the specific methods used, it does not follow that appropriate methods were omitted or not used, and some effort should be undertaken to uncover the true methodology.

Statistical Analysis

All outcomes were analyzed descriptively. Frequencies of key features were provided for all studies, as well as for studies by year groupings (1995–1999; 2000–2004; 2005–2009; 2010–2012). Medians and ranges were also provided for Jadad scores and subject numbers.
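The scoring and descriptive summary described in the Methods can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' extraction tool: the `TrialReport` record, its field names, and the helper functions are hypothetical. It follows the additive scoring described above (up to 2 points for randomization, up to 2 for blinding, 1 for withdrawals); the original Jadad instrument also deducts points for inappropriate methods, which is not described here and is therefore omitted. The interquartile range is computed as a width (third quartile minus first quartile), on the assumption that this matches the single IQR values reported in Table 1.

```python
from dataclasses import dataclass
from statistics import median, quantiles


@dataclass
class TrialReport:
    """Reporting items extracted for one RCT (hypothetical field names)."""
    year: int
    randomization_mentioned: bool
    randomization_appropriate: bool
    blinding_mentioned: bool
    blinding_appropriate: bool
    withdrawals_explained: bool


def jadad_score(t: TrialReport) -> int:
    """Additive 0-5 score as described in the Methods (no deductions)."""
    score = 0
    if t.randomization_mentioned:
        score += 1
        if t.randomization_appropriate:
            score += 1
    if t.blinding_mentioned:
        score += 1
        if t.blinding_appropriate:
            score += 1
    if t.withdrawals_explained:
        score += 1
    return score


def period_of(year: int) -> str:
    """Map a publication year to the year groupings used in this review."""
    for start, end in ((1995, 1999), (2000, 2004), (2005, 2009), (2010, 2012)):
        if start <= year <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year} is outside the review window")


def summarize(trials: list[TrialReport]) -> dict[str, tuple[float, float]]:
    """Return {period: (median Jadad score, IQR width)}."""
    by_period: dict[str, list[int]] = {}
    for t in trials:
        by_period.setdefault(period_of(t.year), []).append(jadad_score(t))
    summary = {}
    for period, scores in by_period.items():
        if len(scores) >= 2:
            q1, _, q3 = quantiles(scores, n=4, method="inclusive")
            iqr = q3 - q1
        else:
            iqr = 0.0  # a single study has no spread
        summary[period] = (median(scores), iqr)
    return summary
```

Keeping the scoring rule separate from the grouping step makes it straightforward to substitute a different quality scale, or different year groupings, without touching the summary logic.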

DERMATOLOGIC SURGERY


ALAM ET AL

Results

The total number of studies returned from the automated search was 326, of which 324 were confirmed as correctly classified by manual review. Disagreements between data extractors occurred for 0.4% of data fields extracted, and all were resolved by the process described in the Methods.

Summary data are provided in Table 1, and the distribution of Jadad scores by year grouping is shown in Figure 1. Study topics are provided in Table 2 and Table 3. Overall, the most common interventions studied included injectables and fillers (93 studies, accounting for 29% of the total), lasers and light devices (74; 23%), noninvasive oral or topical therapies (44; 14%), Mohs surgery (27; 8%), and laser resurfacing (24; 7%).

The most common indications studied were rhytides, skin laxity, and photoaging (238; 73%), surgical

TABLE 1. Overview of Randomized Controlled Trials in Dermatologic Surgery 1995–2012

| Time Period | 1995–1999 | 2000–2004 | 2005–2009 | 2010–2012 | Total |
|---|---|---|---|---|---|
| Studies classified as RCTs, n (%) | | | | | |
| On PubMed | 39 (100) | 66 (100) | 131 (100) | 90 (100) | 326 (100) |
| After manual review | 38 (97) | 66 (100) | 131 (100) | 89 (99) | 324 (99) |
| Power analysis with sample size calculation included, n (%) | | | | | |
| Included | 0 (0) | 5 (8) | 22 (17) | 12 (13) | 39 (12) |
| Not included | 38 (100) | 61 (92) | 109 (83) | 77 (87) | 285 (88) |
| Effect size* defined for power calculation, n (%) | | | | | |
| Yes | 0 (0) | 3 (5) | 8 (6) | 5 (6) | 16 (5) |
| No | 38 (100) | 63 (95) | 123 (94) | 84 (94) | 308 (95) |
| Post hoc power analysis† included, n (%) | | | | | |
| Included | 0 (0) | 0 (0) | 4 (3) | 1 (1) | 5 (2) |
| Not included | 38 (100) | 66 (100) | 127 (97) | 88 (99) | 319 (98) |
| Alpha level‡ specified, n (%) | | | | | |
| 0.001 | 2 (5) | 4 (6) | 10 (8) | 15 (17) | 31 (10) |
| 0.01 | 3 (8) | 2 (3) | 6 (5) | 2 (2) | 13 (4) |
| 0.05 | 9 (24) | 32 (48) | 76 (58) | 40 (45) | 157 (48) |
| Other/not specified | 24 (63) | 28 (42) | 39 (30) | 32 (36) | 123 (38) |
| No. subjects in study | | | | | |
| Median (IQR)§ | 20 (28) | 25.5 (55) | 30.5 (58) | 33 (55) | 30 (54) |
| Range | 4–145 | 8–168 | 4–439 | 4–1740 | 4–1740 |
| Jadad score | | | | | |
| Median (IQR)§ | 1 (1) | 2 (2) | 2 (2) | 2 (2) | 2 (2) |
| Range | 0–4 | 0–5 | 0–5 | 0–5 | 0–5 |

*Effect size refers to the degree of change or difference that a particular study is designed to detect. As sample size and power increase, ever smaller effect sizes may be detectable. For pilot studies, it may be adequate to be powered only to detect a large or moderate effect size, and larger studies may be powered to detect smaller effects.
†Post hoc power analysis refers to the condition in which investigators may not have included a power analysis before embarking on data collection, but once they collected and analyzed their data, they performed a power analysis to show that their study had sufficient sample size to detect the effect size of interest. Post hoc power analysis is better than no power analysis, but power analysis before study initiation is preferred.
‡Alpha level specifies an error margin of the study. In particular, it denotes the likelihood that a particular study finding could have been detected by chance alone. Alpha levels are arbitrary, but in medicine, levels of 0.05 are commonly used.
§IQR, interquartile range. This is the range of outcomes bounded by the 25th and the 75th percentiles of all available outcomes, respectively.
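To make the relationship among effect size, alpha level, power, and sample size in the Table 1 footnotes concrete, the following is a minimal sketch of an a priori sample size calculation. It assumes a two-sided comparison of means between two independent, equally sized groups and uses the textbook normal approximation n = 2(z_{1-α/2} + z_{power})² / d², where d is the standardized effect size. The function name `n_per_group` and its defaults (alpha of 0.05, power of 0.80) are illustrative assumptions, and this is not the method used in any of the reviewed trials. Paired designs such as split-face comparisons require a different formula.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per arm for a two-sided, two-group comparison.

    effect_size is the standardized difference (difference in means / SD).
    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / effect_size^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = std_normal.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Smaller effects demand larger samples at the same alpha and power.
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: {n_per_group(d)} subjects per arm")
```

Under these assumptions, a moderate standardized effect of 0.5 calls for roughly 63 subjects per arm, which illustrates why studies with a median enrollment of about 30 subjects may be powered only for large effects unless a more efficient design is used.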

40:12:DECEMBER 2014


scars and wounds (60; 19%), vascular disorders (60; 19%), nonmelanoma skin cancer (50; 15%), pigmentary disorders (38; 12%), and unwanted fat (24; 7%).

The summary statistics and Jadad scores for Dermatologic Surgery were comparable with those in the plastic surgery literature, reviewed from 1990 to 2010. In both cases, a small minority of studies provided a complete power analysis and sample size determination, with 19% in the plastic surgery literature meeting this threshold compared with 12% in Dermatologic Surgery. The differences may not be significant, because ascertainment of these parameters from the published written reports of studies can be difficult, and can vary based on the level of rigor required by data extractors. The average Jadad score in the plastic surgery sample was 2.09, virtually the same as the median value of 2 seen in Dermatologic Surgery. Over time, both plastic surgery and dermatology have shown an improvement in trial reporting, with a greater proportion of studies likely to provide relevant information now than 20 years ago.

Figure 1. Distribution of Jadad scores by year grouping. Boxes represent upper and lower quartiles, inset horizontal lines represent medians, and diamonds represent mean values. From 1995 to 1999, the median score was equal to the first quartile.

Discussion

Dermatologic surgery is a subspecialty of dermatology that in recent years has spawned an unusually large number of randomized control trials. Between the periods 1995 to 1999 and 2010 to 2012, the number of RCTs published in the journal Dermatologic Surgery increased steadily and dramatically, from an initial value of 7.6/year to 29.7/year.


Before 2000, none of the randomized control trials in Dermatologic Surgery reported a power analysis with sample size calculation; since then, at least 39 studies have met this criterion. Despite the great difficulties inherent in prospectively enrolling patients in randomized trials of surgical procedures, the median sample size of studies has also increased, from 20 in the period 1995 to 1999, to 25.5 in 2000 to 2004, to values in the range of 30.5 to 33 more recently. Regarding study quality, the median Jadad score for studies before 2000 was 1, and since then, the median has risen to 2.

Although not all the reviewed studies include all the information that is recommended to be reported in RCTs, it is important to realize that lack of reporting does not mean that key procedures were not performed. Appropriate methodology may have been used, but in-text methodologic descriptions may have been abbreviated or unclear. Although reasonable efforts were undertaken by the raters in this study to clarify such inconsistencies and omissions, it is certainly possible that in some cases when valid methodology was used, our raters could not conclude this from the inadequate reporting; this phenomenon would have led to underestimating the proportion of studies that met our criteria. Education of authors may help rectify this issue and ensure that authors get credit for all the careful methods they have used. Although most authors are already reporting randomization and blinding, they may consider providing further details about these processes, and also providing reasons for withdrawals from their studies.

That being said, it is important to realize that studies in dermatologic surgery are limited by ethical, practical, and patient-specific constraints. Patients, particularly those seeking cosmetic procedures, may be reluctant to undergo an untried therapy or return for the appropriate number of follow-up visits. Properly conveyed informed consent that includes a careful description of possible risks may also deter patients from enrollment. Finally, resources for studies in dermatologic surgery


TABLE 2. Interventions by Study Topic

| Study Topic | Treatment Type | Treatment Subtype | No. Studies (%)* |
|---|---|---|---|
| Skin cancer | Mohs surgery | | 27 (8) |
| | Photodynamic therapy | | 9 (3) |
| | Excision | | 4 (1) |
| | Cryotherapy | | 3 (1) |
| | Electrodesiccation and curettage | | 2 (1) |
| | Other | | 23 (7) |
| Cosmetics | Energy devices | Laser | 61 (19) |
| | | Light | 13 (4) |
| | | Ultrasound | 2 (1) |
| | | Combination | 2 (1) |
| | | Radio-frequency | 1 (0) |
| | Minimally invasive procedures | Injectable/filler | 93 (29) |
| | | Noninvasive oral or topical treatment | 44 (14) |
| | | Sclerotherapy | 12 (4) |
| | | Superficial peel/microdermabrasion | 10 (3) |
| | Major procedures | Resurfacing with laser | 24 (7) |
| | | Liposuction | 6 (2) |
| | | Peels or dermabrasion | 3 (1) |
| | | Hair transplant | 3 (1) |
| | | Blepharoplasty | 2 (1) |
| | | Endovenous laser or radio-frequency surgery | 2 (1) |
| | | Autologous fat treatment | 2 (1) |
| | | Ambulatory phlebectomy | 2 (1) |
| Nonskin cancer, noncosmetic | Miscellaneous | | 38 (12) |

*Studies frequently had multiple interventions; therefore, percentages do not add to 100%.

are limited because this is not a federal funding priority, and corporate support is limited and less likely to underwrite comparative effectiveness studies that compare rival modalities. Despite these limitations, the dermatologic surgery literature continues to improve, paralleling the overall improvement in reporting of RCTs in other dermatologic specialties5,6 and other fields.7

Limitations of this study include the restriction to a single journal, Dermatologic Surgery. This was done to replicate the methodology of the study by Ayeni and colleagues,7 which examined the plastic surgery literature, while having better internal validity. Reporting requirements differ across journals and may influence reporting; tracking a single core journal over time eliminates some of these confounders. Additionally, some journals that publish studies in the field of dermatologic surgery, for instance the Journal of Drugs in Dermatology, the Journal of Cosmetic Dermatology, and the Journal of Cosmetic and Laser Therapy, were founded more recently and, as such, do not span the entire period studied. Because Dermatologic Surgery was the first journal in the field, and continues to publish more RCTs than any other dermatologic surgery journal, it was deemed appropriate for this study.

Significantly, in this study, we did not systematically evaluate the provenance of RCTs, specifically the extent to which they were supported by industry or had authors affiliated with relevant corporate entities. One reason for this was that disclosure of authorship and support is often still not transparent, with routine practices like ghost authorship sometimes obscuring underlying interests. By definition, it is difficult to


track ghost authorship and estimate its prevalence because there is no systematic way to know what information regarding authors is being withheld or obscured; usually, information about ghost authorship is only discovered incidentally, and sometimes when litigation is involved.

Finally, it should be noted that the lack of adequate reporting of methods, including incomplete specification of randomization protocols, and lack of sample size determination and power analysis, does not necessarily undermine the utility and reliability of trial results. It is possible for a trial to be properly randomized and adequately powered even if the information detailing this is not disclosed in writing. And for pilot studies focused on assessing the use of new technologies, of which there are fortunately many in dermatologic surgery, considerations of safety and practicality may necessarily limit sample size without detracting from the utility of the preliminary data that are obtained. Thus, the inclusion of appropriate methodologic descriptions in trial reporting is more a goal for the specialty of dermatologic surgery as a whole than a metric by which individual studies should be judged.

TABLE 3. Indications by Study Topic

| Study Topic | Indication | No. Studies (%)* |
|---|---|---|
| Skin cancer | Nonmelanoma skin cancer | 50 (15) |
| | Actinic keratosis | 10 (3) |
| | Melanoma | 2 (1) |
| | Surgical scars/wounds | 2 (1) |
| | Other | 4 (1) |
| Cosmetics | Rhytides, skin laxity, and photoaging | 238 (73) |
| | Surgical scars/wounds | 60 (19) |
| | Vascular disorders | 60 (19) |
| | Melanocyte disorders | 38 (12) |
| | Unwanted fat | 24 (7) |
| | Acne vulgaris | 18 (6) |
| | Acne scars | 16 (5) |
| | Hyperhidrosis | 14 (4) |
| | Unwanted hair | 12 (4) |
| | Alopecia | 8 (2) |
| | Other | 20 (6) |
| Nonskin cancer, noncosmetic | Miscellaneous | 38 (12) |

*Studies frequently had multiple indications; therefore, percentages do not add to 100%.


Although reporting of RCTs in Dermatologic Surgery has improved during the past 20 years, there is room for further improvement. Possible steps could include (1) a specific request in the instructions for authors that methods for randomization, blinding, and dropouts be clearly specified, and that power analysis and sample size determination be performed before initiation of the study; (2) guidance from the journal regarding the selection of appropriate resources, personnel, and software that could be used to facilitate development of such analyses; and (3) direction to journal reviewers that they assess RCTs for the presence of key elements within the methods section, and ensure that the level of detail listed is adequate. Just as reviews of inadequately specified RCTs may be made more stringent, articles that include better disclosure of their methods and a more rational approach to study design should be favored for publication, even if they report negative results or cover subject areas of more limited interest to readers.

Overall, then, RCTs in dermatologic surgery have grown in number and quality over the last 2 decades, but there is further work to be done. The data provided in this study can help authors more clearly report the methods of their studies. In several years, this study may be replicated to see to what extent the dermatologic surgery literature continues to improve. Future reviews of this subject may also analyze data from other journals in the field of dermatologic surgery in more recent periods.

Acknowledgments

The authors convey their deep appreciation and thanks to William B. Coleman III, MD, whose thoughts and insights led to the initiation of this study, and who was kind enough to review their article and offer suggestions for improvement.

References

1. Alam M, Barzilai DA, Wrone DA. Power and sample size of therapeutic trials in procedural dermatology: how many patients are enough? Dermatol Surg 2005;31:201–5.
2. Alam M, Olson JM, Asgari MM. Needs assessment for cosmetic dermatologic surgery. Dermatol Clin 2012;30:177–87.
3. Alam M. Usefulness of Cochrane intervention reviews for the practicing dermatologic surgeon. Dermatol Surg 2013;39:1345–50.
4. Clark HD, Wells GA, Huët C, McAlister FA, et al. Assessing the quality of randomized trials: reliability of the Jadad scale. Control Clin Trials 1999;20:448–52.
5. Alvarez F, Meyer N, Gourraud PA, Paul C. CONSORT adoption and quality of reporting of randomized controlled trials: a systematic analysis in two dermatology journals. Br J Dermatol 2009;161:1159–65.
6. To MJ, Jones J, Emara M, Jadad AR. Are reports of randomized controlled trials improving over time? A systematic review of 284 articles published in high-impact general and specialized medical journals. PLoS One 2013;8:e84779.
7. Ayeni O, Dickson L, Ignacy TA, Thoma A. A systematic review of power and sample size reporting in randomized controlled trials within plastic surgery. Plast Reconstr Surg 2012;130:78e–86e.

Address correspondence and reprint requests to: Murad Alam, MD, MSCI, Department of Dermatology, 676 N. St. Clair Street, Suite 1600, Chicago, IL 60611, or e-mail: [email protected]
