ORIGINAL RESEARCH

Annals of Internal Medicine

Acupuncture for Menopausal Hot Flashes: A Randomized Trial

Carolyn Ee, MBBS; Charlie Xue, PhD; Patty Chondros, PhD; Stephen P. Myers, PhD; Simon D. French, PhD; Helena Teede, PhD; and Marie Pirotta, PhD

Background: Hot flashes (HFs) affect up to 75% of menopausal women and pose a considerable health and financial burden. Evidence of acupuncture efficacy as an HF treatment is conflicting.

Objective: To assess the efficacy of Chinese medicine acupuncture against sham acupuncture for menopausal HFs.

Design: Stratified, blind (participants, outcome assessors, and investigators, but not treating acupuncturists), parallel, randomized, sham-controlled trial with equal allocation. (Australia New Zealand Clinical Trials Registry: ACTRN12611000393954)

Setting: Community in Australia.

Participants: Women older than 40 years in the late menopausal transition or postmenopause with at least 7 moderate HFs daily, meeting criteria for Chinese medicine diagnosis of kidney yin deficiency.

Interventions: 10 treatments over 8 weeks of either standardized Chinese medicine needle acupuncture designed to treat kidney yin deficiency or noninsertive sham acupuncture.

Measurements: The primary outcome was HF score at the end of treatment. Secondary outcomes included quality of life, anxiety, depression, and adverse events. Participants were assessed at 4 weeks, the end of treatment, and then 3 and 6 months after the end of treatment. Intention-to-treat analysis was conducted with linear mixed-effects models.

Results: 327 women were randomly assigned to acupuncture (n = 163) or sham acupuncture (n = 164). At the end of treatment, 16% of participants in the acupuncture group and 13% in the sham group were lost to follow-up. Mean HF scores at the end of treatment were 15.36 in the acupuncture group and 15.04 in the sham group (mean difference, 0.33 [95% CI, −1.87 to 2.52]; P = 0.77). No serious adverse events were reported.

Limitation: Participants were predominantly Caucasian and did not have breast cancer or surgical menopause.

Conclusion: Chinese medicine acupuncture was not superior to noninsertive sham acupuncture for women with moderately severe menopausal HFs.

Primary Funding Source: National Health and Medical Research Council.

Ann Intern Med. 2016;164:146-154. doi:10.7326/M15-1380 www.annals.org
For author affiliations, see end of text.
This article was published at www.annals.org on 19 January 2016.

See also: Summary for Patients (I-24); Web-Only Supplement
© 2016 American College of Physicians

Downloaded From: http://annals.org/ by a University of York User on 11/21/2016

Vasomotor symptoms (VMSs), or hot flashes (HFs) and night sweats, affect up to 75% of women, last an average of 5 years, and cause a considerable loss of quality of life (1) and financial burden (2). Some women are reluctant to use hormone replacement therapy (HRT), a highly effective treatment, because of such adverse events as cardiovascular disease and breast cancer (1). Other conventional treatments, such as selective serotonin reuptake inhibitors, also cause adverse events and are less effective than HRT (3). Complementary therapies account for $34 billion in out-of-pocket spending in the United States annually (4). More than 50% of women use these therapies for menopausal symptoms, despite little evidence of safety and effectiveness (5). Acupuncturists are the second most consulted therapists by menopausal women (6). Acupuncture is safe (7) and may act on monoamines (8), which are implicated in VMS pathophysiology (1). Although acupuncture for VMSs is more effective than self-care or no treatment (9), results from sham-controlled trials conflict (10–12). A recent meta-analysis concluded that acupuncture may be superior to sham procedures for HFs, but methodological flaws and small sample sizes may overestimate this effect (13). At the time of study design, the only trial reporting superiority of acupuncture over sham was a pilot study using noninsertive sham (14). Our objective was to compare the efficacy of Chinese medicine needle acupuncture with noninsertive sham acupuncture in a broader sample of women with menopausal HFs.

METHODS

Design Overview

Our study protocol has been published (15). This was a stratified, blinded (except therapists), parallel, randomized, sham-controlled trial with equal allocation conducted between September 2011 and October 2014. All participants provided written informed consent at enrollment. No financial compensation was offered. The Human Research Ethics Committee at the University of Melbourne provided ethics approval. Methods were informed by an unpublished feasibility study (n = 27) conducted by Drs. Ee, Pirotta, and Xue. Protocol changes and deviations are described in the Appendix (available at www.annals.org).


Setting and Participants

Interventions were delivered in 15 acupuncture clinics in Melbourne, Australia, and in areas of Victoria, New South Wales, and Queensland, Australia. We recruited from the community using social media; university student and staff newsletters; newspaper advertisements; media exposure; and strategies through Jean Hailes for Women's Health (www.jeanhailes.org.au), a women's health education and research organization. Women were included if they were postmenopausal (>12 months since their final menstrual period) or in the late menopausal transition (follicular-stimulating hormone level ≥25 IU, amenorrhea ≥60 days, and VMSs), had a mean HF score of at least 14 (equal to 7 moderate VMSs daily) (16), and had kidney yin deficiency diagnosed using a structured Chinese medicine history as well as a tongue and pulse examination performed by experienced acupuncturists (Appendix Figures 1 and 2, available at www.annals.org). Kidney yin deficiency, of which night sweats is a cardinal symptom, is a Chinese medicine clinical syndrome diagnosed in 76% to 81% of symptomatic postmenopausal women (17, 18). Women who had had a hysterectomy were included if they were older than 51 years with a follicular-stimulating hormone level of 25 IU or greater. Exclusion criteria were needle acupuncture in the preceding 2 years, age younger than 40 years, previous diagnosis of premature ovarian failure and age younger than 50 years, bilateral salpingo-oophorectomy, medical reasons for amenorrhea, poorly controlled thyroid disease, VMSs associated with breast cancer, current HRT use, vaginal estrogen therapy in the previous 12 weeks, treatment of VMSs for the previous 12 weeks, relative contraindications to acupuncture (anticoagulation, heart valve disease, or poorly controlled diabetes mellitus), and unwillingness or inability to adhere to trial requirements or to give informed consent.

Randomization

We randomly allocated participants to receive acupuncture or sham acupuncture. A researcher with no other role in the study generated the computer randomization schedule (Excel 2004 [Microsoft]) with equal allocation and random block sizes of 8 and 12, stratified by acupuncturist. He then created a password-protected electronic Excel spreadsheet containing the concealed allocation schedule. When a participant enrolled, a research assistant (K.N.) activated a randomization function built into the spreadsheet to reveal the next allocation. She informed the acupuncturist of treatment allocation by disclosing the list of points to be used via mobile text, e-mail, or fax. Acupuncturists identified group allocation from differences in treatment protocols between acupuncture and sham groups.

Interventions

Acupuncture

Two practicing Chinese medical acupuncturists and researchers (C.X. and C.E.) developed a standardized protocol to treat kidney yin deficiency according to Chinese medicine principles, using textbook (19) and literature reviews (14, 20, 21), treatment regimens and point selection from trials that reported acupuncture to be superior to sham acupuncture or self-care (9, 14), and comments from 3 leading researchers in women's health acupuncture. Appendix Tables 1 and 2 (available at www.annals.org) have further details about interventions. In brief, 6 acupuncture points were needled until de qi (defined as numbness, heaviness, pressure, soreness, or tingling) was obtained. De qi, a specific sensation generated by stimulating acupuncture needles, is considered an important component of acupuncture needling (22). Needles were retained for 20 minutes with manual manipulation (twirling and rotating) after 10 minutes. We used 0.32 × 40-mm sterile, disposable, stainless steel needles (DongBang Acupuncture). Ten treatments were provided at no charge over 8 weeks (twice weekly for 2 weeks and weekly thereafter).

EDITORS' NOTES
Context: Up to 75% of menopausal women experience hot flashes (HFs). Although acupuncture is effective for treating these symptoms compared with self-care, data conflict about its efficacy compared with sham acupuncture.
Contribution: In this randomized, controlled trial, standardized Chinese medicine acupuncture resulted in a reduction in HFs similar to noninsertive sham acupuncture among women who were postmenopausal or in late menopausal transition.
Caution: Study participants were predominantly Caucasian.
Implication: Standardized Chinese medicine acupuncture offers no additional benefit over noninsertive sham acupuncture for menopausal HFs.

Sham Acupuncture

We used the validated Park sham device, a 0.35 × 40-mm blunt needle supported by a plastic ring and guide tube (base unit) (23) attached to the skin by using a double-sided adhesive ring. The needle telescopes into itself and shortens on manipulation, giving the visual and physical impression of insertion into the skin. The base unit was used for all treatments, including real acupuncture. Although the sensation from the blunt needle tip can result in weak physiologic effects (24, 25), differences in brain activation have been noted between real and blunt needling (26). Needles were bilaterally "inserted" into 3 sites that were not acupuncture points and were away from points used in the real acupuncture group. Acupuncturists asked about sensation and pretended to manipulate the needle after 10 minutes, but de qi was not sought. The treatment regimen was the same as in the acupuncture group.

Annals of Internal Medicine • Vol. 164 No. 3 • 2 February 2016 147

Other Concurrent Treatments

We discouraged women from starting new cointerventions for HFs during the intervention period. However, participants already receiving non-HRT VMS treatments continued these for the intervention period. We told participants that they had a 50% chance of receiving real or "placebo" acupuncture and that "placebo" needles did not stimulate the same nerves as real needles. Participants could withdraw at any time. Acupuncturists were trained to treat participants professionally but to minimize interactions to brief social conversation and asking about adverse events.

Standardization of the Intervention

The acupuncturists had bachelor's degrees in Chinese medicine, at least 5 years of clinical experience, and current registration with the Australian Health Practitioner Regulation Agency. Appendix Table 3 (available at www.annals.org) explains details of training and quality assurance visits done to ensure intervention fidelity.

Blinding

Participants, outcome assessors, and investigators were blinded to treatment allocation; the acupuncturists and the research assistant who randomly assigned participants were not. The self-reported outcome assessments were blinded.

Outcomes and Follow-up

The primary outcome was HF score at the end of treatment (EOT) (8 weeks) (16). Participants recorded the number of daily mild, moderate, severe, and very severe HFs for 7 days using a validated HF diary (16). We calculated the HF score using the following equation:

HF score = ([1 × number of mild HFs] + [2 × number of moderate HFs] + [3 × number of severe HFs] + [4 × number of very severe HFs]) ÷ number of days reported.

Thus, an HF score of 14, the minimum to enroll in our trial, may represent 14 mild, 7 moderate, 4.7 severe, or 3.5 very severe HFs per day, or a combination of these. Hot flash scores can include 0 but have no upper limit. Hot flash frequency represents the average number of daily HFs, and severity represents the average severity across all HFs, ranging from 1 (mild) to 4 (very severe).

At baseline, 4 weeks, EOT, and 3 and 6 months after EOT, women completed the HF diary and 2 other validated measures that assess health-related quality of life (Menopause-Specific Quality of Life Questionnaire) (27) and anxiety and depression symptoms (Hospital Anxiety and Depression Scale) (28). At 3 and 6 months after EOT, we inquired about other treatments for VMSs (Appendix Table 4, available at www.annals.org). We assessed treatment expectancy and rationale credibility using the validated Credibility/Expectancy Questionnaire (29) immediately after the first treatment. We assessed the success of blinding by inserting the question, "Please indicate the treatment that you believe you have received," into the Credibility/Expectancy Questionnaire, with the answer options of "real acupuncture," "placebo," or "not sure."

Adverse events were recorded by participants or acupuncturists throughout the study on a form that included an open-ended description of the event, assessment of the relationship to acupuncture (unrelated, possibly, probably, or definitely), intensity (mild, moderate, or severe), whether it was serious (potentially fatal, life-threatening, permanently incapacitating, or resulting in hospitalization), and the outcome (ranging from completely resolved to persistent).

At baseline, we collected information on demographic characteristics, risk factors for VMSs, and acupuncture experience using a questionnaire designed and pilot-tested specifically for this study. Questions included age in years; ethnicity according to categories used in a systematic review on prevalence of VMSs around the world (30); previous acupuncture experience (yes or no); previous acupuncture response; highest level of education (31); average weekly household income after taxes, categorized as low, middle, or high income according to Australian Bureau of Statistics data (32); menopausal stage (33); duration of VMSs in years; self-reported height and weight; physical activity levels; history of tubal ligation (31) and depression (34); self-rated health (35); stress levels using the modified Perceived Stress Scale 4 (36); smoking status (37); and alcohol use over the past week (38).

Statistical Analysis

We conservatively assumed a mean baseline HF score of 14, the lowest possible entry score. We anticipated a 50% reduction in HF score at EOT to 7 in the sham group on the basis of previous HF trials (16).
To be clinically important, a 78% reduction in HF score in the acupuncture group was expected, to 3 at EOT (39–41). To detect a mean difference of 4 (SD, 11.6 or 0.34 of 1 SD) between groups at EOT (80% power, 5% significance level, and 2-sided test), we needed 266 women (133 per group). Larger group mean differences would be expected if mean HF scores were assumed to be greater than 14. Allowing for the 26% attrition noted in our pilot study, we inflated this figure to 360 women.

We used Stata, version 13.1 (StataCorp), for all analyses. An intention-to-treat approach (42) was used in which participants were analyzed in the study groups to which they were allocated, regardless of whether they had received the correct intervention or completed all treatment sessions. To minimize missing data, we used reminder e-mails or phone messages, a courtesy e-mail or letter after 2 weeks, and as many as 3 courtesy phone calls after another 2 weeks. Descriptive statistics were used to compare baseline characteristics for imbalances between groups.

For all outcomes, we used linear mixed-effects models (43) using restricted maximum likelihood with random intercepts at the individual level to estimate between-group differences, adjusted for baseline outcome and acupuncturist, for each measurement time point. Estimated group differences (acupuncture − sham) are reported with 95% CIs and P values. We treated group assignment, acupuncturist, and time (baseline, 4 weeks, EOT, and 3 and 6 months after treatment) as categorical variables using dummy variables, with 2-way interactions between group and time, except baseline, in which group means were constrained to be equal. Distribution of residuals was examined to check for goodness of fit for each outcome. Under the fitted model, missing data were assumed to be missing at random (43). Sensitivity analysis with a pattern-mixture model assessed the robustness of the missing data assumption for HF score at EOT (Appendix) (44). We used the James and Bang blinding indices to assess the success of blinding (45).

Role of the Funding Source

This study was funded by the National Health and Medical Research Council. The funding source had no role in study design; collection, analysis, and interpretation of data; manuscript preparation; or the decision to submit the manuscript for publication.
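The sample-size figures in the Statistical Analysis section can be checked with a short sketch. It assumes the standard normal-approximation formula for comparing 2 independent means; the article does not state which formula or software produced the calculation, so the function names and the formula itself are illustrative assumptions rather than the authors' method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta: float, sd: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Per-group n for a 2-sided comparison of 2 means (normal approximation).

    Assumed formula (not stated in the article):
        n = 2 * (sd * (z_{1-alpha/2} + z_{power}) / delta) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the 2-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (sd * (z_alpha + z_beta) / delta) ** 2)


def inflate_for_attrition(n_total: int, attrition: float) -> int:
    """Recruitment target so that n_total participants remain after attrition."""
    return ceil(n_total / (1 - attrition))


# Inputs taken from the text: difference of 4, SD of 11.6, 80% power, 5% significance.
per_group = n_per_group(delta=4.0, sd=11.6)
total = 2 * per_group
recruit = inflate_for_attrition(total, attrition=0.26)
print(per_group, total, recruit)
```

Under these assumptions the sketch yields 133 per group, 266 in total, and a recruitment target of 360, which agrees with the figures reported above.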

RESULTS

After completing the screening survey, 61% of women (1295 of 2140) were initially eligible (Appendix Figure 3, available at www.annals.org). Of these 1295 women, 234 declined participation and 845 were excluded, most commonly because they were not postmenopausal or in the menopausal transition (22%), received treatment for VMSs within the past 12 weeks (16%), or were using HRT (9%). Another 170 women were excluded for not meeting criteria for kidney yin deficiency symptoms and 6 based on follicular-stimulating hormone testing. Of the women who completed the baseline HF diary, 57% (403 of 706) were eligible and 347 went on to be examined by an acupuncturist. A total of 338 of these women met criteria for signs of kidney yin deficiency, and 327 were randomly assigned. Twenty-two (14%) women in the acupuncture group and 24 (15%) in the sham group did not complete all 10 treatments; of these, 6 women in the acupuncture group and 8 in the sham group contributed EOT data. Recruitment ceased when we exceeded our target of 266 women contributing complete data at EOT; in total, 279 women contributed EOT data.

Demographic characteristics, risk factors for VMSs, and baseline outcome measures were similar between groups (Table 1). A higher proportion of women in the sham group had a previous positive acupuncture experience. Participants were mostly Caucasian and well-educated. Mean age was 55 years. On the Credibility/Expectancy Questionnaire (administered after the first treatment), most women indicated that they expected their VMSs to be somewhat to completely improved by EOT.


Compared with mean baseline HF score, both groups showed approximately 40% improvement at EOT, which was sustained at 3 and 6 months after treatment (Figure and Appendix Table 5, available at www.annals.org). The adjusted mean difference for HF score at EOT was 0.33 (95% CI, −1.87 to 2.52; P = 0.77), with confidence bounds excluding our hypothesized minimum clinically important difference of 4. Based on sensitivity analyses for missing data patterns (Appendix), conclusions for HF score at EOT would only change in favor of sham acupuncture (intervention effect, 2.54 [CI, 0.14 to 4.96]) if we assumed that women with missing data in the acupuncture group only had a mean HF score 13 points greater (equal to 1 SD) than the observed women at EOT, which we considered implausible. Similarly, conclusions were unlikely to favor acupuncture because the difference in mean HF score between the women with missing and observed data would need to be much greater than 15.

We found no evidence to support a difference between groups for mean HF severity and frequency and secondary outcomes for menopause-specific quality of life, anxiety, and depression (Figure and Appendix Table 6, available at www.annals.org). An exception was a small difference in HF severity at EOT (Figure and Appendix Table 5) favoring the sham group, which we considered clinically unimportant and which diminished at 3 and 6 months.

Blinding is reported in Appendix Table 7 (available at www.annals.org). In brief, more than 60% of all women were unsure of their treatment allocation; 34% of women in the acupuncture group and 31% in the sham group believed that they received real acupuncture. Thirty-two percent of women correctly guessed their treatment beyond chance in the acupuncture group, and 24% incorrectly guessed their treatment beyond chance in the sham group.

No serious adverse events were reported (Table 2). Most events were mild, self-limiting, and intrinsic to acupuncture (such as bleeding and pain).

DISCUSSION

The findings from our multicenter, randomized, sham-controlled trial show that an 8-week course of standardized Chinese medicine acupuncture for participants classified with kidney yin deficiency did not reduce menopausal HFs more than noninsertive sham acupuncture. Confidence bounds for estimated mean differences for HF score excluded our hypothesized minimal clinically important effect size of 4; hence, results show no benefit from acupuncture compared with sham. Hot flash scores decreased in both groups by approximately 40% from baseline to EOT and were sustained for 6 months. We found no evidence of an advantage of acupuncture over sham acupuncture on quality of life, anxiety, or depression.

It was important to do this study because of the high prevalence and morbidity rate from VMSs and limitations of existing mainstream treatments. Before our study, 2 trials demonstrated acupuncture's effectiveness compared with self-care (9, 46). The only sham-controlled trial reporting a benefit from acupuncture was a pilot study using noninsertive sham as a control (14). We sought to address this gap by conducting what we believe is the first adequately powered trial for HFs using noninsertive sham as a control. However, we did not control for the nonspecific effects of acupuncture, such as regular interaction with a therapist.

Table 1. Baseline Characteristics of Participants, by Group*

Characteristic | Acupuncture Group (n = 163)† | Participants Missing, n (%)† | Sham Group (n = 164)† | Participants Missing, n (%)†
Mean age (SD), y | 55.2 (4.3) | 27 (16.6) | 54.8 (4.2) | 17 (10.4)
Ethnicity, n (%) | – | 27 (16.6) | – | 17 (10.4)
  Caucasian | 133 (97.8) | – | 137 (93.1) | –
  Non-Caucasian | 3 (2.2) | – | 10 (6.9) | –
Highest educational attainment, n (%) | – | 27 (16.6) | – | 19 (11.6)
  Primary or high school | 55 (40.4) | – | 50 (34.5) | –
  Vocational training, diploma, university degree, or higher | 81 (59.6) | – | 95 (65.5) | –
Average weekly household income, n (%) | – | 59 (36.2)‡ | – | 49 (29.9)‡
  Low income ($0–599) | 43 (41.3) | – | 47 (40.9) | –
  Middle income ($600–2000) | 56 (53.8) | – | 62 (53.9) | –
  High income (≥$2000) | 5 (4.8) | – | 6 (5.2) | –
Menopausal stage, n (%) | – | 28 (17.2) | – | 17 (10.4)
  Late postmenopause | 39 (28.9) | – | 37 (25.2) | –
  Early postmenopause | 63 (46.7) | – | 75 (51.0) | –
  Menopausal transition | 33 (24.4) | – | 35 (23.8) | –
Median duration of VMSs (IQR), y | 3.75 (2.00–6.00) | 29 (17.8) | 3.50 (2.00–6.50) | 35 (21.3)
Mean body mass index (SD), kg/m2 | 26.40 (4.40) | 35 (21.5) | 26.48 (5.30) | 19 (11.6)
Previous acupuncture experience, n (%) | 64 (48.5) | 31 (19.0) | 78 (53.8) | 19 (11.6)
Previous acupuncture response, n (%)§ | – | 2 | – | 3
  >50% response | 19 (30.6) | – | 35 (46.7) | –
  30%–49% response | 22 (35.5) | – | 11 (14.7) | –
  0%–29% response | 17 (27.4) | – | 24 (32.0) | –
  Worsened | 4 (6.5) | – | 0 (0) | –
  Other‖ | 0 (0) | – | 5 (6.7) | –
Mean credibility score (SD) (30)¶ | 6.77 (1.30) | 21 (12.9) | 6.87 (1.70) | 21 (12.8)
Mean expectancy score (SD)¶ | 6.96 (1.30) | 21 (12.9) | 6.78 (1.70) | 21 (12.8)
Smoking status, n (%) | – | 27 (16.6) | – | 17 (10.4)
  Nonsmoker | 66 (48.5) | – | 63 (42.9) | –
  Former smoker | 53 (39.0) | – | 66 (44.9) | –
  Current smoker (occasional or regular) | 17 (12.5) | – | 18 (12.2) | –
Alcohol use, n (%) | – | 27 (16.6) | – | 18 (11.0)
  Nondrinker | 35 (25.7) | – | 43 (29.5) | –
  1–5 standard drinks/wk | 72 (52.9) | – | 75 (51.4) | –
  >6 standard drinks/wk | 29 (21.3) | – | 28 (19.2) | –
Physical activity level, n (%)** | – | 28 (17.2) | – | 19 (11.6)
  Low | 40 (29.6) | – | 46 (31.7) | –
  Moderate | 20 (14.8) | – | 26 (17.9) | –
  High | 75 (55.6) | – | 73 (50.3) | –
Previous tubal ligation, n (%) | 30 (22.2) | 28 (17.2) | 25 (17.4) | 20 (12.2)
History of depression, n (%) | 33 (24.4) | 28 (17.2) | 40 (27.8) | 20 (12.2)
Self-rated health (36), n (%) | – | 27 (16.6) | – | 20 (12.2)
  Excellent | 22 (16.2) | – | 30 (20.8) | –
  Very good | 69 (50.7) | – | 59 (41.0) | –
  Good | 39 (28.7) | – | 45 (31.3) | –
  Fair | 4 (2.9) | – | 8 (5.6) | –
  Poor | 2 (1.5) | – | 2 (1.4) | –
Mean PSS-4 score (SD) (37)†† | 7.2 (3.0) | 27 (16.6) | 7.2 (3.3) | 22 (13.4)
Mean HF score (SD) | 26.36 (16.29) | 0 (0) | 24.15 (9.78) | 0 (0)
Mean HF frequency (SD) | 12.88 (7.24) | – | 11.95 (4.37) | –
Mean HF severity (SD) | 2.05 (0.47) | – | 2.05 (0.48) | –
Mean MENQOL subscore (SD)‡‡ | | | |
  Physical | 3.89 (1.46) | 23 (14.4) | 3.60 (1.57) | 21 (12.8)
  Sexual | 3.84 (2.12) | 22 (13.5) | 3.50 (2.32) | 21 (12.8)
  Vasomotor | 6.03 (1.44) | 21 (12.9) | 5.66 (1.56) | 21 (12.8)
  Psychosocial | 3.38 (1.69) | 22 (13.5) | 3.23 (1.72) | 21 (12.8)
Mean HADS subscore (SD) | | | |
  Anxiety | 7.63 (4.29) | 22 (13.5) | 7.52 (4.17) | 21 (12.8)
  Depression | 4.53 (3.36) | 23 (14.4) | 4.89 (4.05) | 22 (13.4)

HADS = Hospital Anxiety and Depression Scale; HF = hot flash; IQR = interquartile range; MENQOL = Menopause-Specific Quality of Life Questionnaire; PSS-4 = Perceived Stress Scale 4; VMS = vasomotor symptom.
* Percentages may not sum to 100 due to rounding.
† 27 and 17 women in the acupuncture and sham groups, respectively, did not return the baseline surveys; of the women who returned surveys, not all answered all items. The total number of missing responses for each item is provided, including the 27 and 17 who did not return surveys. The percentage for each item is calculated from the total number of complete responses.
‡ 31 and 27 women in the acupuncture and sham groups, respectively, selected the answer option "Prefer not to answer" for this question.
§ Not all women contributed these data because a proportion of women were acupuncture-naive. The denominator for previous acupuncture response proportions is the number of women who stated that they had acupuncture experience (64 in the real group and 78 in the sham group).
‖ Cannot remember (n = 2), did not finish treatment so could not tell (n = 2), and "needles would not go in" (n = 1).
¶ Questionnaire scores range from 1–9; 1 = no improvement expected (for expectancy) or not at all useful/logical/confident in recommending treatment (for credibility); 9 = total improvement expected or very useful/logical/confident in recommending treatment.
** Defined according to the number of 20-min sessions of less vigorous exercise/wk. Low = 1–4; moderate = 5–6; high = ≥7.
†† Questionnaire scores range from 0–16.
‡‡ Some women did not complete the MENQOL and HADS surveys at baseline because of time constraints, but most contributed data at subsequent time points; if women contributed any data, they were included in the intention-to-treat analysis, which used mixed-effects modeling. 3 women in each group did not contribute any MENQOL/HADS data at any time point because they withdrew shortly after randomization; they are not included in the final analysis.

Figure. HF score, frequency, and severity at baseline, 4 wk, EOT (8 wk), and 3 and 6 mo after treatment of acupuncture and sham groups.

[Figure: 3 panels plotting Mean HF Score (95% CI), Mean HF Frequency (95% CI), and Mean HF Severity (95% CI) against time point (Baseline, 4 wk, EOT, 3 mo, 6 mo) for the acupuncture group and the sham group.]

Estimated means and 95% CIs are from linear mixed-effects models that adjusted for baseline value of the outcome and acupuncturist. Models assumed equal baseline means by group because the baseline measurements were taken before randomization. Under this model, data are assumed to be missing at random. Mean HF score represents the number of HFs per day, weighted according to severity. Mean HF frequency represents the average number of HFs per day. Mean HF severity represents average severity of HFs, ranging from 1 (mild) to 4 (very severe). EOT = end of treatment; HF = hot flash.
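The HF score, frequency, and severity measures defined in the Methods (Outcomes and Follow-up) can be illustrated with a minimal sketch. The diary layout (one dictionary of counts per reported day) and the function names are hypothetical choices made for illustration; only the severity weights of 1 to 4 and the division by the number of days reported come from the article.

```python
from typing import Dict, List, Optional

# Severity weights from the HF score equation: mild = 1 ... very severe = 4.
WEIGHTS = {"mild": 1, "moderate": 2, "severe": 3, "very_severe": 4}

Diary = List[Dict[str, int]]  # hypothetical layout: one dict of HF counts per reported day


def _weighted_total(diary: Diary) -> int:
    return sum(WEIGHTS[k] * day.get(k, 0) for day in diary for k in WEIGHTS)


def _count_total(diary: Diary) -> int:
    return sum(day.get(k, 0) for day in diary for k in WEIGHTS)


def hf_score(diary: Diary) -> float:
    """Severity-weighted HFs per day, averaged over the days reported."""
    if not diary:
        raise ValueError("diary must contain at least one reported day")
    return _weighted_total(diary) / len(diary)


def hf_frequency(diary: Diary) -> float:
    """Average number of HFs per day, regardless of severity."""
    if not diary:
        raise ValueError("diary must contain at least one reported day")
    return _count_total(diary) / len(diary)


def hf_severity(diary: Diary) -> Optional[float]:
    """Average severity across all HFs (1 to 4); None when no HFs were recorded."""
    count = _count_total(diary)
    return _weighted_total(diary) / count if count else None


# A 7-day diary with 7 moderate HFs each day sits at the entry threshold of 14.
week = [{"moderate": 7} for _ in range(7)]
print(hf_score(week), hf_frequency(week), hf_severity(week))
```

With these definitions the score equals frequency multiplied by severity, which is consistent with the baseline means in Table 1 (for example, 12.88 × 2.05 is approximately 26.4 in the acupuncture group).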

ORIGINAL RESEARCH

Acupuncture for Menopausal Hot Flashes

Table 2. Adverse Events, by Group Variable

Acupuncture Group (n ⴝ 163)

Sham Group (n ⴝ 164)

Number of women reporting an adverse event, n (%) Adverse events reported, n Adverse events (intensity), n Bleeding or bruising Pain Syncope or presyncope Worsening of symptoms Tingling near acupuncture point Swelling around acupuncture point and itching of whole arm Felt hot; skin sensitivity Nervousness Essential tremor

16 (9.8) 21

4 (2.4) 5

Our findings are consistent with those from a recent Cochrane review (10), which reported that acupuncture was more effective than no treatment and had a moderate effect size but was not efficacious when compared with sham. Although another recent metaanalysis reported moderate standardized effect sizes of acupuncture of ⫺0.35 and ⫺0.44 for HF frequency and severity (13), this analysis pooled data from shamcontrolled trials and trials comparing acupuncture with no treatment. In addition, the shortcomings of the included studies (small sample sizes, high attrition rates, and failure to use intention-to-treat analyses) may have inflated the treatment effects. Strengths of our trial include robust design, adequate power, high retention rate at primary outcome measurement, similar withdrawal rates in both groups, high adherence to treatment, and follow-up to 6 months. We used a broad recruitment strategy, and participant characteristics were balanced between groups. Moreover, we integrated Chinese medicine principles into a rigorous trial design, using welldefined Chinese medicine criteria to diagnose kidney yin deficiency and treatment with standardized acupuncture points. Blinding was successful. The Bang blinding index (45) showed that 31% and 24% of women in the acupuncture and sham groups guessed that they were receiving acupuncture beyond chance. This may indicate women's desire to please the investigators or “wishful thinking” that they had received active treatment and may represent response bias to the question. The first limitation of our trial and acupuncture clinical research more broadly is the lack of an inert sham comparison treatment. Although the Park sham device was the best available sham acupuncture method at the time of study design, its validity as an effective control treatment needs further determination. 
It creates a needle-prick sensation, essential for the patient to believe that a needle has been inserted; however, this sensation produces minor physiologic effects (26). The interpretation of sham-controlled acupuncture trials must occur within this context. However, what we have successfully examined is the effect of needling compared with pressure from a blunt needle. Second, despite our broad recruitment strategy, our findings can only be generalized to Caucasian Australian women
with kidney yin deficiency. Nonetheless, 87% of otherwise eligible women met criteria for symptoms of kidney yin deficiency, consistent with previous studies (17, 18). Third, our method of Chinese medicine diagnosis was a simplified version of usual practice and failed to define secondary diagnoses. In addition, our acupuncturists could not be blinded, but we provided comprehensive training and performed quality assurance visits to minimize bias. Finally, our findings cannot be generalized to women with bilateral salpingo-oophorectomy or worsening of VMSs after breast cancer; we excluded these women because they have more severe VMSs (47, 48). Future research should examine the role of acupuncture in breast cancer.

We used a standardized acupuncture protocol that targeted women with kidney yin deficiency. Although use of standardized treatment protocols is accepted practice across similar acupuncture trials (49–51), it does not fully mirror clinical practice, in which individualized treatment is the norm. Furthermore, there is no evidence of benefit from individualized treatments for VMSs (20), whereas the use of standardized prescriptions allows for direct comparison of a specific “dose” of acupuncture and minimizes variation. We followed current best-practice recommendations when designing our protocol (22, 52), performing a thorough literature and textbook review and consulting acupuncture experts. We emphasized the need for de qi, which is considered to contribute to the acupuncture effect (22). Nevertheless, we acknowledge the uncertainty surrounding what constitutes an adequate acupuncture “dose” (52).

The findings from 1 pragmatic trial on acupuncture for VMSs suggest that factors other than point selection may be important in treating these symptoms. In this study, acupuncturists were free to diagnose and treat as per usual clinical practice.
Further analysis revealed that a core group of 8 acupuncture points were used, regardless of actual Chinese medicine syndrome differentiation, and point selection did not differ between persons who responded and those who did not (18); our standardized prescription comprised 6 of these core points. Future research should examine the relative contribution of components of acupuncture treatment. Trials of VMS treatments generally report a placebo effect of up to 50%, especially if participants have a higher baseline
VMS burden and receive interventions involving extra attention (53). The complex steps involved in using the sham device and increased focus on symptoms may have contributed to our 40% improvement in both groups.

In conclusion, we found no additional benefit from needle insertion over blunt needling for menopausal HFs in women who met criteria for signs of kidney yin deficiency. Unless further high-quality evidence emerges, we cannot recommend skin-penetrating acupuncture as an efficacious treatment of this indication; the effects, if any, of acupuncture on these symptoms seem to be unrelated to needling.

From Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia; Royal Melbourne Institute of Technology University, Bundoora, Victoria, Australia; Southern Cross University, Lismore, Queensland, Australia; School of Rehabilitation Therapy, Queen's University, Kingston, Ontario, Canada; and Monash Health, Clayton, Victoria, Australia.

Note: Dr. Ee confirms, as primary and corresponding author, that she had full access to all of the study data and takes responsibility for its integrity and the accuracy of the data analysis.

Acknowledgment: The authors thank the research assistants, Kitty Novy, Mary Kyriakides, and others, for their hard work and dedication to the day-to-day running of the project; Melanie Gibson for assistance with recruitment and survey management; Annie Rahilly and the staff at Jean Hailes for Women's Health for assistance with recruitment; Ben Metcalf for designing the randomization spreadsheet; Dr. Zhen Zheng, Professor Caroline Smith, and the other acupuncture experts for providing expert opinions on the treatment protocol; Vincent Cheong for producing the DVD on the Park sham device; Dr. Vicki Kotsirilos for providing a consultation space for Chinese medicine interviews; and Johannah Shergis for replacing Dr. Ee while Dr. Ee was on maternity leave. They also thank Mary-Jo Bevin, George Dellas, Suzy McCleary, John McDonald, Melanie Wells, Tanya Wilson, Richard Zeng, and all other project acupuncturists, as well as the study participants.

Grant Support: By the National Health and Medical Research Council (NHMRC) of Australia (project grant APP 1004406). Dr. Pirotta is supported by an NHMRC Career Development Fellowship. Dr. Ee is supported by an NHMRC Postgraduate Scholarship. Dr. Teede is supported by an NHMRC Practitioner Fellowship.

Disclosures: Dr. Pirotta reports grants and other from the NHMRC during the conduct of the study. Authors not named here have disclosed no conflicts of interest. Forms can be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M15-1380.

Reproducible Research Statement: Study protocol: Available at www.trialsjournal.com/content/15/1/224. Statistical code: Available from Dr. Pirotta (e-mail, [email protected]). Data set: Certain portions of the analytic data set are available from Dr. Pirotta (e-mail, [email protected]).

Requests for Single Reprints: Carolyn Ee, MBBS, Department of General Practice, University of Melbourne, 200 Berkeley Street, Carlton, 3053 Victoria, Australia; e-mail, ccee@unimelb.edu.au.

Current author addresses and author contributions are available at www.annals.org.

References

1. Archer DF, Sturdee DW, Baber R, de Villiers TJ, Pines A, Freedman RR, et al. Menopausal hot flushes and night sweats: where are we now? Climacteric. 2011;14:515-28. [PMID: 21848495] doi:10.3109/13697137.2011.608596
2. Whiteley J, Wagner JS, Bushmakin A, Kopenhafer L, Dibonaventura M, Racketa J. Impact of the severity of vasomotor symptoms on health status, resource use, and productivity. Menopause. 2013;20:518-24. [PMID: 23403500] doi:10.1097/GME.0b013e31827d38a5
3. Villaseca P. Non-estrogen conventional and phytochemical treatments for vasomotor symptoms: what needs to be known for practice. Climacteric. 2012;15:115-24. [PMID: 22148909] doi:10.3109/13697137.2011.624214
4. Nahin RL, Barnes PM, Stussman BJ, Bloom B. Costs of complementary and alternative medicine (CAM) and frequency of visits to CAM practitioners: United States, 2007. Natl Health Stat Report. 2009:1-14. [PMID: 19771719]
5. Sassarini J, Lumsden MA. Non-hormonal management of vasomotor symptoms. Climacteric. 2013;16 Suppl 1:31-6. [PMID: 23848489] doi:10.3109/13697137.2013.805525
6. van der Sluijs CP, Bensoussan A, Liyanage L, Shah S. Women's health during mid-life survey: the use of complementary and alternative medicine by symptomatic women transitioning through menopause in Sydney. Menopause. 2007;14:397-403. [PMID: 17202872]
7. Witt CM, Pach D, Brinkhaus B, Wruck K, Tag B, Mank S, et al. Safety of acupuncture: results of a prospective observational study with 229,230 patients and introduction of a medical information and consent form. Forsch Komplementmed. 2009;16:91-7. [PMID: 19420954] doi:10.1159/000209315
8. Leung L. Neurophysiological basis of acupuncture-induced analgesia—an updated review. J Acupunct Meridian Stud. 2012;5:261-70. [PMID: 23265077] doi:10.1016/j.jams.2012.07.017
9. Borud EK, Alraek T, White A, Fonnebo V, Eggen AE, Hammar M, et al. The Acupuncture on Hot Flushes Among Menopausal Women (ACUFLASH) study, a randomized controlled trial. Menopause. 2009;16:484-93. [PMID: 19423996] doi:10.1097/gme.0b013e31818c02ad
10. Dodin S, Blanchet C, Marc I, Ernst E, Wu T, Vaillancourt C, et al. Acupuncture for menopausal hot flushes. Cochrane Database Syst Rev. 2013;7:CD007410. [PMID: 23897589] doi:10.1002/14651858.CD007410.pub2
11. Cho SH, Whang WW. Acupuncture for vasomotor menopausal symptoms: a systematic review. Menopause. 2009;16:1065-73. [PMID: 19424092] doi:10.1097/gme.0b013e3181a48abd
12. Lee MS, Shin BC, Ernst E. Acupuncture for treating menopausal hot flushes: a systematic review. Climacteric. 2009;12:16-25. [PMID: 19116803] doi:10.1080/13697130802566980
13. Chiu HY, Pan CH, Shyu YK, Han BC, Tsai PS. Effects of acupuncture on menopause-related symptoms and quality of life in women in natural menopause: a meta-analysis of randomized controlled trials. Menopause. 2015;22:234-44. [PMID: 25003620] doi:10.1097/GME.0000000000000260
14. Nir Y, Huang MI, Schnyer R, Chen B, Manber R. Acupuncture for postmenopausal hot flashes. Maturitas. 2007;56:383-95. [PMID: 17182200]
15. Pirotta M, Ee C, Teede H, Chondros P, French S, Myers S, et al. Acupuncture for menopausal vasomotor symptoms: study protocol for a randomised controlled trial. Trials. 2014;15:224. [PMID: 24925094] doi:10.1186/1745-6215-15-224
16. Sloan JA, Loprinzi CL, Novotny PJ, Barton DL, Lavasseur BI, Windschitl H. Methodologic lessons learned from hot flash studies. J Clin Oncol. 2001;19:4280-90. [PMID: 11731510]

17. Zell B, Hirata J, Marcus A, Ettinger B, Pressman A, Ettinger KM. Diagnosis of symptomatic postmenopausal women by traditional Chinese medicine practitioners. Menopause. 2000;7:129-34. [PMID: 10746896]
18. Borud EK, Alræk T, White A, Grimsgaard S. The acupuncture treatment for postmenopausal hot flushes (Acuflash) study: traditional Chinese medicine diagnoses and acupuncture points used, and their relation to the treatment response. Acupunct Med. 2009;27:101-8. [PMID: 19734379] doi:10.1136/aim.2009.000612
19. Maciocia G. Obstetrics and Gynecology in Chinese Medicine. 1st ed. New York: Churchill Livingstone; 1998.
20. Borud E, Grimsgaard S, White A. Menopausal problems and acupuncture. Auton Neurosci. 2010;157:57-62. [PMID: 20447875] doi:10.1016/j.autneu.2010.04.004
21. Borud E, White A. A review of acupuncture for menopausal problems. Maturitas. 2010;66:131-4. [PMID: 20060667] doi:10.1016/j.maturitas.2009.12.010
22. White AR, Filshie J, Cummings TM; International Acupuncture Research Forum. Clinical trials of acupuncture: consensus recommendations for optimal treatment, sham controls and blinding. Complement Ther Med. 2001;9:237-45. [PMID: 12184353]
23. Park J, White A, Stevinson C, Ernst E, James M. Validating a new non-penetrating sham acupuncture device: two randomised controlled trials. Acupunct Med. 2002;20:168-74. [PMID: 12512790]
24. Lund I, Lundeberg T. Are minimal, superficial or sham acupuncture procedures acceptable as inert placebo controls? Acupunct Med. 2006;24:13-5. [PMID: 16618044]
25. Lundeberg T, Lund I, Sing A, Näslund J. Is placebo acupuncture what it is intended to be? Evid Based Complement Alternat Med. 2011;2011:932407. [PMID: 19525330] doi:10.1093/ecam/nep049
26. Pariente J, White P, Frackowiak RS, Lewith G. Expectancy and belief modulate the neuronal substrates of pain treated by acupuncture. Neuroimage. 2005;25:1161-7. [PMID: 15850733]
27. Lewis JE, Hilditch JR, Wong CJ. Further psychometric property development of the Menopause-Specific Quality of Life questionnaire and development of a modified version, MENQOL-Intervention questionnaire. Maturitas. 2005;50:209-21. [PMID: 15734602]
28. Snaith RP. The Hospital Anxiety and Depression Scale. Health Qual Life Outcomes. 2003;1:29. [PMID: 12914662]
29. Devilly GJ, Borkovec TD. Psychometric properties of the credibility/expectancy questionnaire. J Behav Ther Exp Psychiatry. 2000;31:73-86. [PMID: 11132119]
30. Freeman EW, Sherif K. Prevalence of hot flushes and night sweats around the world: a systematic review. Climacteric. 2007;10:197-214. [PMID: 17487647]
31. Whiteman MK, Staropoli CA, Benedict JC, Borgeest C, Flaws JA. Risk factors for hot flashes in midlife women. J Womens Health (Larchmt). 2003;12:459-72. [PMID: 12869293]
32. Australian Bureau of Statistics. 6523.0 – Household Income and Income Distribution, Australia, 2011–12. Canberra, Australia: Australian Bureau of Statistics; 2013. Accessed at www.abs.gov.au/AUSSTATS/[email protected]/Lookup/6523.0Main+Features22011-12?OpenDocument on 28 November 2015.
33. Soules MR, Sherman S, Parrott E, Rebar R, Santoro N, Utian W, et al. Executive summary: Stages of Reproductive Aging Workshop (STRAW). Climacteric. 2001;4:267-72. [PMID: 11770182]
34. Freeman EW, Sammel MD, Lin H, Gracia CR, Pien GW, Nelson DB, et al. Symptoms associated with menopausal transition and reproductive hormones in midlife women. Obstet Gynecol. 2007;110:230-40. [PMID: 17666595]
35. Eriksson I, Undén AL, Elofsson S. Self-rated health. Comparisons between three different measures. Results from a population study. Int J Epidemiol. 2001;30:326-33. [PMID: 11369738]

154 Annals of Internal Medicine • Vol. 164 No. 3 • 2 February 2016

36. Cohen S, Kamarck T, Mermelstein R. A global measure of perceived stress. J Health Soc Behav. 1983;24:385-96. [PMID: 6668417]
37. Whiteman MK, Staropoli CA, Langenberg PW, McCarter RJ, Kjerulff KH, Flaws JA. Smoking, body mass, and hot flashes in midlife women. Obstet Gynecol. 2003;101:264-72. [PMID: 12576249]
38. Hyde Riley E, Inui TS, Kleinman K, Connelly MT. Differential association of modifiable health behaviors with hot flashes in perimenopausal and postmenopausal women. J Gen Intern Med. 2004;19:740-6. [PMID: 15209587]
39. Wyrwich KW, Spratt DI, Gass M, Yu H, Bobula JD. Identifying meaningful differences in vasomotor symptoms among menopausal women. Menopause. 2008;15:698-705. [PMID: 18369313] doi:10.1097/gme.0b013e31815f892d
40. Butt DA, Deng LY, Lewis JE, Lock M. Minimal decrease in hot flashes desired by postmenopausal women in family practice. Menopause. 2007;14:203-7. [PMID: 17099324]
41. Gerlinger C, Gude K, Hiemeyer F, Schmelter T, Schäfers M. An empirically validated responder definition for the reduction of moderate to severe hot flushes in postmenopausal women. Menopause. 2012;19:799-803. [PMID: 22228322] doi:10.1097/gme.0b013e31823de8ba
42. White IR, Carpenter J, Horton NJ. Including all individuals is not enough: lessons for intention-to-treat analysis. Clin Trials. 2012;9:396-407. [PMID: 22752633] doi:10.1177/1740774512450098
43. Fitzmaurice G, Laird N, Ware J. Applied Longitudinal Analysis. 2nd ed. New Jersey: John Wiley and Sons; 2011.
44. Little RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc. 1993;88:125-34.
45. Bang H, Ni L, Davis CE. Assessment of blinding in clinical trials. Control Clin Trials. 2004;25:143-56. [PMID: 15020033]
46. Kim KH, Kang KW, Kim DI, Kim HJ, Yoon HM, Lee JM, et al. Effects of acupuncture on hot flashes in perimenopausal and postmenopausal women—a multicenter randomized clinical trial. Menopause. 2010;17:269-80. [PMID: 19907348] doi:10.1097/gme.0b013e3181bfac3b
47. Keenan NL, Mark S, Fugh-Berman A, Browne D, Kaczmarczyk J, Hunter C. Severity of menopausal symptoms and use of both conventional and complementary/alternative therapies. Menopause. 2003;10:507-15. [PMID: 14627858]
48. Bordeleau L, Pritchard K, Goodwin P, Loprinzi C. Therapeutic options for the management of hot flashes in breast cancer survivors: an evidence-based review. Clin Ther. 2007;29:230-41. [PMID: 17472816]
49. Kim DI, Jeong JC, Kim KH, Rho JJ, Choi MS, Yoon SH, et al. Acupuncture for hot flushes in perimenopausal and postmenopausal women: a randomised, sham-controlled trial. Acupunct Med. 2011;29:249-56. [PMID: 21653660] doi:10.1136/aim.2011.004085
50. Painovich JM, Shufelt CL, Azziz R, Yang Y, Goodarzi MO, Braunstein GD, et al. A pilot randomized, single-blind, placebo-controlled trial of traditional acupuncture for vasomotor symptoms and mechanistic pathways of menopause. Menopause. 2012;19:54-61. [PMID: 21968279] doi:10.1097/gme.0b013e31821f9171
51. Vincent A, Barton DL, Mandrekar JN, Cha SS, Zais T, Wahner-Roedler DL, et al. Acupuncture for hot flashes: a randomized, sham-controlled clinical study. Menopause. 2007;14:45-52. [PMID: 17019380]
52. White A, Cummings M, Barlas P, Cardini F, Filshie J, Foster NE, et al. Defining an adequate dose of acupuncture using a neurophysiological approach—a narrative review of the literature. Acupunct Med. 2008;26:111-20. [PMID: 18591910]
53. Loprinzi CL, Barton DL. On hot flash mechanism, measurement, and treatment [Editorial]. Menopause. 2009;16:621-3. [PMID: 19436222] doi:10.1097/gme.0b013e3181a85107


Annals of Internal Medicine

Current Author Addresses:
Drs. Ee, Chondros, and Pirotta: Department of General Practice, University of Melbourne, 200 Berkeley Street, Carlton, 3053 Victoria, Australia.
Dr. Xue: Professor, School of Health Sciences, Royal Melbourne Institute of Technology (RMIT) University, PO Box 71, Bundoora, 3083 Victoria, Australia.
Dr. Myers: NatMed Research Unit, Southern Cross University, PO Box 157, Lismore, Queensland, Australia.
Dr. French: Associate Professor, School of Rehabilitation Therapy, Queen's University, Louise D. Acton Building, 31 George Street, Kingston, Ontario K7L 3N6, Canada.
Dr. Teede: Director, Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Level 1, 43-51 Kanooka Grove, Clayton, 3168 Victoria, Australia.

Author Contributions:
Conception and design: C. Ee, C. Xue, P. Chondros, S.P. Myers, S.D. French, H. Teede, M. Pirotta.
Analysis and interpretation of the data: C. Ee, C. Xue, P. Chondros, S.P. Myers, S.D. French, H. Teede, M. Pirotta.
Drafting of the article: C. Ee, C. Xue, P. Chondros, S.D. French, H. Teede, M. Pirotta.
Critical revision of the article for important intellectual content: C. Ee, C. Xue, P. Chondros, S.P. Myers, S.D. French, H. Teede, M. Pirotta.
Final approval of the article: C. Ee, C. Xue, P. Chondros, S.P. Myers, S.D. French, H. Teede, M. Pirotta.
Provision of study materials or patients: C. Ee, C. Xue.
Statistical expertise: P. Chondros.
Obtaining of funding: C. Ee, C. Xue, P. Chondros, S.P. Myers, S.D. French, H. Teede, M. Pirotta.
Administrative, technical, or logistic support: C. Ee, H. Teede.
Collection and assembly of data: C. Ee, P. Chondros, M. Pirotta.

APPENDIX: METHODS AND RESULTS

Methods

Protocol Changes and Deviations

Changes to Protocol After Commencement. We initially excluded women with any acupuncture experience because we believed that they might be more likely to guess the treatment allocation. However, this excluded many eligible women and slowed recruitment, threatening trial viability. On advice from the sham acupuncture needle inventor, Dr. Jongbae Park, we included participants with acupuncture experience more than 2 years before enrollment, from October 2011. We also excluded women in the menopausal transition because of the fluctuation in hormone levels and symptoms during this phase; however, from July 2012, we included women in the late menopausal transition in light of new guidelines defining this reproductive phase (54).

Protocol Deviations. HF Diary Scoring: In October 2011, we noted an error in scoring some HF diaries that overestimated the HF score. Three participants had incorrectly been deemed eligible; of these, 1 was not eligible after Chinese medicine examination and 2 had been randomly assigned, including 1 who had received
1 treatment. The error in the scoring was rectified, and the participant who had started treatment was offered continued treatment but declined. All randomly assigned participants were included in the analysis.

Acupuncture Experience: At EOT, we noted that 62% of interested participants had not been screened for acupuncture experience due to a clerical error affecting the online screening survey. Of the 200 enrolled women affected, on subsequent clarification, 20 had had acupuncture within the preceding 2 years. Data on blinding were available for 16 of these women and suggested that recent acupuncture treatment did not seem to have influenced participants' ability to guess the allocated treatment. None of the women in either group guessed that they were receiving sham acupuncture; only 3 out of 9 women in the acupuncture group correctly guessed that they had received acupuncture. All randomly assigned participants were included in the analysis.

Receiving Wrong Treatment: Two women who were allocated to sham acupuncture actually received acupuncture. They were analyzed in the sham group.

Effect on Outcomes: It is unlikely that these deviations had any significant effect on the outcomes of our study. We included all of the protocol violators in our analysis. Because these 20 women were enrolled and randomly assigned to a study group, they were not excluded from the intention-to-treat analysis. There were 10 women in each group. In a sensitivity analysis that excluded these 20 participants (modified intention-to-treat analysis), results remained relatively unchanged (not shown).

Ethics Approvals. As well as being approved by the Human Research Ethics Committee of the University of Melbourne (1135293 16/6/2011), the trial received additional ethics approvals from Monash University (2011001242), Royal Melbourne Institute of Technology (RMIT) University (1135293), and Southern Cross University (ECN-11-192) in Australia.

Chinese Medicine Diagnosis

The Chinese medicine theoretical framework that guides acupuncture practice is complex and foreign to the West (55). In Chinese medicine, disease is viewed as an outcome of imbalance of qi, yin, or yang, concepts that do not have a precise analogue in western medicine (55). According to Chinese medicine principles, menopause results from the age-related decline of kidney essence, which may be kidney yin or kidney yang (19). Kidney essence seems to loosely relate to the female reproductive system and female hormones. Giovanni Maciocia, a scholar and practitioner in Chinese medicine whose textbooks are used in major acupuncture colleges around the world, states in his book,
Obstetrics and Gynecology in Chinese Medicine, that “a deficiency of the Kidney-Essence (in its Yin or Yang aspect) is always at the root of menopausal problems” (19). However, a woman may have a variant of kidney yin deficiency, such as combined kidney and liver yin deficiency. The symptoms of kidney yin deficiency in menopause are dizziness, tinnitus, malar flush, night sweating, HFs, “five-palm heat” (heat felt in the soles, palms, and chest), sore back, dry mouth, dry hair, dry skin, itching, and constipation (19). Of note, kidney yin deficiency is not an exclusive syndrome that is restricted to menopausal women; it may be diagnosed in other patient groups.

To ensure reproducibility in future research, Chinese medicine diagnoses in this trial occurred in a structured and standardized manner. The primary aim of the diagnostic procedure was to exclude women who did not have kidney yin deficiency or one of its variants. We aimed to exclude women whose main diagnosis was kidney yang deficiency, a syndrome that is somewhat the opposite of kidney yin deficiency and characterized by cold hands and feet and a slow pulse. However, women with mixed kidney yin and yang deficiency were included as long as the dominant syndrome was kidney yin deficiency. Equivocal cases were discussed with Dr. Xue, a professor of Chinese medicine and an experienced acupuncture clinician and researcher. The Chinese medicine questionnaire used in this trial was developed by researchers at Royal Melbourne Institute of Technology (RMIT) University with a method previously used in other clinical trials at the institution. Although this method has not been formally validated, it allows for greater standardization of the diagnostic procedure, makes our methods explicit and reproducible, and reduces clinical heterogeneity.
However, a disadvantage of this standardized and structured method is that it is not exactly consistent with clinical practice, in which a more comprehensive and flexible history is taken. Furthermore, our method does not allow for secondary diagnoses, which is not consistent with the complexity of syndromes that can be present, such as persons with deficiency of both kidney yin and yang. Despite this, our approach is supported by empirical evidence that indicates that there is good interpractitioner agreement about the presence of kidney yin deficiency in menopausal women (17, 18) but less agreement for related or secondary diagnoses (17).

Chinese Medicine History. In the Chinese medicine questionnaire, women were asked to rate the frequency and severity of 5 typical symptoms of kidney yin and yang deficiency. The cardinal symptom for kidney yin deficiency was “sensations of heat in the body with sweating,” and the cardinal symptom for kidney yang deficiency was “cold limbs”; these symptoms were given a double score because they are considered pathognomonic. The questionnaire was then scored by a qualified Chinese medicine practitioner (either Dr. Ee or Johannah Shergis, a fellow researcher who replaced Dr. Ee while she was on maternity leave from January to July 2013). Women who scored higher for kidney yin deficiency than for kidney yang deficiency were invited to attend the Chinese medicine examination component of diagnosis.

Chinese Medicine Examination. The tongue and pulse were examined, with a score given to each component for signs of kidney yin and yang deficiency. The method of assessing the tongue and pulse was standardized (for example, a score of 1 was given for a tongue body that was red in color and had a scant coat, the typical tongue appearance in yin deficiency). Half points were given if only 1 aspect was present. These scores were then added to the symptom scores for kidney yin and yang deficiency, and women who scored higher overall for kidney yin deficiency were eligible to participate in the trial. However, acupuncturists were also given the discretion to use other clinical skills to make a diagnosis, such as observing the patient's spirit, movement, and body shape, if they believed that a different diagnosis should be made. Any concerns about diagnosis were discussed with Dr. Xue.

Standardization of the Process. Dr. Ee performed most tongue and pulse examinations until taking maternity leave. Thereafter, the treating acupuncturists performed tongue and pulse examinations. Although tongue and pulse examination is a fundamental skill in Chinese medicine, all acupuncturists received training on how to perform the examination and allocate scores for kidney yin and yang signs. Training was provided by Dr. Ee, who designed and piloted the methods in consultation with Dr. Xue. For quality assurance purposes, the tongues of the first 20 women to receive diagnoses from each acupuncturist were photographed after obtaining written consent. The deidentified photographs were then viewed by either Dr. Ee or Dr.
Xue to reconcile the diagnosis with the clinical picture. Thereafter, photographs were only taken if the diagnosis was unclear.

Intervention and Control

Acupuncture. Needles in the acupuncture group were inserted unilaterally and manipulated manually by lifting, thrusting, twirling, and rotating until de qi was obtained.

Standardization of Intervention and Quality Assurance Visits

Training was provided by Dr. Ee, an experienced medical and Chinese medicine acupuncturist and researcher who developed and piloted the trial methods. All providers received a minimum of 2 hours of face-to-face training before participating in the research. This
training included an introduction to research methods and a practical demonstration and time to practice use of the Park sham device. Training was complemented by a comprehensive training manual and a DVD demonstrating the use of the Park sham device. A list of frequently asked questions with suggested answers was provided. Dr. Ee performed quality assurance visits with 9 of the acupuncturists to ensure consistency in intervention delivery. During quality assurance visits, Dr. Ee observed a treatment (with consent from the participant), used a checklist (Appendix Table 3) to identify any protocol deviations, and gave corrective feedback (if needed) at the end of the visit. Dr. Myers visited 1 acupuncturist in Queensland, observed trial procedures, and e-mailed Dr. Ee deidentified digital photographs of needle insertion locations. The remaining 5 rurally based acupuncturists were in regular contact with Dr. Ee via e-mail and telephone.

Results

Sensitivity Analysis for Departures From the Missing Data Assumption

Sensitivity analysis with a pattern-mixture model (44) assessed whether the primary outcome, HF score at EOT, was robust to departures from the missing data assumption. The main analysis used linear mixed-effects regression to estimate the intervention effect on HF score at EOT, adjusted for baseline HF score and acupuncturist. Results from linear mixed-effects models are valid under the assumption that data are missing at random. Let p1 and p0 be the proportions of women with missing data at EOT in the acupuncture and sham groups, respectively, and let the parameter δ denote the difference in mean HF score between the women with missing data and those with observed data. Departure from the missing at random assumption was assessed by adding the quantity Δ to the estimated treatment effect for HF score at EOT in the main analysis (42). For the sensitivity analysis, the difference between the missing and observed HF scores was assumed to vary in the same way in both groups (that is, Δ = (p1 − p0)δ), to vary only in the acupuncture group (Δ = p1δ), or to vary only in the sham group (Δ = −p0δ). A range of values between −15 and 15 was considered for δ, the difference in mean HF score between the women with missing data and those observed at EOT. Because a higher HF score indicates poorer outcome, negative values of δ assume that women with missing data have a lower (better) HF score on average than observed women, and positive values of δ assume that women with missing data have a higher (worse) mean HF score than the mean observed HF score. The main analysis reported in Appendix Table 5 assumed that women with missing data had the same mean HF
score as those observed (that is, δ = 0 in both study groups). Results from the sensitivity analysis are presented in Appendix Figure 4 for selected parameter values of δ. The main analysis presented in the Figure and Appendix Table 5 showed that, under the assumption that data are missing at random, there was no evidence to suggest that acupuncture improved the HF score at EOT. The estimated intervention effect for HF score at EOT remained relatively unchanged for the range of values given in Appendix Figure 4 when the difference between missing and observed mean HF scores was assumed to vary the same way in both groups. Because the proportion of women with missing outcome data was similar in both groups (that is, 14% [23 of 163] in the acupuncture group and 13% [22 of 164] in the sham group) and the reasons for withdrawal or nonresponse were similar (Appendix Figure 3), we could reasonably assume that the mean HF scores for women with missing outcome data did not differ considerably in the 2 groups. Appendix Figure 4 shows that departure from the missing at random assumption had a similar effect in the acupuncture and sham groups. This was because the proportion of women with missing outcome data was similar in the 2 groups. Thus, any variation in the estimated intervention effects that did arise in the sensitivity analysis was primarily due to assuming that the difference between missing and observed mean HF score differed between groups. However, strong assumptions would need to be made about the departure from the missing at random assumption in the 2 groups to change the conclusions of the study. For example, the results would favor the sham group if it was assumed that women with missing data had a mean HF score 13 points greater than the observed women at EOT in only the acupuncture group, in which case the estimated intervention effect for HF score was 2.54 (CI, 0.14 to 4.96).
When departure from missing at random was only in the sham group, women with missing data needed to have a mean HF score at least 15 points lower than the women with outcome data at EOT for the results to favor the sham group. Difference in the mean HF score between the women with missing and observed data at EOT in only 1 group would need to be much greater than 15 for the results to favor the acupuncture group (results not shown). Because the SD for the HF score at EOT was 13.2, such large differences in the mean HF score between women with missing and observed data seem implausible.
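The three scenarios described above reduce to a simple shift of the main-analysis estimate. The following is a minimal sketch of that arithmetic only, assuming the additive form Δ stated in the text; the function name `mnar_shift` and the scenario labels are hypothetical, and the sketch does not refit the pattern-mixture model or produce confidence intervals.

```python
def mnar_shift(delta, p1, p0, scenario):
    """Return the quantity Δ that is added to the main-analysis estimate.

    delta: assumed mean HF score difference (missing minus observed)
    p1, p0: proportions with missing EOT data (acupuncture, sham)
    scenario: hypothetical label for which group departs from missing at random
    """
    if scenario == "both":
        return (p1 - p0) * delta
    if scenario == "acupuncture_only":
        return p1 * delta
    if scenario == "sham_only":
        return -p0 * delta
    raise ValueError(f"unknown scenario: {scenario}")


# Values quoted in the text above.
P1 = 23 / 163        # acupuncture group, missing at EOT
P0 = 22 / 164        # sham group, missing at EOT
MAIN_EFFECT = 0.33   # acupuncture minus sham, HF score at EOT

if __name__ == "__main__":
    for scenario in ("both", "acupuncture_only", "sham_only"):
        for delta in (-15, 0, 15):
            shifted = MAIN_EFFECT + mnar_shift(delta, P1, P0, scenario)
            print(f"{scenario:>16}  delta={delta:+3d}  effect={shifted:+.2f}")
```

Because p1 and p0 are nearly equal, the "both" scenario barely moves the estimate, which is the pattern the text describes. This simple shift should not be expected to reproduce the published estimates exactly, since those come from the fitted model.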

Secondary Outcomes

Appendix Table 6 shows the estimated means and between-group differences for secondary outcomes of quality of life, anxiety, and depression.

Annals of Internal Medicine • Vol. 164 No. 3 • 2 February 2016

Other HF Treatments. At 3 months after treatment, similar numbers of women in the acupuncture and sham groups were using other treatments for their HFs (26 for the acupuncture group and 27 for the sham group). By 6 months after treatment, 5 more women in the acupuncture group were using HF treatments; in contrast, 6 women in the sham group had stopped using treatments. Seven women in each group used more than 1 treatment at 3 months. At 6 months, 6 women in the acupuncture group and 1 woman in the sham group were using more than 1 treatment. More women used acupuncture for their HFs at both the 3- and 6-month follow-up in the acupuncture group than in the sham group.

Quality Assurance. Two of the researchers, Drs. Xue and Ee, who are experienced Chinese medicine practitioners, independently examined 110 tongue photographs from the 15 acupuncturists and correlated the appearances with the Chinese medicine diagnosis made by treating acupuncturists. There was full agreement between Drs. Xue and Ee, and they had no concerns about the diagnostic process of the treating acupuncturists. All diagnoses by the 15 project acupuncturists were correctly correlated with the appearances from photographs.

Blinding

The James blinding index is a variation on the κ statistic in which 1 represents perfect blinding. The Bang blinding index for each group represents the proportion of participants making a correct treatment guess beyond chance; 0 represents perfect blinding, a positive index indicates a correct guess, and a negative index indicates a guess in the opposite direction (45). The James blinding index indicated a high level of blinding (0.79 [CI, 0.76 to 0.82]). Appendix Table 7 shows the results for the Bang blinding index for each treatment group.

Web-Only References

54. Harlow SD, Gass M, Hall JE, Lobo R, Maki P, Rebar RW, et al; STRAW + 10 Collaborative Group. Executive summary of the Stages of Reproductive Aging Workshop + 10: addressing the unfinished agenda of staging reproductive aging. Fertil Steril. 2012;97:843-51. [PMID: 22341880] doi:10.1016/j.fertnstert.2012.01.128

55. Kaptchuk TJ. Acupuncture: theory, efficacy, and practice. Ann Intern Med. 2002;136:374-83. [PMID: 11874310]

Appendix Figure 1. Chinese medicine questionnaire used to assess kidney yin deficiency.

Columns: Question | Frequency (0 = Never, 1 = Occasionally, 2 = Most of the Time, 3 = Always) | Severity (0 = None, 1 = Mild, 2 = Moderate, 3 = Severe) | Score

Questions:
1. Do you have sensations of heat in the body with sweating?*
2. Do you have feelings of heat in the palms, soles, and chest?
3. Do you have a dry mouth or dry, hard stool?
4. Do you have aching and soreness in the lower back and knees?
5. Do you have dizziness or tinnitus?

Symptom score: Sum total of all individual scores (maximum = 36)
Tongue: 1 = red with scanty coating; 0 = other (description)
Pulse: 1 = rapid and fine; 0 = other (description)
Signs score: Sum total of tongue and pulse score (maximum = 2)
Final score: Symptom score + signs score = maximum of 38

The final scores for kidney yin and yang deficiency were compared, and women who scored higher for kidney yin deficiency were eligible at this point. Scores were filled in only the unshaded areas of the questionnaire. * Scores for this symptom were multiplied by 2 because it is considered a cardinal symptom.


Appendix Figure 2. Chinese medicine questionnaire used to assess kidney yang deficiency.

Columns: Question | Frequency (0 = Never, 1 = Occasionally, 2 = Most of the Time, 3 = Always) | Severity (0 = None, 1 = Mild, 2 = Moderate, 3 = Severe) | Score

Questions:
1. Do your limbs feel cold?*
2. Do you have dizziness or vertigo?
3. Do you have ache and soreness in the lower back?
4. Do you have frequency of urination?
5. Are you low in energy, with a pale complexion?

Symptom score: Sum total of all individual scores (maximum = 36)
Tongue: 1 = pale color with thin coating; 0 = other (description)
Pulse: 1 = sunken, fine pulse without force; 0 = other (description)
Signs score: Sum total of tongue and pulse score (maximum = 2)
Final score: Symptom score + signs score = maximum of 38

The final scores for kidney yin and yang deficiency were compared, and women who scored higher for kidney yin deficiency were eligible at this point. Scores were filled in only the unshaded areas of the questionnaire. * Scores for this symptom were multiplied by 2 because it is considered a cardinal symptom.
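The scoring rules in Appendix Figures 1 and 2 can be sketched as follows. This is an illustration under one stated assumption: each of the 5 questions receives both a frequency and a severity rating (0 to 3 each), which reproduces the stated maximum of 36 once the cardinal symptom is doubled. The figures note that only unshaded areas were filled in, so the actual forms may have differed. The names `Questionnaire`, `final_score`, and `eligible` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Questionnaire:
    frequency: Sequence[int]  # five ratings, each 0-3 (assumed layout)
    severity: Sequence[int]   # five ratings, each 0-3 (assumed layout)
    tongue: int               # 1 if the pattern-specific tongue sign is present, else 0
    pulse: int                # 1 if the pattern-specific pulse sign is present, else 0


def final_score(q: Questionnaire) -> int:
    """Symptom score (maximum 36) plus signs score (maximum 2)."""
    if len(q.frequency) != 5 or len(q.severity) != 5:
        raise ValueError("expected ratings for five questions")
    if any(not 0 <= r <= 3 for r in (*q.frequency, *q.severity)):
        raise ValueError("ratings must be between 0 and 3")
    if q.tongue not in (0, 1) or q.pulse not in (0, 1):
        raise ValueError("tongue and pulse must be 0 or 1")
    per_question = [f + s for f, s in zip(q.frequency, q.severity)]
    per_question[0] *= 2  # question 1 is the cardinal symptom and counts double
    return sum(per_question) + q.tongue + q.pulse


def eligible(yin: Questionnaire, yang: Questionnaire) -> bool:
    """A woman was eligible at this point if her yin score exceeded her yang score."""
    return final_score(yin) > final_score(yang)
```

For example, a questionnaire with every rating at 3 and both signs present yields the maximum final score of 38.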

Appendix Table 1. Location of Points Used in the Acupuncture Group

| Acupoint (Standard Abbreviation/Chinese Nomenclature) | Location | Indication | Depth of Insertion |
| --- | --- | --- | --- |
| Kidney 6 (KI6/Zhaohai) | In the depression below the tip of the medial malleolus. | Tonifies kidney yin | Up to 3 mm |
| Kidney 7 (KI7/Fuliu) | 2 cun* directly above the acupoint Kidney 3 on the anterior border of the Achilles tendon. (Kidney 3 is located in the depression between the tip of the medial malleolus and the Achilles tendon.) | Tonifies kidney yang and stops night sweating | Up to 15 mm |
| Spleen 6 (SP6/Sanyinjiao) | 3 cun directly above the tip of the medial malleolus. | Nourishes kidney, heart, and liver yin | Up to 20 mm |
| Heart 6 (HT6/Yinxi) | When the palm faces upward, the point is on the radial side of the tendon of the flexor carpi ulnaris muscle, 0.5 cun above the transverse crease of the wrist. | Together with KI7 stops night sweating | Up to 3 mm |
| Conception Vessel 4 (CV4/Guanyuan) | On the anterior midline, 3 cun below the umbilicus. | Strengthens the uterus, nourishes the kidneys | 20-30 mm |
| Liver 3 (LR3/Taichong) | On the dorsum of the foot, in the depression distal to the junction of the first and second metatarsal bones. | Subdues rising liver yang | 7-12 mm |

* A measurement used in locating acupoints that corresponds to the distance between the 2 medial ends of the creases of the interphalangeal joints when the patient's middle finger is flexed.


Appendix Table 2. Location of Points Used in the Sham Group*

| Name Given to Point | Location | Relationship to Meridians and Acupoints | Innervation |
| --- | --- | --- | --- |
| Abd 1 | 2 cun† above and 5 cun lateral to the umbilicus | 1 cun lateral to the Spleen meridian | T8/9 |
| Arm 1 | Midway between the acupoints Lung 5 and Large Intestine 11 on the cubital crease | | C5/6 |
| Thigh 1 | On the bulge of the rectus femoris, 5 cun above the middle of the superior border of the patella | 2 cun lateral and 3 cun proximal to real acupoint Spleen 10 | L3 |

* Needles in the sham group were bilaterally “inserted.”
† A measurement used in locating acupoints that corresponds to the distance between the 2 medial ends of the creases of the interphalangeal joints when the patient's middle finger is flexed.

Appendix Table 3. Checklist Used During Quality Assurance Visits

Standardizes interaction with participants
  Warm, courteous manner
  Avoid discussing symptoms
  Answer according to FAQs*
  No special treatment for any one participant
  Aim for the same interaction with all participants

Provides participants with treatment according to the prescribed treatment protocol
  Position of participant
  Ask permission before locating points
  Wash hands before and after treatment
  Location of points to be used
  6 points used for all participants
  Familiar with use of the Park sham device (the plastic tube; needle should be inserted handle first into the tube)
  Aware that the Park sham device is to be used for all participants regardless of treatment allocation
  Insert until de qi (needle sensation) obtained (for real acupuncture group); practitioner should ask participants to describe or indicate needle sensation for each point, including sham points (the practitioner is to “pretend” needle sensation is correct for sham points/needling)
  Retain for 20 minutes (practitioner should set timer)
  Manipulation at 10 minutes
  Record treatment details on the Case Report Form

Administers Credibility and Expectancy Questionnaire at the end of first appointment

FAQ = frequently asked question. * See Supplement (available at www.annals.org).


Appendix Table 4. Treatments for Hot Flashes Used at 3 and 6 Months After Treatment, by Group

| Treatment for Hot Flashes* | 3 Months: Acupuncture (n = 124) | 3 Months: Sham (n = 121) | 6 Months: Acupuncture (n = 116) | 6 Months: Sham (n = 117) |
| --- | --- | --- | --- | --- |
| HRT | 9 | 7 | 6 | 2 |
| Antidepressants | 8 | 11 | 5 | 3 |
| Clonidine | 1 | 0 | 0 | 0 |
| Gabapentin | 1 | 0 | 0 | 0 |
| Herbal treatments | 7 | 9 | 13 | 11 |
| Acupuncture | 7 | 3 | 12 | 3 |

HRT = hormone replacement therapy. * Women may use >1 treatment.


Appendix Figure 3. Study flow diagram.

Assessed for eligibility via screening survey (n = 2140)

Not eligible (n = 1255). After screening survey: 845; after Chinese medicine history: 170; after FSH testing: 6; eligible, declined to participate: 234

Sent baseline HFD (n = 885)

Excluded (n = 538). Did not complete baseline HFD: 179; not eligible: 303; eligible, but declined to participate: 56

Attended Chinese medicine examination (n = 347)

Participants who were randomly assigned (n = 327)

Withdrew (n = 14). Family reasons: 3; no reason given: 6; did not like acupuncturist: 1; wanted active treatment: 1; not actually eligible: 1; too busy: 1; HFs not improving: 1

Excluded (n = 20). Not eligible after interview: 9; not enough flushes after repeat HFD: 5; eligible, but declined to participate: 6

Allocated to acupuncture group (n = 163). Did not commence treatment: 3 (too busy: 1; not actually eligible: 1; health reasons: 1). Commenced treatment: 160

Allocated to sham group (n = 164). Did not commence treatment (1 contributed EOT data): 2 (not actually eligible: 1; too far away: 1). Commenced treatment: 162 (received real acupuncture: 2)

4-wk follow-up (n = 149). Followed up: 142. Did not return surveys: 7 (no reason given: 6; HFs not improving: 1)

4-wk follow-up (n = 148). Followed up: 138. Did not return surveys: 10 (family reasons: 1; health reasons: 1; no reason given: 6; too busy: 1; too far away [did not start treatment]: 1)

Withdrew (n = 9). No reason given: 3; health reasons: 2; not improving: 2; too far away: 1; too busy: 1

EOT (n = 140). Followed up: 137. Did not return surveys: 3 (travel: 1; too busy: 1; no reason given: 1)

EOT (n = 145). Followed up: 142. Did not return surveys: 3 (family reasons: 1; health reasons: 1; no reason given: 1)

Withdrew (n = 10). No reason given: 9; family reasons: 1

3 mo after treatment (n = 130). Followed up: 118. Did not return surveys: 12 (too busy: 1; no reason given: 11)

3 mo after treatment (n = 129). Followed up: 117. Did not return surveys: 12 (no reason given: 12)

6 mo after treatment (n = 130). Followed up: 118. Did not return surveys: 12 (no reason given: 11; worried about side effects: 1)

6 mo after treatment (n = 129). Followed up: 119. Did not return surveys: 10 (too busy: 1; no reason given: 9)

Analyzed (n = 163). Analyzed for primary outcome: 163; analyzed for secondary outcomes*: 160

Analyzed (n = 164). Analyzed for primary outcome: 164; analyzed for secondary outcomes: 161

Withdrew (n = 16). Family reasons: 3; health reasons: 4; no reason given: 3; started a new treatment: 2; too far away: 3; HFs improving: 1

Withdrew (n = 3). Family reasons: 2; no reason given: 1

Withdrew (n = 16). Family reasons: 1; too far away (did not commence treatment): 1; no reason given: 14

EOT = end of treatment; FSH = follicle-stimulating hormone; HF = hot flash; HFD = hot flash diary. * 3 women in the acupuncture group and 3 in the sham group did not contribute any data.


Appendix Table 5. Mean HF Outcomes and Estimated Between-Group Differences for Acupuncture and Sham Groups*

| Outcome and Time Point† | Acupuncture (n = 163), Mean Value (95% CI) | Sham (n = 164), Mean Value (95% CI) | Acupuncture vs. Sham Between-Group Differences (95% CI) | P Value |
| --- | --- | --- | --- | --- |
| HF score | | | | |
| Baseline | 25.26 (22.36 to 28.16) | 25.26 (22.36 to 28.16) | – | – |
| 4 weeks | 17.08 (13.78 to 20.38) | 16.63 (13.51 to 19.74) | 0.45 (−1.85 to 2.75) | 0.70 |
| EOT | 15.36 (12.13 to 18.59) | 15.04 (11.99 to 18.08) | 0.33 (−1.87 to 2.52) | 0.77 |
| 3 months | 15.27 (11.92 to 18.62) | 14.99 (11.81 to 18.16) | 0.28 (−2.15 to 2.71) | 0.82 |
| 6 months | 15.47 (11.66 to 19.29) | 15.64 (12.36 to 18.92) | −0.17 (−3.14 to 2.79) | 0.91 |
| HF frequency | | | | |
| Baseline | 12.34 (11.03 to 13.66) | 12.34 (11.03 to 13.66) | – | – |
| 4 weeks | 8.80 (7.27 to 10.33) | 8.57 (7.14 to 9.99) | 0.23 (−0.90 to 1.36) | 0.69 |
| EOT | 7.87 (6.35 to 9.39) | 8.06 (6.62 to 9.50) | −0.19 (−1.32 to 0.94) | 0.74 |
| 3 months | 7.96 (6.42 to 9.50) | 7.55 (6.12 to 8.99) | 0.40 (−0.72 to 1.53) | 0.48 |
| 6 months | 7.65 (6.06 to 9.24) | 7.57 (6.14 to 9.00) | 0.08 (−1.10 to 1.26) | 0.90 |
| HF severity | | | | |
| Baseline | 2.05 (1.93 to 2.17) | 2.05 (1.93 to 2.17) | – | – |
| 4 weeks | 1.86 (1.72 to 2.00) | 1.78 (1.63 to 1.92) | 0.08 (−0.03 to 0.20) | 0.155 |
| EOT | 1.85 (1.70 to 1.99) | 1.72 (1.58 to 1.86) | 0.12 (0.01 to 0.24) | 0.035 |
| 3 months | 1.78 (1.63 to 1.94) | 1.75 (1.59 to 1.91) | 0.03 (−0.12 to 0.18) | 0.67 |
| 6 months | 1.79 (1.63 to 1.96) | 1.77 (1.61 to 1.92) | 0.03 (−0.13 to 0.18) | 0.73 |

HF = hot flash; EOT = end of treatment. * All estimates are from a linear mixed-effects model adjusted for baseline value of the outcome and acupuncturist. Under this model, data are assumed to be missing at random. Estimated outcome mean at baseline was constrained to be the same in both groups. † Time points 3 and 6 months after EOT.
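As a rough consistency check on Appendix Table 5, a reported difference and its 95% CI can be converted back to an approximate two-sided P value. The sketch below assumes a symmetric, normal-approximation interval; the mixed-effects model may have used a different reference distribution, and the table values are rounded, so small discrepancies are expected. The function name `p_from_ci` is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value from an estimate and its 95% CI.

    Assumes the interval is estimate +/- z_crit * SE (normal approximation).
    """
    se = (upper - lower) / (2 * z_crit)   # back out the standard error
    z = estimate / se                     # Wald statistic against a null of 0
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


if __name__ == "__main__":
    # Primary outcome: HF score at EOT, 0.33 (-1.87 to 2.52)
    print(round(p_from_ci(0.33, -1.87, 2.52), 2))
```

For the primary outcome this approximation agrees with the reported P value of 0.77 to 2 decimal places.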


Appendix Table 6. Mean MENQOL Domains, Anxiety and Depression, and Estimated Between-Group Differences for Acupuncture and Sham Groups*

| Outcome and Time Point† | Acupuncture (n = 160), Mean Value (95% CI) | Sham (n = 161), Mean Value (95% CI) | Acupuncture vs. Sham Between-Group Differences (95% CI) | P Value |
| --- | --- | --- | --- | --- |
| MENQOL (27): Physical score | | | | |
| Baseline | 3.57 (3.22 to 3.93) | 3.57 (3.22 to 3.93) | – | – |
| 4 weeks | 2.88 (2.51 to 3.25) | 2.92 (2.55 to 3.30) | −0.04 (−0.28 to 0.19) | 0.71 |
| EOT | 2.84 (2.46 to 3.22) | 2.80 (2.43 to 3.17) | 0.04 (−0.21 to 0.29) | 0.76 |
| 3 months | 2.97 (2.58 to 3.36) | 2.95 (2.57 to 3.33) | 0.02 (−0.26 to 0.30) | 0.88 |
| 6 months | 3.14 (2.74 to 3.55) | 2.96 (2.56 to 3.35) | 0.19 (−0.12 to 0.49) | 0.23 |
| MENQOL (27): Sexual score | | | | |
| Baseline | 3.70 (3.19 to 4.21) | 3.70 (3.19 to 4.21) | – | – |
| 4 weeks | 3.15 (2.61 to 3.68) | 2.99 (2.46 to 3.52) | 0.16 (−0.19 to 0.51) | 0.38 |
| EOT | 3.09 (2.55 to 3.63) | 3.08 (2.55 to 3.61) | 0.01 (−0.35 to 0.38) | 0.94 |
| 3 months | 3.20 (2.65 to 3.75) | 3.07 (2.53 to 3.61) | 0.13 (−0.26 to 0.52) | 0.51 |
| 6 months | 3.27 (2.71 to 3.83) | 3.08 (2.54 to 3.63) | 0.19 (−0.23 to 0.61) | 0.38 |
| MENQOL (27): Vasomotor score | | | | |
| Baseline | 5.65 (5.28 to 6.02) | 5.65 (5.28 to 6.02) | – | – |
| 4 weeks | 4.70 (4.27 to 5.13) | 4.60 (4.18 to 5.03) | 0.09 (−0.28 to 0.46) | 0.62 |
| EOT | 4.33 (3.89 to 4.77) | 4.42 (3.96 to 4.87) | −0.09 (−0.50 to 0.33) | 0.69 |
| 3 months | 4.62 (4.15 to 5.08) | 4.48 (4.02 to 4.93) | 0.14 (−0.31 to 0.59) | 0.55 |
| 6 months | 4.53 (4.07 to 4.99) | 4.44 (3.98 to 4.91) | 0.09 (−0.38 to 0.55) | 0.72 |
| MENQOL (27): Psychosocial score | | | | |
| Baseline | 3.15 (2.77 to 3.53) | 3.15 (2.77 to 3.53) | – | – |
| 4 weeks | 2.68 (2.28 to 3.07) | 2.67 (2.28 to 3.07) | 0.001 (−0.27 to 0.28) | 0.99 |
| EOT | 2.65 (2.25 to 3.06) | 2.66 (2.26 to 3.05) | −0.004 (−0.31 to 0.31) | 0.98 |
| 3 months | 2.69 (2.27 to 3.12) | 2.67 (2.27 to 3.08) | 0.019 (−0.31 to 0.35) | 0.91 |
| 6 months | 2.77 (2.35 to 3.18) | 2.71 (2.31 to 3.12) | 0.052 (−0.28 to 0.38) | 0.76 |
| HADS (62): Anxiety score | | | | |
| Baseline | 7.48 (6.47 to 8.50) | 7.48 (6.47 to 8.50) | – | – |
| 4 weeks | 6.62 (5.58 to 7.65) | 6.60 (5.56 to 7.64) | 0.01 (−0.62 to 0.65) | 0.97 |
| EOT | 6.34 (5.27 to 7.40) | 6.56 (5.51 to 7.62) | −0.23 (−0.90 to 0.45) | 0.52 |
| 3 months | 6.35 (5.26 to 7.45) | 6.46 (5.42 to 7.49) | −0.10 (−0.82 to 0.61) | 0.78 |
| 6 months | 6.63 (5.51 to 7.75) | 6.48 (5.41 to 7.55) | 0.15 (−0.65 to 0.94) | 0.72 |
| HADS (62): Depression score | | | | |
| Baseline | 4.84 (3.96 to 5.72) | 4.84 (3.96 to 5.72) | – | – |
| 4 weeks | 4.39 (3.47 to 5.31) | 4.46 (3.54 to 5.38) | −0.07 (−0.61 to 0.47) | 0.80 |
| EOT | 4.39 (3.47 to 5.32) | 4.62 (3.69 to 5.55) | −0.23 (−0.79 to 0.34) | 0.44 |
| 3 months | 4.46 (3.51 to 5.40) | 4.55 (3.61 to 5.50) | −0.09 (−0.75 to 0.56) | 0.78 |
| 6 months | 4.65 (3.69 to 5.61) | 4.51 (3.59 to 5.44) | 0.14 (−0.49 to 0.77) | 0.67 |

EOT = end of treatment; HADS = Hospital Anxiety and Depression Scale; MENQOL = Menopause-Specific Quality of Life Questionnaire. * All estimates are from linear mixed-effects models adjusted for baseline value of the outcome and acupuncturist. Estimated outcome mean at baseline constrained to be the same in both groups. 3 women in the real group and 3 in the sham group did not contribute any data for the secondary outcomes due to withdrawal shortly after randomization; they are excluded from the analysis. All other women were included in the mixed modeling even if there were missing data. † Time points 3 and 6 months posttreatment.

Appendix Table 7. Results of the Bang Blinding Index, by Group

Treatment Guess After First Acupuncture Session, n (%)

| Group Allocation | Guessed Acupuncture | Guessed Sham | Unsure | Bang Blinding Index (95% CI) |
| --- | --- | --- | --- | --- |
| Acupuncture (n = 141) | 48 (34)* | 3 (2) | 90 (64) | 0.32 (0.24 to 0.39) |
| Sham acupuncture (n = 139) | 43 (31) | 10 (7)* | 86 (62) | −0.24 (−0.32 to −0.16) |

* Correct guess.
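The point estimates in Appendix Table 7 can be recovered from the guess counts. The sketch below uses a common form of the Bang index in which "unsure" responses count toward the denominator but are neither correct nor incorrect; that the authors used exactly this variant is an assumption, and the confidence intervals are not reproduced here. The function name `bang_index` is hypothetical.

```python
def bang_index(correct, incorrect, unsure):
    """Bang blinding index for one arm: (correct - incorrect) / total guesses.

    0 suggests guessing at chance, positive values suggest correct guessing
    beyond chance, and negative values suggest guessing the opposite arm.
    """
    total = correct + incorrect + unsure
    if total == 0:
        raise ValueError("no responses")
    return (correct - incorrect) / total


if __name__ == "__main__":
    # Counts from Appendix Table 7
    print(round(bang_index(correct=48, incorrect=3, unsure=90), 2))   # acupuncture arm
    print(round(bang_index(correct=10, incorrect=43, unsure=86), 2))  # sham arm
```

With the tabulated counts this gives 0.32 for the acupuncture group and −0.24 for the sham group, in line with the reported indices.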


Appendix Figure 4. Sensitivity analysis for departures from the assumption that data were missing at random for HF score at end of treatment.

[Figure. y-axis: Estimated Intervention Effect (Acupuncture − Sham), from −5 to 5. x-axis: Mean Difference (Missing − Observed) HF Score in Stated Group, from −15 to 15. Series: Acupuncture group only; Both groups; Sham group only.]

The estimated intervention effects, adjusted for baseline measurement of the HF score, are plotted with their respective 95% CIs on the y-axis for 3 scenarios: departure in both groups, in the acupuncture group only, and in the sham group only. Selected parameter values of the difference between missing and observed mean HF score (δ) at end of treatment are plotted on the x-axis. A horizontal reference line is plotted at 0 on the y-axis; positive values of the estimated intervention effect indicate that the mean HF score in the sham group is lower (better) than in the acupuncture group, and negative values indicate that the acupuncture group has a lower (better) mean HF score than the sham group. HF = hot flash.
