XENOBIOTICA, 1992, VOL. 22, NO. 7, 881-893

Statistical aspects of bioequivalence: a review

A. W. PIDGEN
Department of Clinical Pharmacology, Hoechst UK Ltd, Milton Keynes, UK

Xenobiotica Downloaded from informahealthcare.com by Chulalongkorn University on 01/07/15 For personal use only.

Received 19 June 1991; accepted 24 April 1992

1. Over the past 20 years a number of statistical methods have been proposed for use in bioequivalence testing. This review examines these methods and reflects current thinking of regulatory authorities.

2. The standard bioequivalence study is conducted as a controlled, single-dose crossover design in a small number of healthy male adults. Blood and/or urine samples are taken at predetermined times for drug/metabolite assay, from which pharmacokinetic parameters are derived and compared statistically. Sample size calculations should be determined by the error variance associated with the primary characteristic to be studied, the significance level, the power of the test and the deviation from the reference product compatible with safety and efficacy.

3. In general, bioequivalence is assessed using three parameters, namely Cmax, tmax and AUC. Urinary excretion data may also be used if the amount excreted unchanged is significant. These parameters are best obtained using a simple model-independent approach.

4. The parameters Cmax and AUC should be logarithmically transformed prior to analysis. For tmax, parametric statistical procedures are not appropriate.

5. Classical hypothesis testing using the power approach is not applicable to the practical problem under consideration in bioequivalence trials.

6. Classical 90% confidence limits and the two one-sided t-test approach are operationally identical and are the methods of choice for assessing bioequivalence (Cmax and AUC). When tmax is an important parameter from the clinical point of view then the use of non-parametric confidence intervals is recommended.

Introduction

Over the past 20 years a number of statistical methods have been proposed for use in bioequivalence testing. In fact this topic has been the subject of numerous scientific papers. The advent of generic prescribing, however, has placed a special focus on the whole subject area, with regulatory authorities gradually moving towards a coherent policy. This review sets out to examine these policies, especially with respect to the statistical aspects of bioequivalence testing.

Design factors

The standard bioequivalence study is conducted as a controlled crossover design in a small number of healthy male adults. Single doses of two or more drug products are administered under fasting conditions and food and fluid intake is controlled throughout the study. Blood (and/or urine) samples are taken at predetermined times for drug/metabolite assay, from which pharmacokinetic parameters are derived and compared statistically.

0049-8254/92 $3.00 © 1992 Taylor & Francis Ltd


Although the majority of bioequivalence studies involve giving a single administration of each formulation under test, there are situations in which multiple dosing is required when testing bioequivalence (e.g. controlled-release products).

As mentioned above, the basic experimental design usually employed in bioequivalence trials is a crossover design. Parallel group designs are rarely used since they do not allow a within-subject comparison and hence require a very large sample size. Balanced incomplete block designs are not recommended for routine use in bioequivalence studies, although they can have some advantages when several formulations are being tested at an early stage of a drug's development.

The washout period between treatments is an extremely important factor in crossover studies. It must be long enough to ensure that all of the drug from the first treatment has been cleared from the body before the administration of the second treatment. In bioequivalence studies there is a direct check for any carryover of drug (or active metabolite): examination of the pre-dose blood sample. If drug from the previous administration has not yet been cleared from the body then it is readily detected. Statistical procedures are also available to test for carryover (see section on Hypothesis testing).

The number of subjects required for a bioequivalence study must be estimated at the design (or protocol) stage. This is generally achieved using the theory of hypothesis testing. Four parameters should be considered, namely:

Δ = The minimum difference between the formulations with respect to an important pharmacokinetic parameter (e.g. AUC) that is compatible with safety and efficacy. The FDA recommends Δ = ±20%.

α = The probability of rejecting the null hypothesis when it is in reality true (i.e. significance level).

1 - β = The probability of detecting Δ if it really exists (power of the test).

σ² = The error variance.

From standard methodology we obtain:

$$n \ge 2\,(z_{\alpha/2} + z_{\beta})^2\,\frac{\sigma^2}{\Delta^2}$$

where z denotes the corresponding upper percentage point of the standard normal distribution. If α is set at 0.05 (for a 5% significance level) and β at 0.20 (for a power of 80%), so that z = 1.96 and 0.84 respectively, then this formula reduces to:

$$n \ge 15.68\,\frac{\sigma^2}{\Delta^2}$$
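As a numerical check on the constant in the reduced formula, the following minimal Python sketch evaluates the multiplier of σ²/Δ² from standard normal quantiles. The two-sided normal-approximation form is an assumption, inferred from the fact that 2(1.96 + 0.84)² = 15.68; the function names and the example values of σ and Δ are hypothetical, and σ and Δ must be expressed on the same scale.

```python
import math
from statistics import NormalDist


def sample_size_coefficient(alpha=0.05, power=0.80):
    """Return 2 * (z_{alpha/2} + z_beta)^2, the multiplier of sigma^2 / delta^2.

    Assumes a two-sided test and a normal approximation (an assumption,
    chosen because it reproduces the constant 15.68 quoted in the text).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)               # about 0.84 for 80% power
    return 2.0 * (z_alpha + z_beta) ** 2


def sample_size(sigma, delta, alpha=0.05, power=0.80):
    """Smallest integer n satisfying n >= coefficient * sigma^2 / delta^2."""
    coefficient = sample_size_coefficient(alpha, power)
    return math.ceil(coefficient * sigma ** 2 / delta ** 2)


if __name__ == "__main__":
    # Unrounded quantiles give a coefficient slightly above the 15.68
    # obtained with z rounded to 1.96 and 0.84.
    print(round(sample_size_coefficient(), 2))
    # Hypothetical example: error standard deviation 0.2, delta 0.2.
    print(sample_size(sigma=0.2, delta=0.2))
```

Because the required n grows with σ²/Δ², halving Δ or doubling the error standard deviation quadruples the number of subjects needed.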

Relevant pharmacokinetic parameters

The principle behind a bioequivalence study is that ‘two (or more) formulations of a drug that give rise to essentially equivalent concentrations of the parent drug (or active metabolite) in the circulating blood (or plasma) viewed as a profile over time will elicit equivalent therapeutic effects’. Therefore it is important to ensure an adequate characterization of the blood level profile. In general, bioequivalence is assessed using three parameters:

1. The maximum concentration of drug found in the blood (or plasma) (Cmax).
2. The time taken to reach the maximum drug concentration (tmax).
3. The total amount of drug absorbed (AUC).


These parameters are best obtained using a simple model-independent approach. The exclusive use of modelled characteristics is not recommended unless the pharmacokinetic model has been validated for the active substance and products. For each individual, the maximum drug concentration (Cmax) and time to maximum drug concentration (tmax) are obtained directly from the data. The area under the curve (AUC) is estimated by the trapezoidal rule, using an extrapolation procedure (with appropriate rate constant k) if necessary. However, it should be pointed out that regulatory authorities expect sampling to be continued for long enough to ensure that at least 80% of the AUC is accounted for by observed data. For drugs with a very long half-life, AUC can be calculated to the last measured time point or to a predetermined time point as described by Urso and Aarons (1983). For bioequivalence studies where multiple dosing is employed, the AUC is calculated over one dosing interval at steady state, since this is equivalent (in a linear system) to the AUC calculated to infinity after a single dose.

Other parameters such as half-life (t1/2) and mean residence time (MRT) can be calculated if appropriate. When the amount of drug excreted unchanged in the urine is significant then urinary excretion data (Ae) can also be used as a measure of the amount of drug absorbed.

In most cases these features of the blood level profile are related to the therapeutic use of the drug. However, the area under the curve (AUC) is probably the most important parameter since it is proportional to the amount of drug absorbed. The peak concentration (Cmax) may be related to either toxic or therapeutic effects, depending upon the concentration range of the therapeutic window. The time to peak drug concentration (tmax) may be of importance for drugs which need to reach their maximum concentrations (and hence maximum therapeutic effect) as quickly as possible (e.g. analgesics). However, for drugs which exert their therapeutic effect over a period of time, tmax is of little or no clinical importance.

An alternative method which has been applied to bioequivalence data is to compare the formulations (or treatments) solely with respect to the sequence of blood levels generated. The trial is then viewed as a repeated measures experiment (see section on Repeated measures).
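The model-independent calculations described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a validated procedure: the function names, the choice of a log-linear least-squares fit over the last few samples to estimate k, and the example data are all assumptions made here for illustration.

```python
import math


def cmax_tmax(times, conc):
    """Return (Cmax, tmax) read directly from the observed data."""
    i = max(range(len(conc)), key=lambda j: conc[j])  # first maximum if tied
    return conc[i], times[i]


def auc_trapezoid(times, conc):
    """AUC from the first to the last sampling time by the linear trapezoidal rule."""
    return sum(
        0.5 * (conc[i] + conc[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


def terminal_rate_constant(times, conc, n_points=3):
    """Estimate k as minus the least-squares slope of ln(C) against t
    over the last n_points samples (all of which must be positive)."""
    t = times[-n_points:]
    y = [math.log(c) for c in conc[-n_points:]]
    t_bar = sum(t) / len(t)
    y_bar = sum(y) / len(y)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    return -sxy / sxx


def auc_to_infinity(times, conc, n_points=3):
    """Return (AUC extrapolated to infinity, fraction covered by observed data)."""
    auc_t = auc_trapezoid(times, conc)
    k = terminal_rate_constant(times, conc, n_points)
    auc_inf = auc_t + conc[-1] / k  # extrapolated area = last concentration / k
    return auc_inf, auc_t / auc_inf


if __name__ == "__main__":
    times = [0, 1, 2, 4]   # hypothetical sampling times (h)
    conc = [0, 10, 8, 4]   # hypothetical concentrations
    print(cmax_tmax(times, conc))
    auc_inf, fraction = auc_to_infinity(times, conc, n_points=2)
    print(auc_trapezoid(times, conc), auc_inf, fraction)
    print("at least 80% of AUC observed:", fraction >= 0.80)
```

In this hypothetical profile the observed data cover less than 80% of the extrapolated AUC, which is the situation in which sampling would be regarded as having stopped too early.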

Statistical analysis

The problem

Let us assume that we have just completed a bioequivalence study in which we wish to compare two formulations, namely a reference formulation R (e.g. capsule) and a test formulation T (e.g. tablet). The main objective of the statistical analysis of such a study is to determine whether the results obtained have established that the two formulations are ‘equivalent’ with respect to the bioavailability parameters Cmax, tmax and AUC (possibly also t1/2 and Ae). Let μT and μR be the average values for the test and reference formulations respectively. What conditions must hold for μT and μR to be considered equivalent? Obviously if μT = μR then μT and μR will be equivalent. But we know that μT and μR may differ and yet the difference will be of no clinical importance.


I will now explore some of the statistical approaches which have been used to tackle this problem.

Hypothesis testing: power approach

Up until about 15 years ago, a decision on the bioequivalence of two formulations was based solely upon a test of the null hypothesis (H0):

$$H_0: \mu_T - \mu_R = 0$$

against the alternative hypothesis (H1):

$$H_1: \mu_T - \mu_R \ne 0$$

Acceptance of H0 was taken as proof of bioequivalence, whilst rejection of H0 was taken as proof of bioinequivalence. Unfortunately, a number of anomalies arose with this approach, including:

1. Large (clinically important) differences between the formulations which were not statistically significant.
2. Small (clinically unimportant) differences between the formulations which were statistically significant.

These problems arose mainly because of the wide variations in sample sizes used in bioequivalence trials. If a small sample size (e.g. n = 6) was used in a trial on a drug with a large intra-subject variability then any real differences between the treatments would not be detected statistically. On the other hand, the use of a large sample size (e.g. n = 24) in a trial on a drug with a small intra-subject variability would ensure that any differences between the treatments, however small, would be detected.

The FDA response to the first of these anomalies was to introduce the so-called 80/20 power rule. This rule specifies that a test of bioequivalence must have at least an 80% power of detecting a 20% difference between μT and μR, if it exists. This makes certain that the sample size used is adequate to ensure that any differences between the formulations are not masked by the variability of the compound. No such numerical criterion was adopted for the second anomaly and such issues were often decided upon clinical grounds.

Analysis of variance (ANOVA) is the standard method of choice for evaluating a bioequivalence study using the power approach. The basic assumptions made when applying ANOVA are:

1. The underlying distribution is approximately normal.
2. The within-subject variances of the test and reference products are homogeneous.
3. The model effects are additive.
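The effect of sample size and variability on the 80/20 power rule described above can be illustrated with a short Python sketch. It assumes the usual normal-approximation relation between sample size and power for a two-sided test; this relation, the function name and the numerical values of σ and Δ are assumptions introduced here for illustration, not figures taken from the review.

```python
from statistics import NormalDist


def approximate_power(n, sigma, delta, alpha=0.05):
    """Approximate power to detect a true difference delta with n subjects.

    Normal approximation obtained by solving
    n = 2 * (z_{alpha/2} + z_beta)^2 * sigma^2 / delta^2 for z_beta.
    The small contribution of the opposite rejection tail is ignored.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = (delta / sigma) * (n / 2.0) ** 0.5 - z_alpha
    return NormalDist().cdf(z_beta)


if __name__ == "__main__":
    # Hypothetical scenarios; sigma and delta are on the same (relative) scale.
    scenarios = [
        ("small trial, high variability", 6, 0.40, 0.20),
        ("moderate trial, moderate variability", 16, 0.20, 0.20),
        ("large trial, low variability", 24, 0.10, 0.10),
    ]
    for label, n, sigma, delta in scenarios:
        power = approximate_power(n, sigma, delta)
        print(f"{label}: power = {power:.2f}, meets 80% rule: {power >= 0.80}")
```

Under these assumed values the small, highly variable trial has little chance of detecting a 20% difference, whereas the large, low-variability trial is very likely to declare even a 10% difference significant, which mirrors the two anomalies listed above.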

Statistical procedures are available to test for normality of distribution (Shapiro and Wilk 1965) and homogeneity of variance (Winer 1971). However, it has been suggested (Westlake 1973b, 1980) that plasma concentration versus time data are log-normally distributed. A logarithmic transformation has the effect of

1. ensuring additivity of the model,


2. bringing the distribution of data closer to a normal distribution,
3. stabilizing the variances.

Experience has shown that, for the parameters Cmax and AUC (but not tmax), the basic assumptions required for using ANOVA generally hold following a logarithmic transformation. It is, however, often instructive to analyse the data twice, once with, and once without, a logarithmic transformation. This generally shows that, in spite of the valid theoretical arguments, in the large majority of cases transforming the data has little or no influence on the final statistical outcome. Nevertheless, the new European (CPMP) guidelines state that Cmax and AUC data should be logarithmically transformed prior to statistical analysis.

A statistical model (Grizzle 1965) has been proposed for the two-period crossover design which allows a test for carryover (residual) effects to be made. If there is no evidence for carryover, then these effects are deleted from the model. If carryover effects are found then, strictly speaking, the analysis should be based on data from the first period only. The probability of finding a significant carryover effect in a bioequivalence trial is very small, particularly if the washout period has been adequate. Nonetheless, it is worth performing this simple test as a means of reassurance.

The analysis of variance for a two-period crossover design in n subjects with carryover (residual) effects deleted is shown in table 1. Using an F-ratio, the null hypothesis of no difference between the formulations is tested at the 5% level of significance. Differences between subjects and between periods can be tested at the same time. Significant differences between subjects will almost always be present; this is quite usual and reflects biological variation between individuals. Significant differences between periods are not expected to be found. If these are present then an investigation should be undertaken to see if they can be explained by differences in study conditions, methods of batching samples for drug assay, etc.

For the parameter tmax the use of ANOVA is not appropriate. The values obtained for this parameter come from a discrete data set (i.e. the preselected sampling times) and can be poor estimates of the ‘true’ value (figure 1). The accuracy with which the observed tmax approximates to the actual tmax is wholly dependent upon the frequency of sampling around the peak value for each drug product. Secondary peaks or the presence of a plateau can also complicate the interpretation of tmax. In view of these comments it is recommended that a non-parametric method be used to test for differences in tmax between the formulations. Koch (1972) has published a hypothesis test based on ranks, using a Wilcoxon statistic, which is suitable for this purpose (see also Confidence intervals).

Table 1. Analysis of variance.

Source of variation    Degrees of freedom
Periods                1
Subjects               n - 1
Formulations           1
Error                  n - 2
Total                  2n - 1
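The partition in table 1 can be reproduced with a minimal Python sketch. It assumes a balanced design (equal numbers of subjects receiving the sequences reference-then-test and test-then-reference), in which the period, subject and formulation sums of squares can be computed from the marginal means; the function name, the data layout and the example values are hypothetical. For Cmax and AUC the values supplied would be the log-transformed data.

```python
def crossover_anova(data):
    """ANOVA for a balanced two-period, two-formulation crossover.

    data: list of (sequence, y_period1, y_period2), where sequence is
    "RT" (reference first) or "TR" (test first).
    Carryover effects are assumed absent and are not modelled.
    """
    n = len(data)
    if any(seq not in ("RT", "TR") for seq, _, _ in data):
        raise ValueError("sequence must be 'RT' or 'TR'")
    if 2 * sum(1 for seq, _, _ in data if seq == "RT") != n:
        raise ValueError("this sketch assumes equal numbers of RT and TR subjects")

    obs = [y for _, y1, y2 in data for y in (y1, y2)]
    grand = sum(obs) / (2 * n)

    ss_total = sum((y - grand) ** 2 for y in obs)
    ss_subjects = 2 * sum(((y1 + y2) / 2 - grand) ** 2 for _, y1, y2 in data)

    period_means = [sum(row[p] for row in data) / n for p in (1, 2)]
    ss_periods = n * sum((m - grand) ** 2 for m in period_means)

    # Reference is given in period 1 for sequence RT and in period 2 for TR.
    ref = [y1 if seq == "RT" else y2 for seq, y1, y2 in data]
    test = [y2 if seq == "RT" else y1 for seq, y1, y2 in data]
    ss_formulations = n * sum((sum(g) / n - grand) ** 2 for g in (ref, test))

    ss_error = ss_total - ss_subjects - ss_periods - ss_formulations
    df = {"periods": 1, "subjects": n - 1, "formulations": 1,
          "error": n - 2, "total": 2 * n - 1}
    ss = {"periods": ss_periods, "subjects": ss_subjects,
          "formulations": ss_formulations, "error": ss_error, "total": ss_total}
    f_ratio = ss_formulations / (ss_error / df["error"])  # on 1 and n - 2 df
    return {"df": df, "ss": ss, "F": f_ratio}


if __name__ == "__main__":
    # Hypothetical data for four subjects (two per sequence).
    example = [("RT", 10, 12), ("RT", 8, 9), ("TR", 11, 9), ("TR", 13, 12)]
    result = crossover_anova(example)
    print(result["df"])
    print(result["ss"])
    print(result["F"])
```

The F-ratio returned for formulations would be compared with the tabulated 5% critical value of F on 1 and n - 2 degrees of freedom; the sketch stops short of computing a p-value so that it needs only the standard library.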


Figure 1. Problems with the interpretation of tmax. [Plots of concentration (C) against time (t) contrasting the true value with the sampled values.]

Similar arguments to those above may also be used for the parameter Cmax, particularly if the formulation is very rapidly absorbed and sampling is inadequate. In such cases parameter estimates are often poor and show considerable inter- and intra-subject variability.
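The dependence of the observed tmax on the sampling schedule can be illustrated with a small simulation. The sketch below assumes a one-compartment profile with first-order absorption purely in order to generate a smooth curve with a known peak; the model, the rate constants and the sampling schedules are hypothetical and play no part in the model-independent analysis recommended in this review.

```python
import math


def concentration(t, ka=2.0, ke=0.2):
    """Hypothetical one-compartment profile with first-order absorption
    (arbitrary scaling); used only to generate an example curve."""
    return math.exp(-ke * t) - math.exp(-ka * t)


def true_tmax(ka=2.0, ke=0.2):
    """Time of the true peak of the curve above: ln(ka / ke) / (ka - ke)."""
    return math.log(ka / ke) / (ka - ke)


def observed_tmax(sampling_times, ka=2.0, ke=0.2):
    """tmax as read from the data: the sampling time with the highest value."""
    return max(sampling_times, key=lambda t: concentration(t, ka, ke))


if __name__ == "__main__":
    sparse = [0.5, 1.0, 2.0, 4.0, 8.0]             # hypothetical schedule (h)
    dense = [0.5, 1.0, 1.25, 1.5, 2.0, 4.0, 8.0]   # extra samples near the peak
    print("true tmax:", round(true_tmax(), 3))
    print("observed tmax, sparse sampling:", observed_tmax(sparse))
    print("observed tmax, dense sampling:", observed_tmax(dense))
```

The observed tmax can only take one of the preselected sampling times, so it moves closer to the true peak time only when samples are added around the peak.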

Confidence intervals

Although the power rule strengthened the hypothesis testing approach, it was generally accepted that the testing of a simple null hypothesis was not relevant to the practical problem under consideration in bioequivalence trials. Quite simply, we do not wish to know whether the test formulation produces identical absorption to the reference formulation, but by how much the formulations differ. To avoid ineffective therapy we do not want to see μR ≫ μT; alternatively, to avoid potential toxicity we do not want to see μT ≫ μR. Therefore, it is necessary to specify limits θ1 and θ2 such that these problems do not occur. Hence, μT and μR will be considered equivalent if

$$\theta_1 < \mu_T - \mu_R < \theta_2$$
