Editorials

STATISTICAL INFERENCE: BAYESIAN AND NON-BAYESIAN

This journal has recently published several papers which bear on the argument concerning Bayesian and non-Bayesian modes of statistical inference. Because geneticists using statistical techniques may be interested in this controversy, the points at issue are outlined here with no attempt at mathematical detail or philosophical depth. In doing so, we recognize that there is neither a unique "Bayesian" nor a unique "non-Bayesian" position; partisans of each view often differ considerably among themselves in detail and even in philosophy. The main point at issue may seem of little consequence to practical workers and, as we note later, in many statistical applications little difference of eventual action will arise from Bayesian and non-Bayesian approaches. Nevertheless, there are fundamental and unresolved differences of opinion between theoretical statisticians on questions of statistical inference which on occasion can have important practical consequences. The Bayesian question is one of these.

Probability and statistics form respectively the deductive and inductive sides of the same science. Starting from an assumed state of the real world (e.g., "this coin is fair" or "the loci in the double backcross mating AB/ab are unlinked"), probability theory arrives, by deductive processes analogous to those of Euclidean geometry, at probabilistic statements about possible outcomes of experiments (e.g., the probability of 10 heads in 10 tosses is 1/1,024, or the probability of 10 nonrecombinants in 10 children is 1/1,024). Few problems of principle arise with such deductive arguments; the main difficulty centers on what a probability really is. Opinions vary widely, from one extreme that considers probability a measure of the strength of one's subjective belief to another that views it as an objective quantity related to a relative frequency in a long series of trials. This difference of opinion is not irrelevant to the Bayesian controversy.

Statistical induction gives rise to problems of principle more formidable than those of probability theory. Here the probability θ of heads on the coin, or the recombination fraction R between the loci, is unknown. A certain experimental result is observed (e.g., 10 heads in 10 tosses or 10 nonrecombinants in 10 children), and an inductive statement is required concerning the parameter θ or the parameter R. It is evident that the deductive conclusions of the previous paragraph must be used somehow in making such an inductive statement (thus with 10 heads in 10 tosses, no approach is likely to lead to the conclusion that θ is close to 0). The difficulties concern the nature of these inductive conclusions and how deductive knowledge is used in making them.
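The deductive computations quoted above are elementary; the following short sketch in Python reproduces both 1/1,024 figures from the assumed states of the world.

    # Deductive side: from an assumed state of the world to the
    # probability of a particular experimental outcome.

    # "This coin is fair": each toss gives heads with probability 1/2,
    # so 10 heads in 10 independent tosses has probability (1/2)**10.
    p_ten_heads = 0.5 ** 10

    # "The loci are unlinked": each child of the double backcross
    # AB/ab is then a nonrecombinant with probability 1/2, so 10
    # nonrecombinants in 10 children has the same probability.
    p_ten_nonrecombinants = 0.5 ** 10

    print(p_ten_heads, p_ten_nonrecombinants, 1 / 1024)  # all equal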


The Bayesian Approach

Most Bayesians take the view that a probability is a measure of subjective degree of belief and that such probabilities apply in a meaningful way to parameters as well as to random variables. For the Bayesian, the statement "the probability that the recombination fraction between these loci lies between .2 and .3 is .95" has meaning. They take the view that, just as in the deductive theory a random variable has a probability distribution depending on a parameter, so also in the inductive theory the parameter has a probability distribution depending at least in part on observed values of random variables. To find this probability distribution, two ingredients are required. The first is a "prior" distribution for the parameter, assumed to apply before the experiment is performed and arrived at on the basis of knowledge, experience, or perhaps subjective judgment. The second ingredient is the experimental data, the probability of which is used to modify the prior distribution of the parameter to produce the "posterior" distribution according to the rule:

    posterior distribution ∝ prior distribution × probability of data.

This equation derives from Bayes' theorem, which states that if H1, ..., Hk are possible hypotheses and D is experimental data, then

    Pr(Hi given D) = const × Pr(Hi) × Pr(D given Hi),

where Pr(Hi) is the prior probability of Hi, Pr(D given Hi) is the probability of the data D given that Hi is true, and Pr(Hi given D) is the posterior probability of Hi given the data. The posterior distribution so obtained is used to make inductive statements about the parameter; these might be statements of estimation or perhaps statements (such as that in the previous paragraph) giving an interval within which the parameter lies with a certain probability. We do not pursue here the details of how these inferences are made.
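The constant of proportionality is simply whatever makes the posterior probabilities sum to one. A minimal sketch of the rule for the recombination fraction R, computed over a discrete grid of candidate values, follows; the grid, the flat prior, and the interval queried at the end are assumptions chosen for illustration, not details fixed by the argument above.

    # Posterior ∝ prior × probability of data, on a discrete grid of
    # candidate values for the recombination fraction R.

    grid = [i / 100 for i in range(51)]    # R = 0.00, 0.01, ..., 0.50
    prior = [1 / len(grid)] * len(grid)    # flat prior (an assumption)

    n, k = 10, 0                           # data: 0 recombinants in 10 children
    likelihood = [r**k * (1 - r)**(n - k) for r in grid]

    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)              # the normalizing constant
    posterior = [u / total for u in unnormalized]

    # An inductive statement of the Bayesian kind: the posterior
    # probability that R lies below .1.
    print(sum(p for r, p in zip(grid, posterior) if r < 0.1))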

The Non-Bayesian Position

Non-Bayesians find several difficulties with this approach. The first is one of principle: they are usually unwilling to ascribe to a parameter the status of a random variable with a probability distribution. To them a parameter is a fixed (although unknown) quantity, determined in the case of a coin by its physical properties and in the case of linkage by the locations of genes on chromosomes, and not subject to random variation. The second is a practical difficulty: even admitting the view that a parameter has the status of a random variable, they find difficulty in choosing a prior distribution for it, especially if this involves a degree of subjective judgment or if a priori one is completely ignorant about likely values for the parameter. Finally, they are concerned that in small samples the prior distribution tends to dominate the sampling evidence, so that two rational investigators with different prior distributions can arrive at different conclusions. Bayesians, on the other hand, see an advantage in the flexible choice of a prior distribution, since they claim that the same formal mathematics can apply to the data of two different experiments about which it is reasonable that two different conclusions are drawn from parallel experimental results. Thus if a new drug works five times in five cases, it is reasonable to expect a high chance of success on a sixth case, whereas if a seemingly symmetric coin gives five heads on five tosses, it is nevertheless reasonable to expect a chance of heads close to one half on a sixth toss.
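The drug-versus-coin contrast can be made concrete with Beta priors, an assumption introduced here purely for illustration. With a Beta(a, b) prior and k successes in n trials, the posterior is Beta(a + k, b + n - k), and the probability of success on the next trial is the posterior mean (a + k)/(a + b + n).

    # Same data (5 successes in 5 trials), different priors.

    def next_trial_prob(a, b, k, n):
        """Posterior-mean probability of success on trial n + 1,
        under a Beta(a, b) prior with k successes in n trials."""
        return (a + k) / (a + b + n)

    # New drug: little prior knowledge, so a flat Beta(1, 1) prior.
    print(next_trial_prob(1, 1, k=5, n=5))      # ≈ 0.857: success likely

    # Seemingly symmetric coin: strong prior belief near one half,
    # represented here by a sharply peaked Beta(100, 100) prior.
    print(next_trial_prob(100, 100, k=5, n=5))  # ≈ 0.512: near one half

The same five-for-five data thus yield quite different expectations, solely because the priors differ.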

Non-Bayesian inference procedures are known to most geneticists. Parameters are estimated, using sample observations only, according to various principles (e.g., maximum likelihood), and tests of hypotheses (perhaps involving significance levels such as 5%) concerning parameter values are carried out. So-called "confidence interval" statements are possible, of the form "we are 95% confident that R lies between .2 and .3." (This is the conventionally used version of the more precise statement "the interval (.2, .3) was formed using a procedure which has probability 95% of including the true value of R.") The difference between this statement and a statement of the Bayesian form may appear over-subtle; nevertheless, the two forms of statement reflect quite different statistical philosophies. Note that prior information is not used directly in most non-Bayesian inference procedures; often, however, it is used informally in subsequent evaluation of the experimental outcome.
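The parenthetical statement above is the heart of the non-Bayesian reading of "confidence," and it can be checked by simulation: in repeated experiments, the interval-producing procedure should capture the true value about 95% of the time. The true R, the sample size, and the simple normal-approximation (Wald) interval below are all assumptions made for illustration.

    import random

    random.seed(1)
    R_TRUE, N, TRIALS = 0.25, 200, 10_000
    Z = 1.96  # normal quantile for a 95% interval

    covered = 0
    for _ in range(TRIALS):
        recombinants = sum(random.random() < R_TRUE for _ in range(N))
        r_hat = recombinants / N                     # maximum likelihood estimate
        half = Z * (r_hat * (1 - r_hat) / N) ** 0.5  # Wald half-width
        if r_hat - half <= R_TRUE <= r_hat + half:
            covered += 1

    # Typically close to .95: a property of the procedure over repeated
    # experiments, not a probability statement about R itself.
    print(covered / TRIALS)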

Perhaps paradoxically, a user of Bayes' theorem is not necessarily a Bayesian. There are cases where it is agreed that prior probabilities of hypotheses are meaningful and known, particularly in genetic counseling when population gene frequencies are available. In such cases Bayes' theorem is no more than a standard result of conditional probability, and the philosophical Bayesian question does not arise.
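A sketch of such a counseling calculation, with a textbook-style scenario whose numbers are assumed here for illustration: a woman whose mother is a known carrier of an X-linked recessive condition has prior probability 1/2 of being a carrier herself, and she has had three unaffected sons.

    # Bayes' theorem as ordinary conditional probability.
    hypotheses = {
        "carrier":     {"prior": 0.5, "p_data": 0.5 ** 3},  # each son unaffected w.p. 1/2
        "not carrier": {"prior": 0.5, "p_data": 1.0},       # sons certainly unaffected
    }

    # Pr(Hi given D) = const × Pr(Hi) × Pr(D given Hi)
    unnormalized = {h: v["prior"] * v["p_data"] for h, v in hypotheses.items()}
    const = 1.0 / sum(unnormalized.values())
    posterior = {h: const * u for h, u in unnormalized.items()}

    print(posterior["carrier"])  # 1/9 ≈ 0.111

Here the prior of 1/2 is a matter of Mendelian segregation rather than of subjective belief, and the calculation is uncontroversial.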

Even in cases where philosophical disagreement does exist, Bayesian and non-Bayesian approaches often lead operationally to similar actions, the main exception being when a small sample is taken which suggests parameter values sharply at variance with prior views. The geneticist interested in the possibility of linkage between two loci may not be particularly concerned whether his prior views are formally incorporated into a Bayesian procedure or used less formally in subsequent evaluation of his experimental results. He may not be particularly concerned whether a statement of the Bayesian form "the probability that these loci are unlinked is .95" or of the non-Bayesian form "using a probability level .95, the hypothesis that these loci are unlinked is accepted" is used. But this should not obscure the fundamental differences in principle between the two approaches, or the possibility that in practice different conclusions could be reached in using them. In one approach a parameter is regarded as having a probability distribution, while in the other it is not; in one, a prior distribution of a parameter is essential, while in the other such a distribution is often viewed as meaningless. Much time will pass before theoretical statisticians uniformly adopt one or the other philosophy.

WARREN J. EWENS
University of Pennsylvania
Philadelphia
