Journal of Cerebral Blood Flow and Metabolism
11:A136-A139 © 1991 The International Society of Cerebral Blood Flow and Metabolism
Statistical Analysis of Functional Neuroimaging Data: Exploratory Versus Inferential Methods
G. Pawlik
Departments of Neurology and Psychiatry and PET Laboratory, Max-Planck-Institut für neurologische Forschung, University Hospital, Cologne, Federal Republic of Germany
Summary: In the following, various image processing and analytical techniques, whose efficiency has been demonstrated empirically by comparison with expert readings of hundreds of positron emission tomography (PET) studies, will be outlined briefly.
Key Words: Positron emission tomography - Data transformation - Significance probability mapping - Cluster analysis - Repeated-measures analysis of variance.
Unlike former technical methods for the assessment of brain function, modern neuroimaging technologies produce activity-coded, tomographic images representing pixel-by-pixel estimates of distinct physiological variables simultaneously measured in many slices. Some procedures provide only univariate results, e.g., the local CMRglc or local CBF. Others yield multivariate estimates for each pixel. Therefore, depending on the tracer, the procedure, and the kinetic model applied, each study concurrently estimates one or several target variables in thousands of small volumes of brain tissue that are distributed in space according to the individual anatomy.

The ambiguity of interpretation of functional images, even under physiological conditions, may be attributed to a small number of factors: (a) large among-subject variability of brain size and shape, (b) large variability of state-dependent brain activity, (c) narrow margins of functional recruitment relative to baseline activity, (d) overlap of functional and anatomical (partial volume) effects, (e) heterogeneity of complex regional interaction patterns, and (f) typically small sample sizes. Only the degree of psychophysical stimulation and of apparent asymmetry due to lateral head tilt can be controlled to some extent by optimum procedural standardization. But this is all that is needed for single-case analysis by visual image reading because an expert using a set of morphological reference images can mentally transform the images to separate artifactual and individual anatomical effects from meaningful functional information and to test multiple physiological hypotheses by comparison with standards formed in his/her memory. At least in the multivariate case or for the analysis of univariate images from groups of subjects, however, numerical methods must be resorted to, taking into account all the problem factors mentioned above. Unfortunately, their greater efficiency and objectivity are typically achieved at the expense of additional assumptions and of data reduction resulting in the loss of potentially valuable information.

The ideal analytical tool would not only provide descriptive statistics that make sense in functional anatomical terms. Based on standard measures of statistical significance, it would also permit one to arrive at firm conclusions about the structure of regional effects. This may be attempted using either exploratory or inferential methods. The former are mostly characterized by ease of computation and a
relative lack of requirements as to the construction of customized hypotheses. But contrary to a widely held belief, they often make (and clearly violate) numerous untested assumptions about the nature of the data, and their results are nondescriptive and nontransitive and cannot be substantiated by a reliable error estimate. Inferential standard methods, by contrast, may be more difficult to apply because of the need for setting up specific and at times rather complicated hypotheses, but they always attach a meaningful probabilistic statement to their descriptive results.

Address correspondence and reprint requests to Dr. G. Pawlik at Departments of Neurology and Psychiatry and PET Laboratory, Max-Planck-Institut für neurologische Forschung, University Hospital, Joseph Stelzmann Str. 9, D-5000 Cologne 41, Germany.
Abbreviations used: ANOVA, analysis of variance; MANOVA, multivariate ANOVA; PET, positron emission tomography; RM-ANOVA, repeated-measures ANOVA; ROI, region of interest.

IMAGE DATA TRANSFORMATION

With few exceptions (e.g., Levy, 1991), all statistical methods start from matrices of regional image data obtained according to one of several proposed region-of-interest (ROI) protocols (Bohm et al., 1985; Fox et al., 1985; Herholz et al., 1985; Pawlik et al., 1986; Evans et al., 1988) aiming at improved comparability among subjects. As the number of repetitions (ROIs) typically exceeds the number of replicates (subjects), nonparametric statistical tests can be applied in only a few selected instances. Available parametric procedures, however, are not very robust with regard to violations of the assumption of normal data distribution and homogeneity of variances. Therefore, the need for transformation should always be tested first, using, e.g., the Box-Cox algorithm with an associated test that also provides a coefficient for power transformation (Pawlik, 1988).

If the analysis is to be based on multivariate similarity among regions, this implies a Q-type analysis where the individual regions represent the data points suspended in the space that is spanned by the dependent measures obtained within each of those regions. These dependent variables may not only differ in scale; typically, they also are more or less intercorrelated. Therefore, some of the apparent similarity between certain regions, and the dissimilarity between others, may be caused by nonorthogonality of the coordinate system and by different lengths of its axes. This bias can be removed by means of a Mahalanobis transformation of the original variates that eliminates the overall correlation between the variables and standardizes the variance of each variable, thus leaving only the intrinsic similarity among the ROIs (Mardia et al., 1979).

EXPLORATORY METHODS

Results from univariate single-case studies can be explored by means of significance probability mapping. However, this approach necessitates a properly matched reference sample: Single-observation t tests are performed region by region, and the obtained P values are displayed as a pseudocolor-coded region map (Pawlik, 1988). When this procedure is applied to raw data, a low P value simply suggests some regional abnormality, without providing any clue as to its physiological interpretation. It may reflect global or patterned effects. However, according to the principles of full factorial linear models decomposing the total variance into its additive components, to elucidate the entire interaction structure, the screening power of significance probability mapping can be improved substantially by performing separate tests on complementary topographic aspects of the functional variable under study: (1) a whole-brain value weighting the contributing ROIs by their individual size, (2) the bihemispheric mean region pattern [e.g., mean value of homotopic ROIs minus (1)], and (3) the pattern of regional asymmetries (e.g., right-minus-left difference of homotopic ROIs in either hemisphere) corresponding with the group × region × side interaction in a factorial design (see below).

If a single-case study has provided multivariate data for each region, and if it is known from previous studies that a certain functional abnormality expresses itself in some change of the variates that is similar for any location within the brain, maximum likelihood cluster analysis may be used. This method iteratively partitions the ROIs into a "diseased" and a "healthy" stratum. It is quite efficient in detecting the affected regions, even without any external references (Pawlik et al., 1986). Compared with a panel of experts, this automatic classification scheme achieved an overall accuracy Az (Swets, 1979) of 0.87.

No matter how many variables are measured within each ROI, the data from all regions are clearly dependent for procedural and neurophysiological reasons. Furthermore, regional intercorrelations are quite heterogeneous (Horwitz et al., 1986), thus precluding multiple comparisons among ROIs without adjustment for the degree of their statistical interdependence. For single-case studies, this implies that only exploratory analyses can be performed because there is no reliable error term over which significance can be assessed. There also are exploratory methods for homogeneous test samples, but they are of limited use, since inferential statistical tests are available for this type of design that take into account the essentially multivariate, dependent nature of multiregional image data.

INFERENTIAL METHODS

The results of inferential statistics can only be as good as the hypotheses to be tested. Therefore, to avoid looking for a needle in a haystack, it is generally advisable to restrict the analysis to those ROIs pertaining to the scientific problem under investigation.
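The "needle in a haystack" concern can be made concrete with a little arithmetic. The following is only a rough sketch: it assumes that every ROI is tested independently at the same comparisonwise level, which regional image data do not satisfy because the regions are intercorrelated, so the numbers are an illustration of the trend rather than exact error rates. The function name and the ROI counts are illustrative choices, not taken from the studies cited here.

```python
def experimentwise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests tests,
    ASSUMING the tests are independent and each uses level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical numbers of ROIs entering an unrestricted screening.
    for n_rois in (1, 5, 20, 60):
        rate = experimentwise_error(0.05, n_rois)
        print(f"{n_rois:3d} ROIs at 0.05 each: experimentwise rate {rate:.3f}")
```

Under this simplifying assumption, the rate of spurious findings rises steeply with the number of regions screened, which is one way to read the advice to restrict the analysis to the ROIs that matter for the question at hand.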
When only a single variable is quantified in each ROI, or when several measures (e.g., CMRglc and CBF) can be transformed into a single variate (e.g., cerebral arteriovenous difference for glucose), there is a choice of methodological options. If the data have a multivariate normal distribution, i.e., if all regions across the sample are linearly related in pairs, and if the number of subjects is much larger than the number of ROIs to be analyzed (which may be readily achieved with autoradiography but difficult with PET), in general, overall significance testing by means of multivariate (MANOVA) analysis of variance (ANOVA) is most efficient. More often, though, univariate repeated-measures analysis of variance (RM-ANOVA), treating the regional data as repetitions in space without assuming a specific correlation structure, is more appropriate because it requires only univariate normality and homogeneity of the variances across replicates, as mostly afforded by the Box-Cox transformation. In principle, it works like an ordinary factorial ANOVA. However, while the latter assumes statistical independence among the levels of a factor, i.e., their covariance matrix must exhibit sphericity, RM-ANOVA either decreases the degrees of freedom of the critical F values of the repeated-measures factor according to a correction term for the degree of nonsphericity estimated from the sample covariance matrix (Huynh and Feldt, 1976), or its covariance structure is directly adjusted for by maximum likelihood estimation of its model parameters (Schluchter, 1988). These adjustments for heterogeneous intercorrelation patterns hardly take a toll on the high efficiency of RM-ANOVA, and therefore they should not be replaced by conservative Bonferroni-type methods for the protection of the maximum experimentwise error rate $\alpha$ that are reserved to certain multiple, nonorthogonal comparisons. If the latter procedures must be resorted to, e.g., when various groups of subjects are to be compared in k pairs, either Sidak's method (1967) using a comparisonwise error rate of $1 - (1 - \alpha)^{1/k}$ or one of the more recently developed simultaneous test procedures (Holland and DiPonzio Copenhaver, 1988) offers distinct power advantages over the well-known Bonferroni approach.

When RM-ANOVA indicates a significant interaction or main effect involving the ROI factor, appropriate post hoc contrasts may be used for explanatory purposes (Winer, 1971). Alternatively, instead of computing an overall F value, orthogonal region contrasts may be tested right away; but again, their number must not exceed the degrees of freedom of the repeated-measures factor decreased according to the degree of nonsphericity, or their comparisonwise error rate must be set to a lower critical level. In general, a set of contrasts is called orthogonal if and only if the sum of the sample sizes times the products of their coefficients equals zero. This approach works best when the ROI matrix can be arranged in the order of a specific rank hypothesis. Meaningful results may then be expected, e.g., from a comparison of each region with the mean of all higher ranking regions (Helmert contrasts) or with the region next in rank (profile contrasts). The same procedure works most efficiently in the conduction of multiple comparisons between subsets of subjects, where numerous (up to the degrees of freedom of the grouping factor) orthogonal a priori contrasts can be set up, without requiring any adjustment of P values.

RM-MANOVA is the method of choice when more than one dependent variable is measured in each ROI and when the requirements of multivariate normality and sufficient sample size are met.

As with significance probability mapping, the inferential methods described may be applied most efficiently by testing separately for global effects and for specific regional contributions to brain function, thus eliminating the need for individual ROI weighting according to voxel size, in the assessment of overall effects across regions.
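Two of the computations mentioned in this section are easy to sketch. The first compares the Sidak comparisonwise error rate with the Bonferroni rate for k comparisons. The second builds Helmert and profile contrasts for a ranked set of regions and checks them for pairwise orthogonality. This is a minimal illustration under stated assumptions: sample sizes are taken to be equal, so the orthogonality condition reduces to a zero sum of coefficient products, and "higher ranking" regions are taken to be those later in the rank order. All function names are hypothetical.

```python
from itertools import combinations


def sidak_rate(alpha: float, k: int) -> float:
    """Comparisonwise rate that keeps the experimentwise rate at alpha
    over k comparisons (Sidak)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


def bonferroni_rate(alpha: float, k: int) -> float:
    """Comparisonwise rate under the Bonferroni approach."""
    return alpha / k


def helmert_contrasts(n_regions: int) -> list:
    """Row i: region i versus the mean of all regions later in rank
    (assumed here to be the 'higher ranking' ones)."""
    rows = []
    for i in range(n_regions - 1):
        rest = n_regions - i - 1
        rows.append([0.0] * i + [1.0] + [-1.0 / rest] * rest)
    return rows


def profile_contrasts(n_regions: int) -> list:
    """Row i: region i versus the region next in rank."""
    rows = []
    for i in range(n_regions - 1):
        row = [0.0] * n_regions
        row[i], row[i + 1] = 1.0, -1.0
        rows.append(row)
    return rows


def are_orthogonal(contrasts: list, tol: float = 1e-12) -> bool:
    """Equal-sample-size check: every pair of contrasts must have a
    zero sum of coefficient products."""
    return all(
        abs(sum(a * b for a, b in zip(c1, c2))) < tol
        for c1, c2 in combinations(contrasts, 2)
    )


if __name__ == "__main__":
    k = 10  # hypothetical number of pairwise group comparisons
    print("Sidak:", sidak_rate(0.05, k), "Bonferroni:", bonferroni_rate(0.05, k))
    print("Helmert orthogonal:", are_orthogonal(helmert_contrasts(4)))
    print("Profile orthogonal:", are_orthogonal(profile_contrasts(4)))
```

The Sidak rate is always slightly larger than the Bonferroni rate for k > 1, which is where its power advantage comes from. Under the equal-sample-size check used here, the Helmert set is mutually orthogonal, whereas neighboring profile contrasts share a region and are not.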
CONCLUSIONS

Powerful statistical methods are available for the analysis of functional neuroimaging data from multiple, anatomically standardized ROIs. While single cases can be assessed only by exploratory techniques, observations on groups of subjects commonly are analyzed most efficiently using advanced RM-ANOVAs or adjusted orthogonal contrasts that require neither a specific covariance structure nor unrealistic sample sizes. Exploratory methods should be applied to sample data only for "data snooping" or, on those rare occasions, when appropriate inferential test procedures are missing.
REFERENCES

Bohm C, Greitz T, Kingsley D, Berggren BM, Olsson L (1985) A computerized individually variable stereotaxic brain atlas. In: The Metabolism of the Human Brain Studied with Positron Emission Tomography (Greitz T, Ingvar DH, Widen L, eds), Raven Press, New York, pp 85-90
Evans AC, Beil C, Marrett S, Thompson CJ, Hakim A (1988) Anatomical-functional correlation using an adjustable MRI-based region-of-interest atlas with positron emission tomography. J Cereb Blood Flow Metab 8:513-530
Fox PT, Perlmutter JS, Raichle ME (1985) A stereotactic method of anatomical localization for positron emission tomography. J Comput Assist Tomogr 9:141-153
Herholz K, Pawlik G, Wienhard K, Heiss WD (1985) Computer-assisted mapping in quantitative analysis of cerebral positron emission tomograms. J Comput Assist Tomogr 9:154-161
Holland BS, DiPonzio Copenhaver M (1988) Improved Bonferroni-type multiple testing procedures. Psychol Bull 104:145-149
Horwitz B, Duara R, Rapoport SI (1986) Age differences in intercorrelations between regional cerebral metabolic rates for glucose. Ann Neurol 19:60-67
Huynh H, Feldt LS (1976) Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. J Ed Stat 1:69-82
Levy A, Laska E, Brodie JD, Volkow ND, Wolf AP (1991) The spectral signature method for the analysis of PET brain images. J Cereb Blood Flow Metab 11:A103-A113
Mardia KV, Kent JT, Bibby JM (1979) Multivariate Analysis. London, Academic Press
Pawlik G (1988) Positron emission tomography and multiregional statistical analysis of brain function: from exploratory methods for single cases to inferential tests for multiple group designs. In: Progress in Computer-Assisted Function Analysis (Willems JL, van Bemmel JH, Michel J, eds), Amsterdam, North-Holland, pp 401-408
Pawlik G, Herholz K, Wienhard K, Beil C, Heiss WD (1986) Some maximum likelihood methods useful for the regional analysis of dynamic PET data on brain glucose metabolism. In: Information Processing in Medical Imaging (Bacharach SL, ed), Dordrecht, Martinus Nijhoff, pp 298-309
Schluchter MD (1988) Analysis of incomplete multivariate data using linear models with structured covariance matrices. Stat Med 7:317-324
Sidak Z (1967) Rectangular confidence regions for the means of multivariate normal distributions. J Am Stat Assoc 62:626-633
Swets JA (1979) ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol 13:109-121
Winer BJ (1971) Statistical Principles in Experimental Design. New York, McGraw-Hill

J Cereb Blood Flow Metab, Vol. 11, Suppl. 1, 1991