Statistical analysis of functional neuroimaging data: exploratory versus inferential methods.

Journal of Cerebral Blood Flow and Metabolism

11:A136-A139 © 1991 The International Society of Cerebral Blood Flow and Metabolism

Statistical Analysis of Functional Neuroimaging Data: Exploratory Versus Inferential Methods

G. Pawlik

Departments of Neurology and Psychiatry and PET Laboratory, Max-Planck-Institut für neurologische Forschung University Hospital, Cologne, Federal Republic of Germany

Summary: In the following, various image processing and analytical techniques, whose efficiency has been demonstrated empirically by comparison with expert readings of hundreds of positron emission tomography (PET) studies, will be outlined briefly.

Key Words: Positron emission tomography - Data transformation - Significance probability mapping - Cluster analysis - Repeated-measures analysis of variance.

Address correspondence and reprint requests to Dr. G. Pawlik at Departments of Neurology and Psychiatry and PET Laboratory, Max-Planck-Institut für neurologische Forschung University Hospital, Joseph Stelzmann Str. 9, D-5000 Cologne 41, Germany.

Abbreviations used: ANOVA, analysis of variance; MANOVA, multivariate ANOVA; PET, positron emission tomography; RM-ANOVA, repeated-measures ANOVA; ROI, region of interest.

Unlike former technical methods for the assessment of brain function, modern neuroimaging technologies produce activity-coded, tomographic images representing pixel-by-pixel estimates of distinct physiological variables simultaneously measured in many slices. Some procedures provide only univariate results, e.g., the local CMRglc or local CBF. Others yield multivariate estimates for each pixel. Therefore, depending on the tracer, the procedure, and the kinetic model applied, each study concurrently estimates one or several target variables in thousands of small volumes of brain tissue that are distributed in space according to the individual anatomy.

The ambiguity of interpretation of functional images, even under physiological conditions, may be attributed to a small number of factors: (a) large among-subject variability of brain size and shape, (b) large variability of state-dependent brain activity, (c) narrow margins of functional recruitment relative to baseline activity, (d) overlap of functional and anatomical (partial volume) effects, (e) heterogeneity of complex regional interaction patterns, and (f) typically small sample sizes. Only the degree of psychophysical stimulation and of apparent asymmetry due to lateral head tilt can be controlled to some extent by optimum procedural standardization. But this is all that is needed for single-case analysis by visual image reading because an expert using a set of morphological reference images can mentally transform the images to separate artifactual and individual anatomical effects from meaningful functional information and to test multiple physiological hypotheses by comparison with standards formed in his/her memory. At least in the multivariate case or for the analysis of univariate images from groups of subjects, however, numerical methods must be resorted to, taking into account all the problem factors mentioned above. Unfortunately, their greater efficiency and objectivity are typically achieved at the expense of additional assumptions and of data reduction resulting in the loss of potentially valuable information.

The ideal analytical tool would not only provide descriptive statistics that make sense in functional anatomical terms. Based on standard measures of statistical significance, it would also permit one to arrive at firm conclusions about the structure of regional effects. This may be attempted using either exploratory or inferential methods. The former are mostly characterized by ease of computation and a
relative lack of requirements as to the construction of customized hypotheses. But contrary to a widely held belief, they often make (and clearly violate) numerous untested assumptions about the nature of the data, and their results are nondescriptive and nontransitive and cannot be substantiated by a reliable error estimate. Inferential standard methods, by contrast, may be more difficult to apply because of the need for setting up specific and at times rather complicated hypotheses, but they always attach a meaningful probabilistic statement to their descriptive results.

IMAGE DATA TRANSFORMATION

With few exceptions (e.g., Levy, 1991), all statistical methods start from matrices of regional image data obtained according to one of several proposed region-of-interest (ROI) protocols (Bohm et al., 1985; Fox et al., 1985; Herholz et al., 1985; Pawlik et al., 1986; Evans et al., 1988) aiming at improved comparability among subjects. As the number of repetitions (ROIs) typically exceeds the number of replicates (subjects), nonparametric statistical tests can be applied in only a few selected instances. Available parametric procedures, however, are not very robust with regard to violations of the assumption of normal data distribution and homogeneity of variances. Therefore, the need for transformation should always be tested first, using, e.g., the Box-Cox algorithm with $\chi^2$ test that also provides a coefficient for power transformation (Pawlik, 1988).

If the analysis is to be based on multivariate similarity among regions, this implies a Q-type analysis where the individual regions represent the data points suspended in the space that is spanned by the dependent measures obtained within each of those regions. These dependent variables may not only differ in scale; typically, they also are more or less intercorrelated. Therefore, some of the apparent similarity between certain regions, and the dissimilarity between others, may be caused by nonorthogonality of the coordinate system and by different lengths of its axes. This bias can be removed by means of a Mahalanobis transformation of the original variates that eliminates the overall correlation between the variables and standardizes the variance of each variable, thus leaving only the intrinsic similarity among the ROIs (Mardia et al., 1979).

EXPLORATORY METHODS

No matter how many variables are measured within each ROI, the data from all regions are clearly dependent for procedural and neurophysiological reasons. Furthermore, regional intercorrelations are quite heterogeneous (Horwitz et al., 1986), thus precluding multiple comparisons among ROIs without adjustment for the degree of their statistical interdependence. For single-case studies, this implies that only exploratory analyses can be performed because there is no reliable error term over which significance can be assessed. There also are exploratory methods for homogeneous test samples, but they are of limited use, since inferential statistical tests are available for this type of design that take into account the essentially multivariate, dependent nature of multiregional image data.

Results from univariate single-case studies can be explored by means of significance probability mapping. However, this approach necessitates a properly matched reference sample: Single-observation t tests are performed region by region, and the obtained P values are displayed as a pseudocolor-coded region map (Pawlik, 1988). When this procedure is applied to raw data, a low P value simply suggests some regional abnormality, without providing any clue as to its physiological interpretation. It may reflect global or patterned effects. However, according to the principles of full factorial linear models decomposing the total variance into its additive components, to elucidate the entire interaction structure, the screening power of significance probability mapping can be improved substantially by performing separate tests on complementary topographic aspects of the functional variable under study: (1) a whole-brain value weighting the contributing ROIs by their individual size, (2) the bihemispheric mean region pattern [e.g., r = mean value of homotopic ROIs minus (1)], and (3) the pattern of regional asymmetries (e.g., right-minus-left difference of homotopic ROIs in either hemisphere) corresponding with the group × region × side interaction in a factorial design (see below).

If a single-case study has provided multivariate data for each region, and if it is known from previous studies that a certain functional abnormality expresses itself in some change of the variates that is similar for any location within the brain, maximum likelihood cluster analysis may be used. This method iteratively partitions the ROIs into a "diseased" and a "healthy" stratum. It is quite efficient in detecting the affected regions, even without any external references (Pawlik et al., 1986). Compared with a panel of experts, this automatic classification scheme achieved an overall accuracy $A_z$ (Swets, 1979) of 0.87.

INFERENTIAL METHODS

The results of inferential statistics can only be as good as the hypotheses to be tested. Therefore, to avoid looking for a needle in a haystack, it is generally advisable to restrict the analysis to those ROIs pertaining to the scientific problem under investigation.

J Cereb Blood Flow Metab, Vol. 11, Suppl. 1, 1991


When only a single variable is quantified in each ROI, or when several measures (e.g., CMRglc and CBF) can be transformed into a single variate (e.g., cerebral arteriovenous difference for glucose), there is a choice of methodological options. If the data have a multivariate normal distribution, i.e., if all regions across the sample are linearly related in pairs, and if the number of subjects is much larger than the number of ROIs to be analyzed (which may be readily achieved with autoradiography but difficult with PET), in general, overall significance testing by means of multivariate (MANOVA) analysis of variance (ANOVA) is most efficient. More often, though, univariate repeated-measures analysis of variance (RM-ANOVA), treating the regional data as repetitions in space without assuming a specific correlation structure, is more appropriate because it requires only univariate normality and homogeneity of the variances across replicates, as mostly afforded by the Box-Cox transformation. In principle, it works like an ordinary factorial ANOVA. However, while the latter assumes statistical independence among the levels of a factor, i.e., their covariance matrix must exhibit sphericity, RM-ANOVA either decreases the degrees of freedom of the critical F values of the repeated-measures factor according to a correction term for the degree of nonsphericity estimated from the sample covariance matrix (Huynh and Feldt, 1976), or its covariance structure is directly adjusted for by maximum likelihood estimation of its model parameters (Schluchter, 1988). These adjustments for heterogeneous intercorrelation patterns hardly take a toll on the high efficiency of RM-ANOVA, and therefore they should not be replaced by conservative Bonferroni-type methods for the protection of the maximum experimentwise error rate $\alpha$ that are reserved to certain multiple, nonorthogonal comparisons. If the latter procedures must be resorted to, e.g., when various groups of subjects are to be compared in k pairs, either Šidák's method (1967) using a comparisonwise error rate of $1 - (1 - \alpha)^{1/k}$ or one of the more recently developed simultaneous test procedures (Holland and DiPonzio Copenhaver, 1988) offers distinct power advantages over the well-known Bonferroni approach.

When RM-ANOVA indicates a significant interaction or main effect involving the ROI factor, appropriate post hoc contrasts may be used for explanatory purposes (Winer, 1971). Alternatively, instead of computing an overall F value, orthogonal region contrasts may be tested right away; but again, their number must not exceed the degrees of freedom of the repeated-measures factor decreased according to the degree of nonsphericity, or their comparisonwise error rate must be set to a lower critical level. In general, a set of contrasts is called orthogonal if and only if the sum of the sample sizes times the products of their coefficients equals zero. This approach works best when the ROI matrix can be arranged in the order of a specific rank hypothesis. Meaningful results may then be expected, e.g., from a comparison of each region with the mean of all higher ranking regions (Helmert contrasts) or with the region next in rank (profile contrasts). The same procedure works most efficiently in the conduction of multiple comparisons between subsets of subjects, where numerous (up to the degrees of freedom of the grouping factor) orthogonal a priori contrasts can be set up, without requiring any adjustment of P values.

RM-MANOVA is the method of choice when more than one dependent variable is measured in each ROI and when the requirements of multivariate normality and sufficient sample size are met.

As with significance probability mapping, the inferential methods described may be applied most efficiently by testing separately for global effects and for specific regional contributions to brain function, thus eliminating the need for individual ROI weighting according to voxel size, in the assessment of overall effects across regions.
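The separation of global effects from regional contributions, as listed for significance probability mapping under EXPLORATORY METHODS, can be illustrated with a minimal sketch. It assumes that each homotopic ROI pair has one left and one right value plus a size per ROI; the function name `decompose_regions` and the data layout are hypothetical and not taken from the cited work.

```python
def decompose_regions(left, right, size_left, size_right):
    """Split homotopic ROI values into three complementary aspects.

    All arguments are equal-length lists indexed by homotopic ROI pair.
    Names and layout are illustrative assumptions.
    """
    # (1) Whole-brain value: mean over all ROIs, weighted by ROI size.
    total_size = sum(size_left) + sum(size_right)
    weighted_sum = (sum(v * s for v, s in zip(left, size_left))
                    + sum(v * s for v, s in zip(right, size_right)))
    whole_brain = weighted_sum / total_size

    # (2) Bihemispheric pattern: mean of each homotopic pair minus (1).
    pattern = [(l + r) / 2 - whole_brain for l, r in zip(left, right)]

    # (3) Asymmetry: right-minus-left difference of each homotopic pair.
    asymmetry = [r - l for l, r in zip(left, right)]
    return whole_brain, pattern, asymmetry


if __name__ == "__main__":
    # Two homotopic ROI pairs with made-up values and sizes.
    print(decompose_regions([4.0, 6.0], [6.0, 8.0], [3.0, 1.0], [3.0, 1.0]))
```

Each of the three outputs could then be tested separately, so that ROI size enters only through the whole-brain value.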

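As a small worked example of the power argument for Šidák's method, the sketch below compares its comparisonwise error rate, $1 - (1 - \alpha)^{1/k}$, with the Bonferroni rate $\alpha / k$. The function names and the chosen values of $\alpha$ and k are illustrative assumptions only.

```python
def bonferroni_rate(alpha, k):
    """Comparisonwise error rate alpha / k (Bonferroni approach)."""
    return alpha / k


def sidak_rate(alpha, k):
    """Comparisonwise error rate 1 - (1 - alpha)**(1/k) (Sidak's method)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


if __name__ == "__main__":
    alpha = 0.05  # maximum experimentwise error rate (illustrative value)
    for k in (3, 6, 10):  # number of pairwise comparisons (illustrative)
        print(k, round(bonferroni_rate(alpha, k), 5), round(sidak_rate(alpha, k), 5))
```

For $\alpha$ = 0.05 and k = 10, the Šidák rate is about 0.0051 against 0.005 for Bonferroni; the slightly larger per-comparison threshold is the source of the power advantage.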

CONCLUSIONS

Powerful statistical methods are available for the analysis of functional neuroimaging data from multiple, anatomically standardized ROIs. While single cases can be assessed only by exploratory techniques, observations on groups of subjects commonly are analyzed most efficiently using advanced RM-ANOVAs or adjusted orthogonal contrasts that require neither a specific covariance structure nor unrealistic sample sizes. Exploratory methods should be applied to sample data only for "data snooping" or, on those rare occasions, when appropriate inferential test procedures are missing.
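To make the notion of orthogonal region contrasts concrete, the following sketch builds Helmert and profile contrast coefficients for ranked ROIs and checks them for mutual orthogonality. It rests on two assumptions: sample sizes are equal, so the orthogonality condition reduces to a zero sum of coefficient products, and "higher ranking" is read as "later in the ordering". All function names are hypothetical.

```python
from fractions import Fraction


def helmert_contrasts(n_regions):
    """Row i compares region i with the mean of all later-ranked regions."""
    rows = []
    for i in range(n_regions - 1):
        n_later = n_regions - i - 1
        row = [Fraction(0)] * i + [Fraction(1)] + [Fraction(-1, n_later)] * n_later
        rows.append(row)
    return rows


def profile_contrasts(n_regions):
    """Row i compares region i with the region next in rank."""
    rows = []
    for i in range(n_regions - 1):
        row = [Fraction(0)] * n_regions
        row[i], row[i + 1] = Fraction(1), Fraction(-1)
        rows.append(row)
    return rows


def are_orthogonal(contrasts):
    """True if every pair of contrasts has a zero sum of coefficient products.

    Assumes equal sample sizes, so the sizes drop out of the condition.
    """
    for a in range(len(contrasts)):
        for b in range(a + 1, len(contrasts)):
            if sum(x * y for x, y in zip(contrasts[a], contrasts[b])) != 0:
                return False
    return True


if __name__ == "__main__":
    print(are_orthogonal(helmert_contrasts(4)))
    print(are_orthogonal(profile_contrasts(4)))
```

Under this check the Helmert set is mutually orthogonal, whereas neighboring profile contrasts share a region and are not, so the latter would call for a lower comparisonwise critical level. Exact fractions are used to avoid rounding issues in the zero test.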

REFERENCES

Bohm C, Greitz T, Kingsley D, Berggren BM, Olsson L (1985) A computerized individually variable stereotaxic brain atlas. In: The Metabolism of the Human Brain Studied with Positron Emission Tomography (Greitz T, Ingvar DH, Widen L, eds), Raven Press, New York, pp 85-90

Evans AC, Beil C, Marrett S, Thompson CJ, Hakim A (1988) Anatomical-functional correlation using an adjustable MRI-based region-of-interest atlas with positron emission tomography. J Cereb Blood Flow Metab 8:513-530

Fox PT, Perlmutter JS, Raichle ME (1985) A stereotactic method of anatomical localization for positron emission tomography. J Comput Assist Tomogr 9:141-153

Herholz K, Pawlik G, Wienhard K, Heiss WD (1985) Computer-assisted mapping in quantitative analysis of cerebral positron emission tomograms. J Comput Assist Tomogr 9:154-161

Holland BS, DiPonzio Copenhaver M (1988) Improved Bonferroni-type multiple testing procedures. Psychol Bull 104:145-149

Horwitz B, Duara R, Rapoport SI (1986) Age differences in intercorrelations between regional cerebral metabolic rates for glucose. Ann Neurol 19:60-67

Huynh H, Feldt LS (1976) Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. J Ed Stat 1:69-82

Levy A, Laska E, Brodie JD, Volkow ND, Wolf AP (1991) The spectral signature method for the analysis of PET brain images. J Cereb Blood Flow Metab 11:A103-A113

Mardia KV, Kent JT, Bibby JM (1979) Multivariate Analysis. London, Academic Press

Pawlik G (1988) Positron emission tomography and multiregional statistical analysis of brain function: from exploratory methods for single cases to inferential tests for multiple group designs. In: Progress in Computer-Assisted Function Analysis (Willems JL, van Bemmel JH, Michel J, eds), Amsterdam, North-Holland, pp 401-408

Pawlik G, Herholz K, Wienhard K, Beil C, Heiss WD (1986) Some maximum likelihood methods useful for the regional analysis of dynamic PET data on brain glucose metabolism. In: Information Processing in Medical Imaging (Bacharach SL, ed), Dordrecht, Martinus Nijhoff, pp 298-309

Schluchter MD (1988) Analysis of incomplete multivariate data using linear models with structured covariance matrices. Stat Med 7:317-324

Šidák Z (1967) Rectangular confidence regions for the means of multivariate normal distributions. J Am Stat Assoc 62:626-633

Swets JA (1979) ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol 13:109-121

Winer BJ (1971) Statistical Principles in Experimental Design. New York, McGraw-Hill

