This article was downloaded by: [Central Michigan University] On: 13 November 2014, At: 09:51 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of the American Statistical Association Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uasa20

A Nonparametric Spatial Model for Periodontal Data With Nonrandom Missingness a

b

Brian J. Reich , Dipankar Bandyopadhyay & Howard D. Bondell a

a

Department of Statistics , North Carolina State University , Raleigh , NC , 27695

b

Division of Biostatistics, School of Public Health , University of Minnesota , Minneapolis , MN , 55455 Accepted author version posted online: 28 Apr 2013.Published online: 27 Sep 2013.

To cite this article: Brian J. Reich , Dipankar Bandyopadhyay & Howard D. Bondell (2013) A Nonparametric Spatial Model for Periodontal Data With Nonrandom Missingness, Journal of the American Statistical Association, 108:503, 820-831, DOI: 10.1080/01621459.2013.795487 To link to this article: http://dx.doi.org/10.1080/01621459.2013.795487

PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

A Nonparametric Spatial Model for Periodontal Data With Nonrandom Missingness

Downloaded by [Central Michigan University] at 09:51 13 November 2014

Brian J. REICH, Dipankar BANDYOPADHYAY, and Howard D. BONDELL Periodontal disease (PD) progression is often quantified by clinical attachment level (CAL) defined as the distance down a tooth’s root that is detached from the surrounding bone. Measured at six locations per tooth throughout the mouth (excluding the molars), it gives rise to a dependent data setup. These data are often reduced to a one-number summary, such as the whole-mouth average or the number of observations greater than a threshold, to be used as the response in a regression to identify important covariates related to the current state of a subject’s periodontal health. Rather than a simple one-number summary, we set forward to analyze all available CAL data for each subject, exploiting the presence of spatial dependence, nonstationarity, and nonnormality. Also, many subjects have a considerable proportion of missing teeth, which cannot be considered missing at random because PD is the leading cause of adult tooth loss. Under a Bayesian paradigm, we propose a nonparametric flexible spatial (joint) model of observed CAL and the location of missing tooth via kernel convolution methods, incorporating the aforementioned features of CAL data under a unified framework. Application of this methodology to a dataset recording the periodontal health of an African-American population, as well as simulation studies reveal the gain in model fit and inference, and provides a new perspective into unraveling covariate–response relationships in the presence of complexities posed by these data. KEY WORDS:

Attachment level; Dirichlet process; Kernel convolution; Nonnormality; Nonstationarity

1. INTRODUCTION Periodontitis, a form of periodontal disease (PD), is a chronic inflammatory disease of the periodontium triggered by bacterial plaque, and is characterized by gingivitis, destruction of the alveolar bone and periodontal ligament, apical migration of the epithelial attachment resulting in formation of periodontal pockets, and ultimately loosening and exfoliation of teeth. According to the Position Paper of the American Academy of Periodontology (Burt and Research, Science and Therapy Committee of the American Academy of Periodontology 2005), the incidence of severe generalized PD ranges between 5% and 15% for any population, however a vast majority of the adults remain affected by some moderate form leading to severe impairment in the quality of life. Dental hygienists measure the progression of PD in a subject via the clinical attachment level (CAL), one of the most important (clinical) surrogate endpoints (Nicholls 2003) of PD. The motivating example for this work comes from a clinical study conducted at the Medical University of South Carolina (MUSC) on Type-2 diabetic Gullah-speaking AfricanAmericans (henceforth GAAD study, more details appear in Section 2). The underlying statistical question is to estimate the connection between CAL and various determinants (covariates) of PD, such as age, gender, body mass index (BMI status), glycemic control, etc. CAL measured throughout the whole mouth (six locations per tooth excluding the molars) gives rise to a dependent data setup, and quantifying a subject’s latent Both the first and second author contributed equally. Brian J. Reich is Assistant Professor, Department of Statistics, North Carolina State University, Raleigh, NC 27695 (E-mail: brian [email protected]). Dipankar Bandyopadhyay is Associate Professor, Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 (E-mail: [email protected]). Howard D. Bondell is Associate Professor, Department of Statistics, North Carolina State University, Raleigh, NC 27695 (E-mail: [email protected]). The authors thank the Center for Oral Health Research (COHR) at the Medical University of South Carolina, and particularly the study Principal investigator Dr. J. Fernandes for providing the motivating data, and the context behind this work, as well as the editor, associate editor, and referees for their valueable contributions. The research of the authors were supported by Grants P20RR017696-06, R03DE020114, R03DE021762, P01 CA142538-01, R01 ES014843-02, and R01 MH084022-01A1 from the US National Institutes of Health.

disease status from these extensive site-level data can be challenging. The whole-mouth average CAL is routinely used as a response to study covariate effects via regression. However, the whole-mouth average may be far from representative of periodontal health when data are missing not at random, as one would expect higher CAL values to be associated with tooth loss. In addition, there is inherent loss of information by pooling. Rather than aggregating the data, we analyze all available data for each patient. The process of periodontal decay might have a spatial morphology (Reich, Hodges, and Carlin 2007; Reich and Bandyopadhyay 2010), that is, a diseased tooth-site can influence the decay-status of a set of neighboring tooth-sites (and not the whole mouth), and thus the model must account for spatial dependence. In practice, spatial analyses often assume normality of the outcomes; on the contrary, CAL values are often right-skewed, with a majority of the values concentrated around the lower levels but with an important minority having much higher levels (L´opez et al. 2001; Do et al. 2003). In addition to nonnormality, we also suspect nonstationarity, for example, the variance and skewness of periodontal responses for the posteriorly located molars are quite different from the anterior incisors. Furthermore, CAL values are missing if and only if the tooth is missing, and since extreme PD can lead to missing teeth, the missing data mechanism is nonignorable. Model-based methods to estimate the effects of covariates on PD should accommodate the aforementioned concerns. Various spatial methods are available to account for nonnormality (e.g., Gelfand, Kottas, and MacEachern 2005; Griffin and Steel 2006; Reich and Fuentes 2007; Rodriguez and Dunson 2011; Fonseca and Steel 2011; Reich 2012), nonstationarity (e.g., Sampson and Guttorp 1992; Higdon, Swall, and Kern 1999; Fuentes 2002; Schmidt and O’Hagan 2003; Paciorek and Schervish 2006), and data with informative observation locations (Diggle, Manezes, and Su 2010; Reich and Bandyopadhyay 2010; Pati, Reich, and

820

© 2013 American Statistical Association Journal of the American Statistical Association September 2013, Vol. 108, No. 503, Applications and Case Studies DOI: 10.1080/01621459.2013.795487

Downloaded by [Central Michigan University] at 09:51 13 November 2014

Reich, Bandyopadhyay, and Bondell: A Nonparametric Spatial Model for Periodontal Data With Nonrandom Missingness

Dunson 2011). In this article, we combine important features of several of these methods, and tailor them to the special features of PD data. An alternative to non-Gaussian modeling is to identify a suitable transformation and apply a multivariate Gaussian model to the residuals. However, this makes interpretation less clear, and even if the transformed marginal distribution is Gaussian, there is no assurance that the joint distributions are Gaussian (Jara, Quintana, and Mart´ın 2008). Therefore, we prefer a flexible model on the original scale. Another approach is to avoid model-based methods altogether using a generalized estimating equation (GEE)-type analysis (Diggle et al. 2002). However, these methods may lack power compared with valid likelihood-based approaches, especially in high dimensions (in our case, each subject has 168 measure locations, and thus a 168 × 168 covariance matrix must be estimated). Our approach centers our prior on the stationary Gaussian model and allows for flexibility (nonstationary, non-Gaussian) while using subjectmatter knowledge (symmetry of the mouth, spatial proximity, etc.) in the prior. Our spatial model builds on the kernel convolution (KC) approach of Higdon, Swall, and Kern (1999), who approximated a Gaussian process as a linear combination of kernel basis functions, and allowed for nonstationarity by assigning each kernel a different bandwidth. We also build on Reich and Bandyopadhyay (2010) by jointly modeling the responses and the missing data locations in a multivariate spatial model. To allow for nonGaussian responses in the KC framework, we model the distribution of the kernel coefficients using nonparametric Bayesian (Hjort et al. 2010) methods. The distribution is allowed to vary spatially to permit different shapes for the response distribution in different regions of the mouth. Rather than allow the kernel bandwidth and coefficient density to vary arbitrarily through space, we exploit the natural symmetry of the mouth (i.e., the mouth can be partitioned into four similar quadrants) to borrow strength across the mouth to efficiently estimate these complex features. Similarly, we allow the strength of association between the underlying disease status and probability of a missing tooth to vary by tooth type, since the likely causes of missing teeth may be different for molars than incisors. The proposed flexible model that permits nonnormality, nonstationarity, and nonrandom missingness leads to simple conjugate full conditional distributions for all but a few spatial correlation parameters, leading to relatively straightforward Markov Chain Monte Carlo (MCMC) coding and tuning. The remainder of this article proceeds as follows. Sections 2 and 3 present the data and the statistical model, respectively. Computing details using a Bayesian framework are given in Section 4. We study the finite sample performance of our methodology using a simulation study in Section 5, which shows that failing to account for the above features (typical for PD data) can dramatically degrade estimation of fixed effects. In Section 6, we analyze the GAAD dataset where we find strong skewness for molars, higher spatial correlation between incisors than other teeth, and strong evidence of nonrandom missingness. Section 7 provides some concluding remarks. 2. CAL DATA AND EXPLORATORY ANALYSIS The dataset described in Section 1 was collected as part of a clinical study (Fernandes et al. 2009) conducted at MUSC

9

8

821

7

11

Gap 6

12

5 Gap

13

4 3

14

AL(mm)

15

2

>5 5 4 3 2 0−1

18

31 30

19

29

20 28

21 27

22 23

24

25

26

Figure 1. Measurement locations and sample data for one subject. The vertical and horizontal lines separate the mouth into its four quadrants. “Gap” identifies as an example the four sites in the gap between teeth 4 and 5.

primarily aimed at exploring the relationship between PD and diabetes (as determined by Hba1c or glycosylated hemoglobin) in GAAD (13 years or older) residing in the coastal islands of South Carolina. CAL, defined as the depth (in mm) measured from the cemento-enamel junction to the bottom of the gingival sulcus for each site corresponding to a tooth, was measured for six prespecified sites per tooth (excluding the third molars) via a periodontal probe, giving 168 measurement locations. Figure 1 shows the locations of these measurement sites for one subject who has an incisor missing. Of the six sites on each tooth, our model distinguishes between the four in a gap between teeth and the two that are not. Our model also classifies teeth as molars (tooth numbers 2–3, 14–15, 18–19, and 30–31), premolars (4–5, 12–13, 20–21, and 28–29), canines (6, 12, 22, 27), and incisors (7–10, 23–26), and into four quadrants: two on the upper jaw (teeth 2–8 and 9–15) and two on the lower jaw (18–24 and 25–31). For any particular tooth, the “tongue-side” (lingual) locations refer to the three sites adjacent to (or the direction toward) the tongue that are closer to the center of the oral cavity, while the “cheek side” (buccal) refers to the three sites adjacent to the cheeks/lips that are farther away from the center. Several subject-level covariates considered as possible determinants of PD such as age (in years), gender (1 = female, 0 = male), body mass index or BMI (in kg/m2 ), smoking status (1 = smoker, 0 = never smoker), and HbA1c (1 = high, 0 = controlled) were included as fixed effects. The absolute pairwise correlation between these variable is no more than 0.2 for any pair. We also include spatial covariates such as tooth-type indicators, site in jaw, and site in gap (details appear in Section 3). We selected 199 of the 279 subjects having complete covariate information and at least one nonmissing tooth. The density plots (collapsed by tooth type) in Figure 2(a) show that the data’s density varies from fairly symmetric (although

Downloaded by [Central Michigan University] at 09:51 13 November 2014

822

Journal of the American Statistical Association, September 2013

Figure 2. Plots of the observed CAL data. Panel (a) gives the sample frequency of CAL for each tooth type (CAL is rounded to the nearest mm), and panel (b) gives the 42 × 42 sample covariance of CAL for the 3 × 14 = 42 sites the tongue side of upper jaw (i.e., teeth 2–15 in Figure 1, the vertical and horizontal lines separate the two quadrants).

slightly right-skewed, mostly due to the boundary effect) for incisors to considerably right-skewed for molars. Nonnormality persists after log (after adding one to the responses) and square root transformations, therefore we model the untransformed data for interpretability. In addition to nonnormality, there is also evidence of nonstationarity reflecting differential rates of PD for the various types of tooth locations (anteriorly located incisors vs. posteriorly located molars). Figure 2(b) reveals a complex covariance structure. The variance is larger for molars than the other types of teeth. There is evidence of dependence between adjacent sites, especially for molars and incisors, as well as long-range dependence between molars on both sides of the mouth, for example, sites on teeth 3 and 14. Finally, dependence between adjacent sites depends on whether sites are in the gap between teeth, as evident in the lower covariance that appears in every third column/row for sites that are not in the gap between teeth. In the next section, we describe a model that allows for the nonnormality and the complex covariance structure observed in this exploratory data analysis.

3. FLEXIBLE SPATIAL MODEL FOR CAL 3.1 General Framework Let yi (s) be the observation for subject i = 1, . . . , n at spatial location s. For each subject, there are the same N = 168 potential measurement locations s1 , . . . , sN . Denote yi = [yi (s1 ), . . . , yi (sN )]T as the vector of responses for subject i. To account for nonrandomly missing teeth, we jointly model the response and the location of missing teeth. For these data, typical for CAL data, either all the measurements from a tooth are observed or all observations are missing. Therefore, we define the missing tooth process at the tooth level, rather than site level, with δi (t) = 1 if tooth t is missing for subject i and δi (t) = 0 otherwise. We model CAL using a spatial model that exploits the natural symmetry of the mouth. That is, rather than simply modeling

correlation via spatial distance, we account for correlation between teeth of the same type but in different quadrants. For example, due to genetic or hygienic factors, it may be that a subject has low CAL in most of the mouth, but high CAL for molars in all four quadrants. This dependence is accounted for by subject-level random effects. Let Z be the N × q random effect design matrix. We include q = 6 random effects: indicators for the four types of teeth, an indictor of whether the site is located on the upper jaw, and an indictor of whether the site is in a gap between teeth (rather than on the side of a tooth). The fixed effects mentioned in Section 2 are included in the N × p design matrix Xi . These covariates do not vary spatially, and thus the rows of Xi are identical. The design matrix Xi does not include an intercept or spatial covariates such as tooth type, since these effects are captured by the mean of the random effects. The CAL values for subject i are modeled as yi = μi + ε i and μi = Xi b + Zai + θ i ,

(1)

where μi = [μi (s1 ), . . . , μi (sN )]T is the vector of true CAL values for subject i and εi ∼ N(0, σ 2 IN ) is the vector of errors. The true CAL is decomposed as a function of the fixed effects b, subject random effects ai = (a1i , . . . , aqi )T , and spatial random effects θ i = [θi (s1 ), . . . , θi (sN )]T , whose distributions will be discussed in Sections 3.2 and 3.3, respectively. In this model, spatial dependence is split into low-resolution random effects such as those for tooth type and jaw, and high-resolution spatial effects θi (s) to account for small-scale spatial dependence from one site to the next on the same tooth or neighboring teeth. For missing data, we introduce a latent continuous spatial process zi (t), so that δi (t) = I (zi (t) > 0), for t = 1, . . . , T . The latent variable is modeled in terms of the average true (latent) CAL values for subject i at the six locations on tooth t, denoted DTt μi , where Dt is the N-vector with Dt (s) equal 1/6 if site s is on tooth t and zero otherwise. Then zi (t) ∼ N(μ∗i (t), 1) and μ∗i (t) = DTt [Xi b∗ + Za∗i ] + c(t)DTt μi

(2)

Reich, Bandyopadhyay, and Bondell: A Nonparametric Spatial Model for Periodontal Data With Nonrandom Missingness ∗ ∗ T independent over t and i, where b∗ and a∗i = (ai1 , . . . , aiq ) are the fixed and random effects for missing teeth, respectively. Integrating over the latent zi (t) gives the usual probit link

Downloaded by [Central Michigan University] at 09:51 13 November 2014

P[δi (t) = 1] = [μ∗i (t)],

(3)

where  is the standard normal distribution function. The relationship between CAL and missing teeth is controlled by c(t), which is allowed to vary by tooth. If c(t) > 0, then regions with high CAL are more likely to have missing teeth, and vice versa. We allow c(t) to vary by tooth type, but assume that it is constant for all teeth of the same type to borrow strength across teeth. We note that we have not included a spatial random effect analogous to θ i in the missing teeth model. We experimented with this model, but found that these spatial random effects were not well identified after including subject random effects, likely because with only T = 28 teeth for each subject, there is not enough information in the data to identify small-scale spatial dependence within a subject.

823

Gaussianity, the spatial terms are modeled using KC methods. Higdon, Swall, and Kern (1999) showed that an arbitrary Gaussian process θ (s) can be written as a KC of white noise,  K(||s − u||)Z(du), (5) θ (s) = R2

where K is a kernel function and Z is a white-noise process. The kernel function is related to the covariance via  2 cov [θ (s), θ (s + h)] = τZ K(||u||)K(||u + h||)du, (6) R2

where τZ2 controls the variance. For example, the Gaussian kernel function K(||u||) = exp(−||u||2 /ρ 2 ) gives the isotropic squared-exponential covariance function cov[θ (s), θ (s + h)] of the form exp(−||h||2 /ψ 2 ), where ψ is a function ρ. The integral (5) permits the approximation θ (s) =

L 

K(||s − ul ||)γl ,

(7)

l=1

3.2 Low-Resolution Spatial Model ∗ The subject-level random effects aik and aik control the lowresolution spatial trends for subject i. These random effects are iid

iid

∗ modeled as aik ∼ Fk and aik ∼ Fk∗ . Rather than specifying a parametric distribution, we model Fk nonparametrically using a Dirichlet process mixture (DPM) of normals (Ferguson 1973, 1974; Antoniak 1974). The DPM model is commonly used to capture uncertainty in the parametric form of a distribution. Below, we specify the DPM prior for an arbitrary distribution function F with associated density f (y). Using the stick-breaking representation (Sethuraman 1994), the DPM model for F is equivalent to modeling the density as an infinite mixture of normals

f (y) =

∞ 

    iid πj φ y|ηj , τ12 and ηj ∼ N m, τ22 ,

where {u1 , . . . , uL } is a fixed set of knots covering the spatial iid

domain of interest and γl ∼ N(0, τθ2 ). This yields the covariance function cov [θ (s), θ (s + h)] L  = τθ2 K(||s − ul ||)K(||s + h − ul ||),

which approximates (6) for large L and regular-spaced {u1 , . . . , uL }. Higdon, Swall, and Kern (1999) used this approximation to define nonstationary processes by having a different bandwidth for each term in the sum, and using a second spatial process to spatially smooth the bandwidths. Assuming the subjects share the same kernels, that is,

(4)

j =1

where φ(y|m, τ 2 ) is the Gaussian density function with mean m variance τ 2 , and the mixture weights satisfy πj > 0 and and ∞ j =1 πj = 1. The mixture probabilities “break the stick,” that is, the unit interval, into pieces that sum to one. The proportion of the stick iid

attributed to term j is determined by vj ∼ Beta(1, D). The first probability is π 1 = v1 . The remaining  terms are πj = vj (1 −  l

A nonparametric spatial model for periodontal data with non-random missingness.

Periodontal disease progression is often quantified by clinical attachment level (CAL) defined as the distance down a tooth's root that is detached fr...
511KB Sizes 0 Downloads 0 Views