Genetic Epidemiology

RESEARCH ARTICLE

Genome-wide Association and Network Analysis of Lung Function in the Framingham Heart Study

Shu-Yi Liao,1 Xihong Lin,1 and David C. Christiani1,2,∗

1 Harvard School of Public Health, Boston, Massachusetts, United States of America; 2 Harvard Medical School, Boston, Massachusetts, United States of America

Received 14 January 2014; Revised 30 April 2014; accepted revised manuscript 29 May 2014. Published online 8 July 2014 in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/gepi.21841

ABSTRACT: Genome-wide association studies have identified single nucleotide polymorphisms (SNPs) associated with pulmonary function. However, lung function is a complex trait that is likely to be influenced by multiple gene–gene interactions in addition to individual genes. Our goal was to build a cellular network to explore the relationship between pulmonary function and genotypes by combining SNP-level and network analyses using longitudinal lung function data from the Framingham Heart Study. We analyzed 2,698 genotyped participants from the Offspring cohort, who had an average of 3.35 spirometry measurements per person over a mean of 13 years. Repeated measurements of forced expiratory volume in one second (FEV1) and the ratio of FEV1 to forced vital capacity (FVC) were used as outcomes. Associations between lung function and alleles were analyzed using linear mixed models that accounted for the correlation among repeated measures over time within the same subject and for within-family correlation. Network analyses were performed using dmGWAS and validated with data from the Third Generation cohort. The analyses identified SMAD3, TGFBR2, CD44, CTGF, VCAN, CTNNB1, SCGB1A1, PDE4D, NRG1, EPHB1, and LYN as contributors to pulmonary function. Most of these genes are novel and were not found previously using SNP-level analysis alone. The novel genes are involved in the transforming growth factor beta (TGFB)-SMAD pathway and the Wnt/beta-catenin pathway, among others. Therefore, combining SNP-level and network analyses using longitudinal lung function data is a useful alternative strategy for identifying risk genes. Genet Epidemiol 38:572–578, 2014. © 2014 Wiley Periodicals, Inc.

KEY WORDS: GWAS; pulmonary function; network analysis; SMAD3; TGFBR2; CD44; CTGF; VCAN; CTNNB1; SCGB1A1; PDE4D; NRG1; EPHB1; LYN

Introduction

Chronic obstructive pulmonary disease (COPD) is a progressive lung disease in which impeded airflow makes breathing difficult. COPD is estimated to become the fourth leading cause of death by 2030 [Mathers and Loncar, 2006]. Cigarette smoke is the most important environmental risk factor for COPD, but the development of COPD is not universal in smokers. This phenomenon indicates that other factors contribute to the etiology of the disease. Pulmonary spirometric measurements, including forced expiratory volume in one second (FEV1) and ratio of FEV1 to forced vital capacity (FVC) (FEV1/FVC), are important indicators in the diagnosis of COPD and are heritable traits [Givelber et al., 1998; Wilk et al., 2000]. Although previous studies have primarily focused on cross-sectional analyses of adult lung function, pulmonary diseases such as COPD usually afflict patients in a certain age range. Cross-sectional studies may miss genetic effects by collecting lung function measurements at a time when genetic effects are weak.

Supporting Information is available in the online issue at wileyonlinelibrary.com.

∗Correspondence to: David C. Christiani, MD, MPH, MS, 665 Huntington Avenue, Building I Room 1401, Boston, Massachusetts 02115, USA. E-mail: [email protected]. edu

Recent genome-wide association studies (GWAS) have focused on the effects of individual genes on complex diseases and traits, but individual genes are unlikely to capture the underlying cellular network structure. Lung function is a complex trait that is likely influenced by multiple gene interactions and environmental factors rather than by a few individual genes. Network approaches to the genetics of complex diseases and traits therefore provide attractive tools for capturing this complexity [Silverman and Loscalzo, 2012]. Several methods have been developed to integrate proteomics with genetics. Because using a protein–protein interaction (PPI) network can reduce the number of multiple comparisons [Emily et al., 2009], we used the method of Jia et al. [2011] to integrate genome-wide association data and PPI networks. The advantage of this approach is that it uses all GWAS data, not only the top single nucleotide polymorphisms (SNPs) that other approaches analyze. We explored the associations between pulmonary function and genes by building a cellular network that combines SNP-level and network analyses using longitudinal lung function data from the Framingham Heart Study. Supporting the viability of this strategy, our results identified several genes that may be closely associated with pulmonary function.
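To make the idea of integrating GWAS signals with a PPI network concrete, the following is a minimal sketch of a greedy dense-module search in the spirit of Jia et al. [2011]. It assumes that gene-level P values are converted to z-scores with the inverse normal distribution function and that a module of k genes is scored as the sum of its z-scores divided by the square root of k. The function names, the toy network, the z-scores, and the growth threshold `r` are all hypothetical illustrations; this is not the dmGWAS package interface, and it expands only to direct neighbours and assumes positive module scores, which simplifies the published search.

```python
import math
from statistics import NormalDist


def p_to_z(p):
    """Convert a gene-level P value to a z-score (smaller P gives larger z)."""
    return NormalDist().inv_cdf(1.0 - p)


def module_score(genes, z):
    """Score a module as the sum of its z-scores over sqrt(module size)."""
    return sum(z[g] for g in genes) / math.sqrt(len(genes))


def grow_module(seed, ppi, z, r=0.1):
    """Greedily expand a module from a seed gene.

    At each step the direct PPI neighbour that gives the highest module
    score is added, provided the score improves by more than a factor
    of (1 + r); otherwise the search stops.
    """
    module = {seed}
    score = module_score(module, z)
    while True:
        # Candidate genes: PPI neighbours of the module that have a z-score.
        neighbours = {n for g in module for n in ppi.get(g, ())} - module
        neighbours = {n for n in neighbours if n in z}
        if not neighbours:
            break
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(neighbours),
                   key=lambda n: module_score(module | {n}, z))
        best_score = module_score(module | {best}, z)
        if best_score <= score * (1 + r):
            break
        module.add(best)
        score = best_score
    return module, score


# Toy PPI network (adjacency lists) and made-up gene-level z-scores.
ppi = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
       "D": ["B", "E"], "E": ["D"]}
z = {"A": 2.5, "B": 2.0, "C": 0.1, "D": 1.8, "E": -0.5}

module, score = grow_module("A", ppi, z)
print(sorted(module), round(score, 3))
```

Because the score is normalized by the square root of the module size, adding a weakly associated gene lowers the score, so the search favours connected groups of genes that each carry moderate signal. This is how a network analysis can use information from SNPs that would not pass a genome-wide significance threshold on their own.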

Figure 1. Participants of the Framingham Heart Study used in this

plus Affymetrix 50K supplemental arrays and was additionally checked for sex accordance and consistency with family structure, resulting in 9,237 participants. Quality testing was conducted using PLINK software (version 1.06, http://pngu.mgh.harvard.edu/purcell/plink/). For quality control, individuals with genotyping call-rates 5% (n = 34,110), or had a minor allele frequency
