Hindawi Publishing Corporation Computational and Mathematical Methods in Medicine Volume 2013, Article ID 843563, 12 pages http://dx.doi.org/10.1155/2013/843563
Research Article Robust Joint Analysis with Data Fusion in Two-Stage Quantitative Trait Genome-Wide Association Studies Dong-Dong Pan,1 Wen-Jun Xiong,2 Ji-Yuan Zhou,3 Ying Pan,4 Guo-Li Zhou,5 and Wing-Kam Fung6 1
Department of Statistics, Yunnan University, Kunming 650091, China Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 3 Department of Biostatistics, School of Public Health and Tropical Medicine, Southern Medical University, Guangzhou 510515, China 4 Department of Biology, Nanjing University, Nanjing 210093, China 5 College of Mathematics and Statistics, Chongqing University, Chongqing 40044, China 6 Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong 2
Correspondence should be addressed to Dong-Dong Pan;
[email protected] and Wing-Kam Fung;
[email protected] Received 6 July 2013; Accepted 29 July 2013 Academic Editor: Qizhai Li Copyright ยฉ 2013 Dong-Dong Pan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Genome-wide association studies (GWASs) in identifying the disease-associated genetic variants have been proved to be a great pioneering work. Two-stage design and analysis are often adopted in GWASs. Considering the genetic model uncertainty, many robust procedures have been proposed and applied in GWASs. However, the existing approaches mostly focused on binary traits, and few work has been done on continuous (quantitative) traits, since the statistical significance of these robust tests is difficult to calculate. In this paper, we develop a powerful ๐น-statistic-based robust joint analysis method for quantitative traits using the combined raw data from both stages in the framework of two-staged GWASs. Explicit expressions are obtained to calculate the statistical significance and power. We show using simulations that the proposed method is substantially more robust than the ๐นtest based on the additive model when the underlying genetic model is unknown. An example for rheumatic arthritis (RA) is used for illustration.
1. Introduction Genome-wide association studies (GWASs) have identified a large number of genomic regions (especially single-nucleotide polymorphisms (SNPs)) with a wide variety of complex traits/diseases. In a GWAS, two most common types of data, qualitative (or binary) and quantitative (or continuous) traits, are analyzed and two contentious points are often faced; one is how to construct the test statistic considering the genetic model uncertainty and the other is how to evaluate the statistical significance for controlling the false positive rates efficiently (e.g., [1, 2]). Considering these issues, a lot of work has been done on the binary trait in the past 10 years (e.g., [3โ7]). Computer algorithms have also been developed to calculated the significance level of robust tests in GWASs, taking into account the genetic model uncertainty [8]. However, few work has been done on continuous traits, only
recently So and Sham [9] proposed a MAX3 based on score test statistics, and Li et al. [10] gave a MAX3 based on ๐น-test statistics. Note that these tests just focus on single-marker analysis in one-stage analysis. Although the costs of whole-genome genotyping are decreasing with the high-throughput biological technology, the total costs for a GWAS are still very expensive due to the thousands of sampling units and huge amounts of singlenucleotide polymorphisms. In order to save the costs, the two-stage design and the corresponding statistical analysis where all the SNPs are genotyped in Stage 1 on a portion of the samples and the promising SNPs with small ๐-values (e.g., ๐ข1 ) = ๐พ.
(9)
(10)
(6)
(11)
We now give the detail to calculate the cut-off point and power above. The left side of (10) can be further expressed as Pr๐ป0 (๐น1MAX > ๐ข1 , ๐น๐ฝMAX > ๐ข๐ฝ )
,
โ1 ๓ธ ๓ธ [X๐ท(X๐ท X๐ท) X๐ท
Y๓ธ
๐ 1 1 ( โ ๐ฆ๐ + ๐ฆ๐ ) , โ ๐2 ๐=๐10 +๐11 +1 ๐=๐1 +๐20 +๐21 +1
Pr๐ป1 (๐น1MAX > ๐ข1 , ๐น๐ฝMAX > ๐ข๐ฝ ) .
โ1
2
=
๐
๐ฆ2 =
where ๐ข๐ฝ is the cut-off point for the joint statistic. Once we have ๐ข1 and ๐ข๐ฝ , the power is calculated by
๓ธ ๓ธ X๐ด ) X๐ด โ 1๐ (1๓ธ ๐ 1๐ ) 1๓ธ ๐ ] Y Y๓ธ [X๐ด (X๐ด
(๐๐ฝ๐ด )
๐ +๐ +๐
10 11 1 20 21 1 ( โ ๐ฆ๐ + โ ๐ฆ๐ ) , ๐1 ๐=๐10 +1 ๐=๐1 +๐20 +1
Pr๐ป0 (๐น1MAX > ๐ข1 , ๐น๐ฝMAX > ๐ข๐ฝ ) = ๐ผ/๐,
, โ1
๐น๐ฝ๐ด
๐ +๐
๐ฆ1 =
๐ +๐
10 1 20 1 ( โ๐ฆ๐ + โ ๐ฆ๐ ) , ๐0 ๐=1 ๐=๐1 +1
Since the genome-wide significance level is ๐ผ/๐, in order to control the false positive rate, we have
Y๓ธ [I๐ โ W(W๓ธ W) W๓ธ ] Y/ (๐ โ 6) 2 (๐๐ฝ๐
)
๐ฆ0 =
(5)
We now give the proposed robust joint analysis. In the framework of two-stage design GWAS of quantitative traits, the SNPs with ๐-values less than ๐พ will be genotyped on the remaining ๐2 subjects in Stage 2. Following the previous notation for Stage 1, corresponding to the recessive, additive, and dominant models, the genotype data in Stage 2 are denoted ๓ธ by G2๐
= (0๓ธ ๐20 +๐21 , 1๓ธ ๐22 ) , G2๐ด = (๐๐1 +1 , ๐๐1 +2 , . . . , ๐๐ )๓ธ , and
โ1
ร (โ๐ [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]) ,
(4)
โ 6)
= 1 โ Pr๐ป0 (๐น1MAX โค ๐ข1 ) โ Pr๐ป0 (๐น๐ฝMAX โค ๐ข๐ฝ )
(12)
+ Pr๐ป0 (๐น1MAX โค ๐ข1 , ๐น๐ฝMAX โค ๐ข๐ฝ ) . For controlling the type I error rate and calculating the power, we need to know the distribution or the asymptotic ๓ธ distribution of (๐1๐
, ๐1๐ด, ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท, RSS1 , RRS๐ฝ ) under both ๐ป0 and ๐ป1 .
4
Computational and Mathematical Methods in Medicine Note that whether ๐ป0 or ๐ป1 holds, RSS1 and RSS๐ฝ
๓ธ and (๐1๐
, ๐1๐ด , ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท) are mutually independent (the
proof is given in Appendix A). Denote the correlation ๓ธ matrix of (๐1๐
, ๐1๐ด, ๐1๐ท) by V1 = (V๐๐ )3ร3 , whose entries are V11 = V22 = V33 = 1, V12 = V21 = โ๐12 (2๐10 + ๐11 )/โ(๐10 + ๐11 )[๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ], V13 = V31 = โ๐10 ๐12 /โ(๐10 + ๐11 )(๐11 + ๐12 ), and V23 = V32 = โ๐10 (2๐12 + ๐11 )/โ(๐11 + ๐12 )[๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ], respecโ )3ร3 be the correlation matrix tively. Similarly, let V๐ฝ = (V๐๐ ๓ธ
โ โ โ โ โ = V22 = V33 = 1, V12 = V21 = of (๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท) with V11 โ โ๐2 (2๐0 + ๐1 )/โ(๐0 + ๐1 )[๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ], V13 = โ โ โ V31 = โ๐0 ๐2 /โ(๐0 + ๐1 )(๐1 + ๐2 ), and V23 = V32 = โ๐0 (2๐2 + ๐1 )/โ(๐1 + ๐2 )[๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]. Then, 2 , and we can derive that RRS1 /๐2 โผ ๐๐21 โ3 , RSS๐ฝ /๐2 โผ ๐๐โ6
V1 ๐ ๓ธ ๓ตจ๓ตจ )) , (๐1๐
, ๐1๐ด , ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท) ๓ตจ๓ตจ๓ตจ โผ ๐6 (06 , ๐2 ( ๓ตจ๐ป0 ๐๓ธ V๐ฝ (13)
Under ๐ป1 , for a given odds ratio OR = exp(๐ฝ1 ) for subjects with two copies of risk allele corresponding to recessive model or one copy of risk allele corresponding to additive or dominant models, we have the following: (i) when the true genetic model is recessive, V1 ๐ ๓ธ ๓ตจ๓ตจ )) , (๐1๐
, ๐1๐ด, ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด , ๐๐ฝ๐ท) ๓ตจ๓ตจ๓ตจ โผ ๐6 (๐๐
, ๐2 ( ๓ตจ๐ป1 ๐๓ธ V๐ฝ (15) ๓ธ
where ๐๐
= (๐1๐
๐
, ๐1๐
๐ด, ๐1๐
๐ท, ๐๐ฝ๐
๐
, ๐๐ฝ๐
๐ด, ๐๐ฝ๐
๐ท) with ๐1๐
๐
= โโ ๐1๐
๐ด =
where ๐ = (๐๐๐ )3ร3 is the correlation matrix between ๓ธ ๓ธ (๐1๐
, ๐1๐ด , ๐1๐ท) and (๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท) , with ๐11 = Corr (๐1๐
, ๐๐ฝ๐
) = โ ๐12 = =
๐ (๐10 + ๐11 ) ๐12 , ๐1 (๐0 + ๐1 ) ๐2
โ๐1 (๐10 + ๐11 ) [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]
๐13 = Corr (๐1๐
, ๐๐ฝ๐ท) =
โ๐1 (๐10 + ๐11 ) ๐0 (๐1 + ๐2 )
,
,
โ๐1 [๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ] (๐0 + ๐1 ) ๐2
๐ [๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ] , ๐1 [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]
,
โ๐2 (2๐0 + ๐1 ) ๐ฝ1 โ๐ [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ] โโ๐0 ๐2 ๐ฝ1 โ๐ (๐1 + ๐2 )
,
,
V1 ๐ ๓ธ ๓ตจ๓ตจ (๐1๐
, ๐1๐ด, ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด , ๐๐ฝ๐ท) ๓ตจ๓ตจ๓ตจ โผ ๐6 (๐๐ด , ๐2 ( )) , ๓ตจ๐ป1 ๐๓ธ V๐ฝ (17) ๓ธ
where ๐๐ด = (๐1๐ด๐
, ๐1๐ด๐ด , ๐1๐ด๐ท, ๐๐ฝ๐ด๐
, ๐๐ฝ๐ด๐ด , ๐๐ฝ๐ด๐ท) with
โ๐๐10 (๐11 + 2๐12 ) โ๐1 [๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ] ๐0 (๐1 + ๐2 ) โ๐๐10 ๐12 โ๐1 (๐11 + ๐12 ) (๐0 + ๐1 ) ๐2
,
โ๐๐10 (๐11 + 2๐12 ) โ๐1 (๐11 + ๐12 ) [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]
๐33 = Corr (๐1๐ท, ๐๐ฝ๐ท) = โ
๐๐10 (๐11 + ๐12 ) . ๐1 ๐0 (๐1 + ๐2 )
,
๐1๐ด๐
=
, ๐1๐ด๐ด = โโ
๐32 = Corr (๐1๐ท, ๐๐ฝ๐ด) =
โ๐1 (๐11 + ๐12 )
(ii) when the true genetic model is additive,
โ๐๐12 (2๐10 + ๐11 )
๐31 = Corr (๐1๐ท, ๐๐ฝ๐
) =
โโ๐10 ๐12 ๐ฝ1
,
(๐ + ๐1 ) ๐2 = โโ 0 ๐ฝ1 , ๐
๐๐ฝ๐
๐ท =
๐23 = Corr (๐1๐ด, ๐๐ฝ๐ท) =
๐1๐
๐ท =
๐๐ฝ๐
๐ด =
,
โ๐๐12 ๐10
๐22 = Corr (๐1๐ด, ๐๐ฝ๐ด) = โ
โ๐1 [๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ]
๐๐ฝ๐
๐
๐21 = Corr (๐1๐ด, ๐๐ฝ๐
) =
โ๐12 (2๐10 + ๐11 ) ๐ฝ1
(16)
Corr (๐1๐
, ๐๐ฝ๐ด ) โ๐๐12 (2๐10 + ๐11 )
(๐10 + ๐11 ) ๐12 ๐ฝ1 , ๐1
๐1๐ด๐ท = (14) ๐๐ฝ๐ด๐
=
โโ๐12 (๐11 + 2๐10 ) ๐ฝ1 โ๐1 (๐10 + ๐11 )
,
๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ๐ฝ1 , ๐1 โโ๐10 (๐11 + 2๐12 ) ๐ฝ1 โ๐1 (๐11 + ๐12 ) โโ๐2 (๐1 + 2๐0 ) ๐ฝ1 โ๐ (๐0 + ๐1 )
,
,
Computational and Mathematical Methods in Medicine
5 Table 1: Power comparison (๐ = 2000, ๐พ = 1 ร 10โ4 , ๐ผ = 0.05, and ๐ = 5 ร 105 ).
๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ๐ฝ1 , ๐ โโ๐0 (๐1 + 2๐2 ) ๐ฝ1 = , โ๐ (๐1 + ๐2 )
๐๐ฝ๐ด๐ด = โโ ๐๐ฝ๐ด๐ท
๐
(18) (iii) when the true genetic model is dominant, V1 ๐ ๓ธ ๓ตจ๓ตจ )) , (๐1๐
, ๐1๐ด, ๐1๐ท, ๐๐ฝ๐
, ๐๐ฝ๐ด, ๐๐ฝ๐ท) ๓ตจ๓ตจ๓ตจ โผ ๐6 (๐๐ท, ๐2 ( ๓ตจ๐ป1 ๐๓ธ V๐ฝ (19)
MAF
0.15 0.30 0.30 0.45 0.15 0.40 0.30 0.45 0.15 0.50 0.30 0.45
REC AFJ MAXFJ 7.5๐ โ 5 0.005 0.052 0.285 0.487 0.785 1.1๐ โ 4 0.009 0.086 0.470 0.711 0.938 1.0๐ โ 4 0.010 0.121 0.639 0.856 0.987
ADD AFJ MAXFJ 0.426 0.365 0.811 0.759 0.893 0.854 0.651 0.589 0.945 0.922 0.979 0.968 0.802 0.751 0.987 0.980 0.997 0.995
DOM AFJ MAXFJ 0.610 0.618 0.698 0.784 0.449 0.647 0.826 0.837 0.887 0.938 0.677 0.859 0.933 0.941 0.965 0.986 0.826 0.953
๓ธ
where ๐๐ท = (๐1๐ท๐
, ๐1๐ท๐ด , ๐1๐ท๐ท, ๐๐ฝ๐ท๐
, ๐๐ฝ๐ท๐ด , ๐๐ฝ๐ท๐ท) with ๐1๐ท๐
๐1๐ท๐ด
=
=
โโ๐12 ๐10 ๐ฝ1 โ๐1 (๐10 + ๐11 )
,
๐
โ๐10 (๐11 + 2๐12 ) ๐ฝ1 โ๐1 [๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ]
๐1๐ท๐ท = โโ ๐๐ฝ๐ท๐
๐๐ฝ๐ท๐ด =
Table 2: Power comparison (๐ = 2000, ๐พ = 2 ร 10โ4 , ๐ผ = 0.05, and ๐ = 5 ร 105 ).
=
๐10 (๐11 + ๐12 ) ๐ฝ1 , ๐1 โโ๐2 ๐0 ๐ฝ1
โ๐ (๐0 + ๐1 )
(20)
,
โ๐0 (๐1 + 2๐2 ) ๐ฝ1 โ๐ [๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ]
๐๐ฝ๐ท๐ท = โโ
,
,
๐0 (๐1 + ๐2 ) ๐ฝ1 . ๐
We develop a method for simplifying the calculations of Pr๐ป0 (๐น1MAX โค ๐ข1 ) and Pr๐ป0 (๐น๐ฝMAX โค ๐ข๐ฝ ) and Pr๐ป0 (๐น1MAX โค ๐ข1 , ๐น๐ฝMAX โค ๐ข๐ฝ ). The details are included in Appendix B, and the calculations of Pr๐ป1 (๐น1MAX โค ๐ข1 ) and Pr๐ป1 (๐น๐ฝMAX โค ๐ข๐ฝ ) and Pr๐ป1 (๐น1MAX โค ๐ข1 , ๐น๐ฝMAX โค ๐ข๐ฝ ) are essentially similar.
3. Results 3.1. Power Comparison. We conduct simulation studies to evaluate the performance of the proposed method under three commonly used genetic models (recessive, additive, and dominant models). We mainly compare the power of two approaches; one is the proposed method in this paper, and the other is the joint analysis based on the ๐น-test statistics ๐น1๐ด and ๐น๐ฝ๐ด. For convenience, we refer to the proposed method as MAXFJ and AFJ for the other one. We choose the sample size ๐ = 2000, and ๐ = 5 ร 105 . The proportion of subjects genotyped in Stage 1 has three levels ๐ = 0.3, 0.4, 0.5. We set the genome-wide significance level as ๐ผ = 0.05 and that the significance level per SNP as ๐ผ/๐ = 1 ร 10โ7 . In Stage 1, the ๐-value threshold for SNPs selected for followup is set to be 1 ร 10โ4 and 2 ร 10โ4 . We assume that the Hardy-Weinberg
MAF
0.15 0.30 0.30 0.45 0.15 0.40 0.30 0.45 0.15 0.50 0.30 0.45
REC AFJ MAXFJ 1.3๐ โ 4 0.006 0.066 0.340 0.556 0.833 1.2๐ โ 4 0.011 0.101 0.529 0.765 0.957 1.7๐ โ 4 0.012 0.133 0.683 0.888 0.992
ADD AFJ MAXFJ 0.489 0.426 0.852 0.806 0.922 0.891 0.709 0.651 0.961 0.943 0.987 0.979 0.838 0.793 0.992 0.987 0.998 0.997
DOM AFJ MAXFJ 0.676 0.681 0.754 0.828 0.516 0.706 0.866 0.876 0.916 0.956 0.732 0.892 0.951 0.958 0.975 0.991 0.860 0.967
equilibrium holds in the general sample population, and then there are on average ๐ ร (1 โ MAF)2 , 2๐ ร MAF ร (1 โ MAF), and ๐ ร MAF2 individuals with genotype gg, Gg, and GG, respectively, where the minor allele frequency is set to be 0.15, 0.30 and 0.45. To make the power comparison more distinctly, we specify different genetic effect parameters ๐ฝ1 under three genetic models as follows: ๐ฝ1 = 0.5 for the recessive model, ๐ฝ1 = 0.3 for the additive model, and ๐ฝ1 = 0.4 for the dominant model. The power results are displayed in Tables 1 and 2 for ๐พ = 1 ร 10โ4 and ๐พ = 2 ร 10โ4 , respectively. They indicate that MAXFJ is more efficiency robust than AFJ across various inheritance models. As expected, AFJ is more powerful than MAXFJ under the additive model. However, MAFJ performs much more powerful than AFJ when the true genetic model is recessive. For instance, in Table 2, with ๐ = 0.4 and MAF = 0.3, the powers of AFJ and MAXFJ are 0.101 and 0.529, respectively. In summary, MAXFJ is substantially more powerful than AFJ in two-staged GWAS of quantitative traits, when the model for AFJ is misspecified. 3.2. An Illustration Example: Rheumatoid Arthritis. Rheumatoid arthritis (RA) is an autoimmune disease (resulting in a chronically systemic inflammatory disorder) which mainly attacks synovial joints. About 1% of the common adult population worldwide is affected by RA [16]. It has been
6
Computational and Mathematical Methods in Medicine 2.5
2.5
2.0
2.0
1.5 Density
Density
1.5
1.0
1.0
0.5
0.5
0.0
0.0 6
7
8
9
10
11
6
7
8 9 โlog10 P
โlog10 P
(a)
10
11
(b)
Figure 1: The histogram and density of โlog10 ๐ when ๐ = 0.3 (the left subgraph corresponds to MAXFJ while the right one for AFJ).
pointed out that the genetic variants might play a major role in RA susceptibility [17]. Genetic Analysis Workshop 16 (GAW16) based on the North American Rheumatoid Arthritis Consortium (NARAC) is a GWAS testing association with RA using about 5 ร 105 SNPs [18โ20]. It included 868 individuals who were RA positive (cases) and also had continuous trait anticyclic citrullinated peptide (anti-CCP) measures and 1194 controls sampled from the New York Cancer Project (NYCP) without RA which had no anti-CCP measures. Huizinga et al. [21] pointed out that a greater antiCCP would be linked to better prediction of increased risk developing RA. Chen et al. [22] showed that SNP rs2476601 located in PTPN22 had the most significant association with RA. Here, we only focus on SNP rs2476601 and apply two joint analysis methods (AFJ and MAXFJ) to evaluate its statistical significance. The minimum of anti-CCP among 868 cases was affected to each control, and a log transformation of anti-CCP was applied in the analysis. Then, we considered ๐ = 0.3, 0.4, 0.5 three simulation circumstances. For ๐ = 0.3, thirty percent of individuals were randomly sampled from all cases and controls and were used as the data from Stage 1, and the rest of individuals were treated as the data of Stage 2. The ๐-values of AFJ and MAXFJ were calculated, respectively. We repeated the above procedure 1,000 times and saved the corresponding ๐-values. A base-10 logarithm transformation and an opposite transformation were successively applied to these ๐-values, and the histogram and density of these transformed data were obtained (Figure 1). Similarly, we
conducted the simulation and calculation for ๐ = 0.4 and 0.5, and the corresponding histogram and density were presented in Figures 2 and 3. Examination of Figures 1โ3 showed that the ๐-values of MAXFJ are more stable than those of AFJ and the estimated density curves of MAXFJ are more closer to the symmetrical normal distribution while the estimated density curves of AFJ are rather skewed, which indicated that MAXFJ possesses more robust performance when the real genetic models are unknown.
4. Discussion We have developed a feasible two-stage design and the corresponding robust joint analysis approach for quantitative trait GWASs. The method is based on the ๐น-statistics over three different genetic models. The denominator of the used ๐นstatistic, which is constructed without assuming any genetic model, is different from the commonly used one. This adoption can reduce the computation intensity. Taking advantage of an ingenious design matrix, we successfully construct the common denominator of three ๐น-test statistics for the joint analysis with combined raw data from both stages. The statistical significance (๐-value) for the proposed joint analysis method can be calculated with the derived analytic expressions on the basis of the asymptotic distributions, which greatly reduce the complexity and computational intensity compared with the resampling-type permutation and bootstrap procedures. Our numerical results demonstrate
Computational and Mathematical Methods in Medicine
7
2.5
2.0
2.0
1.5
1.5
Density
Density
2.5
1.0
1.0
0.5
0.5
0.0
0.0 6
7
8 9 โlog10 P
10
6
11
7
(a)
8 9 โlog10 P
10
11
(b)
Figure 2: The histogram and density of โlog10 ๐ when ๐ = 0.4 (the left subgraph corresponds to MAXFJ while the right one for AFJ). 2.5
2.0
2.0
1.5
1.5
Density
Density
2.5
1.0
1.0
0.5
0.5
0.0
0.0 6
7
8 9 โlog10 P (a)
10
11
6
7
8
9
10
11
โlog10 P
(b)
Figure 3: The histogram and density of โlog10 ๐ when ๐ = 0.5 (the left subgraph corresponds to MAXFJ while the right one for AFJ).
8
Computational and Mathematical Methods in Medicine
that this novel approach has the greater efficiency robustness for genetic model uncertainty than the ๐น-statistic-based joint analysis which assumes the additive genetic model. In this work, we did not investigate the power of joint analysis based on other existing robust association methods for quantitative traits such as So and Shamโs method. We find that it is very difficult to extend So and Shamโs method (score test-based MAX3) to two-staged GWASs with quantitative outcomes, since it is almost impossible to derive the joint distribution of score tests from two stages. For simplicity, here we do not take into account the effects of covariates in the considered two-stage design. However, in real application, the proposed method can be easily applied to the situation including one or more covariates as shown by the original MAXF by Li et al. [10]. It is important to stress that we combine the raw data from two stages to construct the joint statistic, unlike the joint analysis for binary traits using the weighted sum of two statistics in Stages 1 and 2 [12]. Furthermore, one basic assumption in this paper is that the effect sizes of genetic variants between two stages are identical (i.e., no heterogeneity exists), which is the natural and reasonable precondition for the data fusion strategy. In addition, the population-based genetic association studies may be affected by the population stratification, and this needs future research to examine it.
Based on the design matrix W=(
๐น๐ฝ๐
=
(RSS0 โ RSS๐ฝ ) โ (RSS๐
โ RSS๐ฝ ) RSS0 โ RSS๐
= , RSS๐ฝ / (๐ โ 6) RSS๐ฝ / (๐ โ 6)
๐น๐ฝ๐ด =
RSS0 โ RSS๐ด (RSS0 โ RSS๐ฝ ) โ (RSS๐ด โ RSS๐ฝ ) = , RSS๐ฝ / (๐ โ 6) RSS๐ฝ / (๐ โ 6)
๐น๐ฝ๐ท =
RSS0 โ RSS๐ท (RSS0 โ RSS๐ฝ ) โ (RSS๐ท โ RSS๐ฝ ) = . RSS๐ฝ / (๐ โ 6) RSS๐ฝ / (๐ โ 6) (A.4)
On the one hand, according to X๓ธ X O W๓ธ W = ( 1 1 ๓ธ 3ร3 ) , O3ร3 X2 X2
A. The Derivation of the Asymptotic Properties of ๐น๐ฝ๐
, ๐น๐ฝ๐ด, ๐น๐ฝ๐ท
(W W)
๓ธ
where ๐ = (๐ฝ0 , ๐1 , ๐2 , ๐ฝ0โ , ๐1โ , ๐2โ ) and ๐ โผ ๐๐ (0๐ , ๐2 I๐ ). Denote 0 0 1 0
โ1 0 0 0
1 0 C๐ท = ( 0 0
0 1 0 1
0 0 1 โ1
โ1 0 0 0
0 โ1 0 0
0 0 ), โ1 0
1 0 C๐ด = ( 0 0
0 1 0 2
0 0 1 โ1
โ1 0 0 0
0 โ1 0 0
0 0 ), โ1 0
1 0 C0 = (0 0 0
0 1 0 1 0
0 0 1 0 1
โ1 0 0 0 0
0 โ1 0 0
0 โ1 0 0 0
โ1
๓ธ
(A.1)
0 1 0 1
(A.5)
it follows that
Consider the linear model for the combined raw data from Stage 1 and Stage 2 as follows:
1 0 C๐
= ( 0 0
(A.3)
for the expanded full model above, we can get the ordinary ฬ = (W๓ธ W)โ1 W๓ธ Y, and least square estimator of ๐ by ๐ the residual sum of square is given by RSS๐ฝ = Y๓ธ [I๐ โ W(W๓ธ W)โ1 W๓ธ ]Y. Furthermore, we denote the residual sum of squares under the following constraints: C๐
๐ = 04 , C๐ท๐ = 04 , C๐ด ๐ = 04 , and C0 ๐ = 05 , by RSS๐
, RSS๐ท, RSS๐ด, and RSS0 , respectively. After some algebras, we can obtain
Appendix
Y = W๐ + ๐,
X1 O๐1 ร3 ) O๐2 ร3 X2
๓ธ
โ1
โ1
=(
O3ร3
โ1 ) ,
(X2๓ธ X2 )
O3ร3 โ1
๓ธ
W(W W) W = (
0 0 ), โ1 0
(X1๓ธ X1 )
X1 (X1๓ธ X1 ) X1๓ธ O๐2 ร๐1
O๐1 ร๐2
โ1
X2 (X2๓ธ X2 ) X2๓ธ
). (A.6)
So, we can get that โ1
โ1
Y๓ธ [I๐ โ W(W๓ธ W) W๓ธ ] Y = Y๓ธ 1 [I๐1 โ X1 (X1๓ธ X1 ) X1๓ธ ] Y1 โ1
+ Y๓ธ 2 [I๐2 โ X2 (X2๓ธ X2 ) X2๓ธ ] Y2 . (A.7)
0 0 โ1) . 0 0 (A.2)
That is, RSS๐ฝ = RSS1 +RSS2 . Furthermore, RSS๐ฝ /(๐โ6) is also the unbiased estimator of the variance of the residual ๐2 based on the independence and unbiasedness of RSS1 and RSS2 . On the other hand, RSS0 โ RSS๐ฝ and RSS๐
โ RSS๐ฝ are both independent of RSS๐ฝ , so (๐๐ฝ๐
)2 = RSS0 โ RSS๐
= (RSS0 โ RSS๐ฝ )โ(RSS๐
โRSS๐ฝ ) is also independent of RSS๐ฝ , and (RSS0 โ 2 RSS๐
)/๐2 โผ ๐12 , RSS๐ฝ /๐2 โผ ๐๐โ6 . Consequently, we have ๐น๐ฝ๐
โผ ๐น1,๐โ6 . Similarly, we can get ๐น๐ฝ๐ด โผ ๐น1,๐โ6 and ๐น๐ฝ๐ท โผ ๐น1,๐โ6 .
Computational and Mathematical Methods in Medicine
B. The Detailed Calculation of Pr๐ป0 (๐น1MAX โค ๐ข1 ) and Pr๐ป0 (๐น๐ฝMAX โค ๐ข๐ฝ ) and Pr๐ป0 (๐น1MAX โค ๐ข1 , ๐น๐ฝMAX โค ๐ข๐ฝ ) Denote ๐ค0 = โ(๐10 + ๐11 )๐12 /(๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ) and ๐ค1 = โ๐10 (๐11 + ๐12 )/(๐10 (๐11 + 4๐12 ) + ๐11 ๐12 ). For a given ๐ > 0,
๓ตจ๓ตจ ๐๐ด ๓ตจ๓ตจ ๓ตจ๓ตจ ๐๐ท ๓ตจ๓ตจ ๓ตจ๓ตจ ๐๐
๓ตจ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ Pr๐ป0 (๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐, ๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐, ๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐) ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ = โฎ ๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง0 ๐๐ง1 ,
0
๐
๐๐ง0 โซ
โ๐ โ V13 ๐ง0 ] ๐ โ V13 ๐ง0 [ ) โ ฮฆ( )] [ฮฆ ( 2 2 โ1 โ V13 โ1 โ V13 ] [
ร ๐ (๐ง0 ) ๐๐ง0 , (๐โ๐ค0 ๐ง0 )/๐ค1
๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง1
โ๐
๐
(๐ โ ๐ค0 ๐ง0 ) /๐ค1 โ V13 ๐ง0 [ ) [ฮฆ ( 2 ๐(1โ๐ค1 )/๐ค0 โ1 โ V13 [ โฮฆ (
โ๐ โ V13 ๐ง0 โ1 โ
2 V13
] )] ๐ (๐ง0 ) ๐๐ง0 . ] (B.5)
Thus,
[ = 2 [โซ
๐
๐๐ง0 โซ
๐(1โ๐ค1 )/๐ค0
ฮฉ0
[
โ๐
๐(1โ๐ค1 )/๐ค0
=โซ
โฎ ๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง0 ๐๐ง1
๐๐ง0 โซ ๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง1
๐
๐
๐๐ง0 โซ ๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง1 โ๐
0
=โซ
ฮฉ0
+โซ
๐(1โ๐ค1 )/๐ค0
๐(1โ๐ค1 )/๐ค0
โฎ ๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง0 ๐๐ง1
0
โซ
โซ
where ฮฉ0 = {(๐ง0 , ๐ง1 ) : |๐ง0 | โค ๐, |๐ค0 ๐ง0 + ๐ค1 ๐ง1 | โค ๐, |๐ง1 | โค ๐} and ๐(๐ง0 , ๐ง1 ; ฮฃ0 ) is the bivariate normal density function ๓ธ for (๐1๐
/๐, ๐1๐ท/๐) with mean 02 and variance-covariance matrix ฮฃ0 = ( V113 V113 ). Taking advantage of the symmetry of bivariate normal distribution, the above twofold integration can be only calculated at the right half space of ฮฉ0 and then multiplied by 2, which is
= 2 [โซ
Then, it follows that
(B.1)
ฮฉ0
๐(1โ๐ค1 )/๐ค0
9
(๐โ๐ค0 ๐ง0 )/๐ค1
โ๐
๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) ๐๐ง1 ] .
๐(1โ๐ค1 )/๐ค0
0
+โซ
ฮฆ(
๐
๐(1โ๐ค1 )/๐ค0
(B.2)
2 โ1 โ V13
ฮฆ(
) ๐ (๐ง0 ) ๐๐ง0
(๐ โ ๐ค0 ๐ง0 ) /๐ค2 โ V13 ๐ง0 2 โ1 โ V13
)
ร ๐ (๐ง0 ) ๐๐ง0 ๐
Based on the property of conditional distributions of the multivariate normal distribution, we have
๐ โ V13 ๐ง0
โโซ ฮฆ( 0
โ๐ โ V13 ๐ง0 โ1 โ
2 V13
] ) ๐ (๐ง0 ) ๐๐ง0 ] . ] (B.6)
๐ (๐ง0 , ๐ง1 ; ฮฃ0 ) = ๐ (๐ง0 ) ๐ (๐ง1 | ๐ง0 ; V13 ) ,
(B.3)
where ๐(๐ง0 ) is the probability density function of ๐(0, 1) and ๐(๐ง1 | ๐ง0 ; V13 ) is the density function of the conditional 2 ). That is, normal distribution ๐(V13 ๐ง0 , 1 โ V13
๐ (๐ง1 | ๐ง0 ; V13 ) =
1 โ1 โ
2 V13
๐(
๐ง1 โ V13 ๐ง0 2 โ1 โ V13
).
(B.4)
Denote ๐ค0โ = โ(๐0 + ๐1 )๐2 /(๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ) and ๐ค1โ = โ๐0 (๐1 + ๐2 )/(๐0 (๐1 + 4๐2 ) + ๐1 ๐2 ). For any given ๐1 , ๐2 > 0, ๓ตจ๓ตจ ๐๐
๓ตจ๓ตจ ๓ตจ๓ตจ ๐๐ด ๓ตจ๓ตจ ๓ตจ๓ตจ ๐๐ท ๓ตจ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ ๓ตจ Pr๐ป0 (๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ๓ตจ๓ตจ 1 ๓ตจ๓ตจ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐
๓ตจ๓ตจ ๓ตจ๓ตจ ๐ด ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ท ๓ตจ๓ตจ ๓ตจ๓ตจ ๐๐ฝ ๓ตจ๓ตจ ๓ตจ๐ ๓ตจ ๓ตจ๐ ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ โค ๐2 , ๓ตจ๓ตจ๓ตจ ๐ฝ ๓ตจ๓ตจ๓ตจ โค ๐2 , ๓ตจ๓ตจ๓ตจ ๐ฝ ๓ตจ๓ตจ๓ตจ โค ๐2 ) ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๐ ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ๓ตจ = โฎ ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง0 ๐๐ง1 ๐๐ง2 ๐๐ง3 , ฮฉ1
(B.7)
10
Computational and Mathematical Methods in Medicine 1 V13 ๐11 ๐13
where ๓ตจ ๓ตจ ๓ตจ ๓ตจ ฮฉ1 = {(๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ) : ๓ตจ๓ตจ๓ตจ๐ง0 ๓ตจ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ๓ตจ๐ค0 ๐ง0 + ๐ค1 ๐ง1 ๓ตจ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ ๓ตจ๓ตจ ๓ตจ ๓ตจ ๓ตจ โ ๓ตจ ๓ตจ โ ๓ตจ ๓ตจ๓ตจ๐ง1 ๓ตจ๓ตจ โค ๐1 , ๓ตจ๓ตจ๓ตจ๐ง2 ๓ตจ๓ตจ๓ตจ โค ๐2 , ๓ตจ๓ตจ๓ตจ๐ค0 ๐ง2 + ๐ค1 ๐ง3 ๓ตจ๓ตจ๓ตจ โค ๐2 , ๓ตจ๓ตจ๓ตจ๐ง3 ๓ตจ๓ตจ๓ตจ โค ๐2 } (B.8)
ฮฃ1 = (
๐1 (1โ๐ค1 )/๐ค0
โ๐1 (1โ๐ค1 )/๐ค0
๐ฟ2 = โซ
๐1 (1โ๐ค1 )/๐ค0
โ๐1 (1โ๐ค1 )/๐ค0
๐ฟ3 = โซ
๐1 (1โ๐ค1 )/๐ค0
โ๐1 (1โ๐ค1 )/๐ค0
๐ฟ4 = โซ
๐1
๐ฟ5 = โซ
โ๐2 (1โ๐ค1โ )/๐ค0โ
โ๐1
โ๐2
๐1
๐2
โ๐1
๐2 (1โ๐ค1โ )/๐ค0โ
๐๐ง0 โซ
๐๐ง0 โซ
โ๐1 (1โ๐ค1 )/๐ค0
โ๐1
โ๐1 (1โ๐ค1 )/๐ค0
โ๐1
โ๐1
โ๐2 (1โ๐ค1โ )/๐ค0โ
๐1
๐1
โ(๐1 +๐ค0 ๐ง0 )/๐ค1
๐๐ง0 โซ
(๐2 โ๐ค0โ ๐ง2 )/๐ค1โ
โ๐2
๐1
๐2
โ(๐1 +๐ค0 ๐ง0 )/๐ค1
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
๐๐ง2 โซ ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 , โ๐2
๐2
๐๐ง2 โซ
๐๐ง2 โซ
(๐2 โ๐ค0โ ๐ง2 )/๐ค1โ
โ๐2
๐2 (1โ๐ค1โ )/๐ค0โ
โ๐2 (1โ๐ค1โ )/๐ค0โ
โ๐2 (1โ๐ค1โ )/๐ค0โ
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
๐2
๐๐ง2 โซ ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 , โ๐2
๐๐ง2 โซ
๐2
โ(๐2 +๐ค0โ ๐ง2 )/๐ค1โ
โ๐2
๐๐ง1 โซ
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
โ(๐2 +๐ค0โ ๐ง2 )/๐ค1โ
๐๐ง1 โซ
๐๐ง1 โซ
1
๐2
๐๐ง1 โซ
โ(๐1 +๐ค0 ๐ง0 )/๐ค1
โ๐1 (1โ๐ค1 )/๐ค0
๐2
๐2 (1โ๐ค1โ )/๐ค0โ
๐2 (1โ๐ค1โ )/๐ค0โ
๐๐ง0 โซ
๐๐ง0 โซ
๐๐ง2 โซ
โ๐2
โ๐1
(B.9)
โ๐2
โ(๐2 +๐ค0โ ๐ง2 )/๐ค1โ
๐๐ง1 โซ
).
๐2
๐๐ง2 โซ
๐๐ง1 โซ
(๐1 โ๐ค0 ๐ง0 )/๐ค1
โ 1 V13
๐๐ง2 โซ ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
โ๐2 (1โ๐ค1โ )/๐ค0โ
โ๐1
โ๐1
๐1
๐ฟ7 = โซ
๐ฟ9 = โซ
โ๐2 (1โ๐ค1โ )/๐ค0โ
๐1
๐๐ง0 โซ
๐1 (1โ๐ค1 )/๐ค0
๐ฟ8 = โซ
โ๐1
(๐1 โ๐ค0 ๐ง0 )/๐ค1
๐1 (1โ๐ค1 )/๐ค0
๐ฟ6 = โซ
๐2 (1โ๐ค1โ )/๐ค0โ
(๐1 โ๐ค0 ๐ง0 )/๐ค1
๐1 (1โ๐ค1 )/๐ค0
๐1
๐1
๐๐ง0 โซ ๐๐ง1 โซ
๐11 ๐31
Note that โฎฮฉ ๐(๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 )๐๐ง0 ๐๐ง1 ๐๐ง2 ๐๐ง3 is the sum 1 of nine integrations as follows:
๐๐ง0 โซ ๐๐ง1 โซ
๐๐ง0 โซ ๐๐ง1 โซ
1 ๐31 ๐33
โ ๐13 ๐33 V13
and ๐(๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) is the multivariate normal density ๓ธ function for (๐1๐
/๐, ๐1๐ท/๐, ๐๐ฝ๐
/๐, ๐๐ฝ๐ท/๐) with mean 04 and variance-covariance matrix
๐ฟ1 = โซ
V13
๐2
๐2 (1โ๐ค1โ )/๐ค0โ
๐๐ง2 โซ
(๐2 โ๐ค0โ ๐ง2 )/๐ค1โ
โ๐2
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 ,
๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง3 . (B.10)
Moreover, we can obtain that ๐ฟ 2 = ๐ฟ 3 , ๐ฟ 4 = ๐ฟ 7 , ๐ฟ 5 = ๐ฟ 9 , and ๐ฟ 6 = ๐ฟ 8 based on the symmetry of the integration domain for (๐ง0 , ๐ง1 ) and (๐ง2 , ๐ง3 ), respectively. We have ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) = ๐ (๐ง0 ) ๐ (๐ง1 | ๐ง0 ; V13 ) ร ๐ (๐ง2 | ๐ง0 , ๐ง1 ; V13 , ๐11 , ๐31 )
๐ (๐ง2 | ๐ง0 , ๐ง1 ; V13 , ๐11 , ๐31 ) =
โ1 โ
2 ((๐11
1
2 ) / (1 โ V2 )) โ 2๐11 ๐31 V13 + ๐31 13
ร ๐ [ ((๐ง2 โ [๐ง0 (๐11 โ V13 ๐31 ) (B.11)
โ ร ๐ (๐ง3 | ๐ง0 , ๐ง1 , ๐ง2 ; V13 , V13 , ๐11 , ๐13 , ๐31 , ๐33 ) ,
where ๐(๐ง2 | ๐ง0 , ๐ง1 ; V13 , ๐11 , ๐31 ) is the conditional normal density function as
2 +๐ง1 (๐31 โ ๐11 V13 )]) / (1 โ V13 )) โ1
2 + 2๐ ๐ V + ๐2 ) / (1 โ V2 ))) ] , ร(โ1 โ ((๐11 11 31 13 31 13
(B.12)
Computational and Mathematical Methods in Medicine
11
โ and ๐(๐ง3 | ๐ง0 , ๐ง1 , ๐ง2 ; V13 , V13 , ๐11 , ๐13 , ๐31 , ๐33 ) is the conditional normal density function given by
ร ๐(
โ ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ๐ง3 โ (๐13 , ๐33 , V13
๓ธ
โ ) ฮโ1 (๐ , ๐ , Vโ )๓ธ โ1 โ (๐13 , ๐33 , V13 13 33 13
(B.13)
โ , ๐11 , ๐13 , ๐31 , ๐33 ) ๐ (๐ง3 | ๐ง0 , ๐ง1 , ๐ง2 ; V13 , V13
=
)
with ฮ the submatrix of ฮฃ1 formed by first three rows and three columns. โ โ ๓ธ )ฮโ1 (๐13 , ๐33 , V13 ) . Then, we have Denote (๐โ )2 = (๐13 , ๐33 , V13
1 โ ) ฮโ1 (๐ , ๐ , Vโ )๓ธ โ1 โ (๐13 , ๐33 , V13 13 33 13
๐ฟ1 ๐1 (1โ๐ค1 )/๐ค0
=โซ
โ๐1 (1โ๐ค1 )/๐ค0
๐1
1
โ๐1
2 โ1 โ V13
๐ (๐ง0 ) ๐๐ง0 โซ
๐(
๐ง1 โ V13 ๐ง0 2 โ1 โ V13
๐2 (1โ๐ค1โ )/๐ค0โ
โ1
) ๐๐ง1 โซ
โ๐2 (1โ๐ค1โ )/๐ค0โ
๓ธ โ โ1 [ฮฆ ((๐2 โ (๐13 , ๐33 , V13 ) ฮ (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) )
โ1
๓ธ
โ โ ฮฆ ((โ๐2 โ (๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) )] ๐๐ง2 ,
๐ฟ2 ๐1 (1โ๐ค1 )/๐ค0
๐1
=โซ
โ๐1 (1โ๐ค1 )/๐ค0
๐ (๐ง0 ) ๐๐ง0 โซ
โ๐1
โ
โ
โ1 โ๐2 (1โ๐ค1 )/๐ค0 ๐ง โ V13 ๐ง0 ๓ธ โ ๐( 1 )๐๐ง1 โซ [ฮฆ ((๐2 โ(๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) ) 2 2 โ๐2 โ1 โ V13 โ1 โ V13
1
โฮฆ ((โ
โ1 ๐2 + ๐ค0โ ๐ง2 ๓ธ โ โ(๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) ))] ๐๐ง2 , ๐ค1โ
๐ฟ4 ๐1
=โซ
๐1 (1โ๐ค1 )/๐ค0
๐ (๐ง0 ) ๐๐ง0 โซ
(๐1 โ๐ค0 ๐ง0 )/๐ค1
โ๐1
โ
โ
โ1 ๐2 (1โ๐ค1 )/๐ค0 ๐ง โ V13 ๐ง0 ๓ธ โ โ1 ๐ (1 )๐๐ง1 โซ [ฮฆ ((๐2 โ (๐13 , ๐33 , V13 ) ฮ (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) ) โ โ 2 2 โ๐2 (1โ๐ค1 )/๐ค0 โ1 โ V13 โ1 โ V13
1
โ1
๓ธ
โ โ ฮฆ ((โ๐2 โ(๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) )] ๐๐ง2 ,
๐ฟ5 ๐1
=โซ
๐1 (1โ๐ค1 )/๐ค0
๐ (๐ง0 ) ๐๐ง0 โซ
(๐1 โ๐ค0 ๐ง0 )/๐ค1
โ๐1
โ
โ
โ1 โ๐2 (1โ๐ค1 )/๐ค0 ๐ง โ V13 ๐ง0 ๓ธ โ โ1 ๐( 1 ) ๐๐ง1 โซ [ฮฆ ((๐2 โ(๐13 , ๐33 , V13 ) ฮ (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) ) 2 2 โ๐2 โ1 โ V13 โ1 โ V13
1
โ ฮฆ ((โ
โ1 ๐2 + ๐ค0โ ๐ง2 ๓ธ โ โ (๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1 โ (๐โ )2 ) )] ๐๐ง2 , โ ๐ค1
๐ฟ6 ๐1
(๐1 โ๐ค0 ๐ง0 )/๐ค1
=โซ
๐1 (1โ๐ค1 )/๐ค0
๐ (๐ง0 ) ๐๐ง0 โซ
โ๐1
โ
โ1 ๐2 ๐2 โ ๐ค0 ๐ง2 ๐ง โV ๐ง ๓ธ โ โ1 ๐ ( 1 13 0) ๐๐ง1 โซ [ฮฆ (( โ(๐13 , ๐33 , V13 ) ฮ (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1โ(๐โ )2 ) ) โ ๐ค1 2 2 ๐2 (1โ๐ค1โ )/๐ค0โ โ1 โ V13 โ1โV13
1
โ1
๓ธ โ โฮฆ ((โ๐2 โ (๐13 , ๐33 , V13 ) ฮโ1 (๐ง0 , ๐ง1 , ๐ง2 ) ) (โ1โ(๐โ )2 ) )) ] ๐๐ง2 .
(B.14)
Acknowledgments
Finally, โฎ ๐ (๐ง0 , ๐ง1 , ๐ง2 , ๐ง3 ; ฮฃ1 ) ๐๐ง0 ๐๐ง1 ๐๐ง2 ๐๐ง3 ฮฉ1
= ๐ฟ 1 + 2 (๐ฟ 2 + ๐ฟ 4 + ๐ฟ 5 + ๐ฟ 6 ) .
(B.15)
The work was partially supported by the National Natural Science Foundation of China (no. 11225103; 11161054; 11171293; 81072386); the Key Fund of Yunnan Province (no. 2010CC003); The Youth Program of Applied Basic Research
12 Programs of Yunnan Province (2013FD001); the Foundation of Yunnan University (no. 2012CG018).
References [1] B. Freidlin, G. Zheng, Z. Li, and J. L. Gastwirth, โTrend tests for case-control studies of genetic markers: power, sample size and robustness,โ Human Heredity, vol. 53, no. 3, pp. 146โ152, 2002. [2] J. D. Storey and R. Tibshirani, โStatistical significance for genomewide studies,โ Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 16, pp. 9440โ9445, 2003. [3] K. Song and R. C. Elston, โA powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case-control studies,โ Statistics in Medicine, vol. 25, no. 1, pp. 105โ126, 2006. [4] G. Zheng and J. L. Gastwirth, โOn estimation of the variance in Cochran-Armitage trend tests for genetic association using case-control studies,โ Statistics in Medicine, vol. 25, no. 18, pp. 3150โ3159, 2006. [5] Q. Li, K. Yu, Z. Li, and G. Zheng, โMAX-rank: a simple and robust genome-wide scan for case-control association studies,โ Human Genetics, vol. 123, no. 6, pp. 617โ623, 2008. [6] Q. Li, G. Zheng, X. Liang, and K. Yu, โRobust tests for singlemarker analysis in case-control genetic association studies,โ Annals of Human Genetics, vol. 73, no. 2, pp. 245โ252, 2009. [7] Y. Zang and W. K. Fung, โRobust tests for matched case-control genetic association studies,โ BMC Genetics, vol. 11, article 91, 2010. [8] Y. Zang, W. K. Fung, and G. Zheng, โSimple algorithms to calculate asymptotic null distributions of robust tests in casecontrol genetic association studies in R,โ Journal of Statistical Software, vol. 33, no. 8, pp. 1โ24, 2010. [9] H. C. So and P. C. Sham, โRobust association tests under different genetic models, allowing for binary or quantitative traits and covariates,โ Behavior Genetics, vol. 41, no. 5, pp. 768โ 775, 2011. [10] Q. Li, W. Xiong, J. B. Chen et al., โA robust test for quantitative trait analysis with model uncertainty in genetic association studies,โ Statistics and Its Interface. In press. [11] J. M. Satagopan, E. S. Venkatraman, and C. B. Begg, โTwo-stage designs for gene-disease association studies with sample size constraints,โ Biometrics, vol. 60, no. 3, pp. 589โ597, 2004. [12] A. D. Skol, L. J. Scott, G. R. Abecasis, and M. Boehnke, โJoint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies,โ Nature Genetics, vol. 38, no. 2, pp. 209โ213, 2006. [13] K. Yu, N. Chatterjee, W. Wheeler et al., โFlexible design for following up positive findings,โ American Journal of Human Genetics, vol. 81, no. 3, pp. 540โ551, 2007. [14] R. Sladek, G. Rocheleau, J. Rung et al., โA genome-wide association study identifies novel risk loci for type 2 diabetes,โ Nature, vol. 445, no. 7130, pp. 881โ885, 2007. [15] D. Pan, Q. Li, N. Jiang, A. Liu, and K. Yu, โRobust joint analysis allowing for model uncertainty in two-stage genetic association studies,โ BMC Bioinformatics, vol. 12, article 9, 2011. [16] A. J. Silman and J. E. Pearson, โEpidemiology and genetics of rheumatoid arthritis,โ Arthritis Research & Therapy, vol. 4, supplement 3, pp. S265โS272, 2002.
Computational and Mathematical Methods in Medicine [17] A. J. MacGregor, H. Snieder, A. S. Rigby et al., โCharacterizing the quantitative genetic contribution to rheumatoid arthritis using data from twins,โ Arthritis & Rheumatism, vol. 43, no. 1, pp. 30โ37, 2000. [18] C. I. Amos, W. V. Chen, M. F. Seldin et al., โData for Genetic Analysis Workshop 16 Problem 1, association analysis of rheumatoid arthritis data,โ BMC Proceedings, vol. 3, supplement 7, article S2, 2009. [19] F. Xia, J. Y. Zhou, and W. K. Fung, โA powerful approach for association analysis incorporating imprinting effects,โ Bioinformatics, vol. 27, no. 18, pp. 2571โ2577, 2011. [20] G. Zheng, C. O. Wu, M. Kwak, W. Jiang, J. Joo, and J. A. C. Lima, โJoint analysis of binary and quantitative traits with data sharing and outcome-dependent sampling,โ Genetic Epidemiology, vol. 36, no. 3, pp. 263โ273, 2012. [21] T. W. J. Huizinga, C. I. Amos, A. H. M. van der Helm-Van Mil et al., โRefining the complex rheumatoid arthritis phenotype based on specificity of the HLA-DRB1 shared epitope for antibodies to citrullinated proteins,โ Arthritis & Rheumatism, vol. 52, no. 11, pp. 3433โ3438, 2005. [22] L. Chen, M. Zhong, W. V. Chen, C. I. Amos, and R. Fan, โA genome-wide association scan for rheumatoid arthritis data by Hotellings ๐2 tests,โ BMC Proceedings, vol. 3, supplement 7, article S6, 2009.
MEDIATORS of
INFLAMMATION
The Scientific World Journal Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Gastroenterology Research and Practice Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Journal of
Hindawi Publishing Corporation http://www.hindawi.com
Diabetes Research Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
International Journal of
Journal of
Endocrinology
Immunology Research Hindawi Publishing Corporation http://www.hindawi.com
Disease Markers
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Volume 2014
Submit your manuscripts at http://www.hindawi.com BioMed Research International
PPAR Research Hindawi Publishing Corporation http://www.hindawi.com
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Volume 2014
Journal of
Obesity
Journal of
Ophthalmology Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Evidence-Based Complementary and Alternative Medicine
Stem Cells International Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Journal of
Oncology Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Parkinsonโs Disease
Computational and Mathematical Methods in Medicine Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
AIDS
Behavioural Neurology Hindawi Publishing Corporation http://www.hindawi.com
Research and Treatment Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014
Oxidative Medicine and Cellular Longevity Hindawi Publishing Corporation http://www.hindawi.com
Volume 2014