Psychophysiology, 51 (2014), 1335–1336. Wiley Periodicals, Inc. Printed in the USA. Copyright © 2014 Society for Psychophysiological Research. DOI: 10.1111/psyp.12357

COMMENTARY

Are we doing enough to extract genomic information from our data?

GUNTER SCHUMANN
MRC-SGDP Centre, Institute of Psychiatry, King’s College, London, London, UK

Abstract

Genome-wide studies have been successful in identifying stable associations of single genes, albeit at the price of a high false negative rate. The promise of endophenotypes to increase power of genome-wide association studies has only been partially fulfilled. To optimize the investigation of genetic influences on behavioral (endo-)phenotypes, the development of novel phenotypical characterizations and methods to describe the relation between genotype and phenotype are needed. This will require the development of innovative analytical strategies, as well as corroborative approaches linking association studies with functional characterizations. The sole reliance on canonical genome-wide significance thresholds is not sufficient to describe the complex relation of genotype and phenotype.

Descriptors: GWAS, Endophenotype, Corroborative, Multivariate, Significance threshold

Address correspondence to: Prof. Gunter Schumann, MD, Chair in Biological Psychiatry, MRC-SGDP Centre, Institute of Psychiatry, King’s College, London, 16 De Crespigny Park, London SE5 8AF, UK. E-mail: [email protected]

The authors have carried out an impressive amount of work to thoroughly assess genome-wide associations of relevant endophenotypes in the Minnesota Twin Family Study. Their effort has been phenomenal; the ensuing results are provoking. Provoking because of the scarcity of genome-wide significant results. Is there really so little genetic information in our data? From an orthodox genetic viewpoint, the relative paucity of genome-wide significant results is to be expected, as the sample size is moderate for these kinds of studies, which usually derive their power from meta-analyses of tens of thousands of individuals. In this sense, an important value of the endeavor lies in the fact that a data resource has been created that will allow researchers to ask more targeted questions, as well as contribute to meta-analyses. In the case of the sequencing paper, there is obviously value in using the data for purposes of imputation and normative calibration.

Nevertheless, one of the promises of the endophenotypes was that they “represent simpler clues to genetic underpinnings than the disease syndrome itself, which can result in more straightforward and successful genetic analysis” (Gottesman & Gould, 2003). While intuitively appealing, the endophenotype concept rests on hierarchical and rather monocausal and directional assumptions, which might not reflect the more integrated nature of brain and behavioral processes that we have discovered since this concept was developed in the late 1960s and early 1970s. Since then, we have learned a lot about cross-talk between systems, their feedback loops, and interdependencies, which now tend to be conceptualized more in terms of networks (see Whelan et al., 2012), as opposed to the vector model applied in the endophenotype hypothesis. This conceptual difference might underlie the fact that the observed increase in the power to detect genotype-phenotype associations using endophenotypes has remained more modest than many of us had expected. While this does not rule out the presence of some genetically highly informative endophenotypes, it might be interesting to direct our thoughts to more cybernetic phenotypic models, which reflect some of the principles of network interactions.

This approach might be supported by a more targeted genome-wide analysis of genetic effects than is achievable by the current mass univariate approach. Among the many possibilities to describe multivariate features (see Nymberg, Jia, Ruggeri, & Schumann, 2013) and their relation to genomics, one attractive option might be an investigation aimed at identifying patterns of genomic variation and of (high-dimensional) phenotypes, such as brain structure/function, that have a strong linear relationship with each other. This can be achieved using multivariate methods including (regularized) canonical correlation analysis (see Chi et al., 2013). It would be expected that such an approach optimizes the fit between genomics and a behavioral or brain-related phenotype, thus maximizing the variance explained by genetic influences.

However, even in the framework of “traditional” genome-wide regression analyses, there might be room to reassess the expectations and role of genetic analyses in describing biological processes that underlie psychopathology. In response to a plethora of candidate gene-based studies, which proved difficult to replicate, the genetic community has settled on a stringent but increasingly arbitrary threshold of p = 5 × 10⁻⁸ to define “genome-wide significance.” This threshold was based on Bonferroni correction for the number of single nucleotide polymorphisms (SNPs) captured in the early GWAS chips. While the establishment of a genome-wide significance threshold was a necessary quality control measure, it is now becoming evident that the few true positive findings generated using the p = 5 × 10⁻⁸ cut-off come at a high price of numerous false negative results (see Stacey et al., 2012). Contributing to this problem is the fact that Bonferroni correction assumes independence among the tests considered. However, given that SNPs in linkage disequilibrium (LD) are nonindependent markers, Bonferroni correction may be overly conservative. There are now several methods available that take into account the LD structure (relatedness) of GWAS SNPs (see Li, Yeung, Cherny, & Sham, 2012), an approach that is theoretically more accurate than the Bonferroni assumption of unrelated events.

It is rare that a phenotype related to a frequent disorder can be comprehensively described on the basis of genetic analyses alone. This is attested, among other examples, by the limited phenotypic variance explained by genetic variations and by the issue of “missing heritability,” which might also arise from complex interactions between gene products, and with the environment, which are not captured by linear genetic analyses. This points to the need for a corroborative approach where the role of a gene, or a network of genes, is tested not only on the basis of its association with the phenotype, but also using functional in vitro and in vivo studies, which on balance inform about the plausibility and nature of the relation of a gene to the phenotype (see Schumann et al., 2011). In such a model, exclusive reliance on a significance threshold that accepts the generation of a large number of false negative results might not be imperative, and can in fact be misleading. A discussion among those members of the scientific community who are interested in a systemic integration and clinical application of their research findings might be helpful in identifying standards that balance the necessary rigor of genetic analyses with the weight of corroborative data to comprehensively describe the biological underpinnings of an (endo-)phenotype.
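
The threshold argument made above can be illustrated with a short sketch. It rests on assumptions that are not taken from the commentary: a family-wise alpha of 0.05, roughly one million tests behind the conventional threshold, a toy LD correlation matrix, and a simple eigenvalue-based estimate of the effective number of independent tests in which every eigenvalue above 1 is shrunk back to 1. This estimator stands in for the family of LD-aware methods and is not claimed to reproduce the procedure of Li et al. (2012); the function names are hypothetical.

```python
import numpy as np


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def effective_number_of_tests(ld_corr):
    """Illustrative eigenvalue-based estimate of the number of independent tests.

    ld_corr is a symmetric SNP-by-SNP correlation (LD) matrix.
    Assumption: M_eff = M - sum(max(eigenvalue - 1, 0)), so independent SNPs
    give M_eff = M and perfectly correlated SNPs give M_eff = 1.
    """
    corr = np.asarray(ld_corr, dtype=float)
    eigvals = np.linalg.eigvalsh(corr)           # eigenvalues of a symmetric matrix
    excess = np.clip(eigvals - 1.0, 0.0, None)   # the part of each eigenvalue above 1
    return corr.shape[0] - excess.sum()


# Conventional threshold: 0.05 spread over an assumed one million tests.
print(bonferroni_threshold(1_000_000))           # approximately 5e-8

# Toy example: four SNPs forming two LD blocks with r = 0.8 inside each block.
toy_ld = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
])
m_eff = effective_number_of_tests(toy_ld)        # about 2.4 instead of 4
print(m_eff, bonferroni_threshold(m_eff), bonferroni_threshold(toy_ld.shape[0]))
```

In the toy matrix the effective number of tests falls below the nominal four, so the LD-aware threshold is less stringent than the plain Bonferroni one. This is the sense in which a correction that treats correlated SNPs as unrelated events is overly conservative.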

References

Chi, E. C., Allen, G. I., Zhou, H., Kohannim, O., Lange, K., & Thompson, P. M. (2013). Imaging genetics via sparse canonical correlation analysis. Paper presented at the 2013 IEEE 10th International Symposium on Biomedical Imaging (ISBI).

Gottesman, I. I., & Gould, T. D. (2003). The endophenotype concept in psychiatry: Etymology and strategic intentions. American Journal of Psychiatry, 160, 636–645.

Li, M.-X., Yeung, J. M. Y., Cherny, S. S., & Sham, P. C. (2012). Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Human Genetics, 131, 747–756.

Nymberg, C., Jia, T., Ruggeri, B., & Schumann, G. (2013). Analytical strategies for large imaging genetic datasets: Experiences from the IMAGEN study. Annals of the New York Academy of Sciences, 1282, 92–106.

Schumann, G., Coin, L. J., Lourdusamy, A., Charoen, P., Berger, K. H., Stacey, D., . . . Elliott, P. (2011). Genome-wide association and genetic functional studies identify autism susceptibility candidate 2 gene (AUTS2) in the regulation of alcohol consumption. Proceedings of the National Academy of Sciences, 108, 7119–7124.

Stacey, D., Bilbao, A., Maroteaux, M., Jia, T., Easton, A. C., Longueville, S., . . . Schumann, G. (2012). RASGRF2 regulates alcohol-induced reinforcement by influencing mesolimbic dopamine neuron activity and dopamine release. Proceedings of the National Academy of Sciences, 109, 21128–21133.

Whelan, R., Conrod, P. J., Poline, J.-B., Lourdusamy, A., Banaschewski, T., Barker, G. J., . . . Garavan, H. (2012). Adolescent impulsivity phenotypes characterized by distinct brain networks. Nature Neuroscience, 15, 920–925.
