CHAPTER FIVE

Genetics of Alzheimer’s Disease

Vincent Chouraki, Sudha Seshadri¹
Department of Neurology, Boston University School of Medicine, Boston, MA, USA; Framingham Heart Study, Framingham, MA, USA
¹Corresponding author: e-mail address: [email protected]

Contents
1. Introduction
2. Heritability of Alzheimer’s Disease
3. Genetic Causes of Early-Onset Alzheimer’s Disease
4. Genetic Risk Factors for Late-Onset Alzheimer’s Disease
   4.1 Frequent Variants
      4.1.1 Apolipoprotein E
      4.1.2 Candidate Gene Approaches
      4.1.3 Genome-Wide Association Study Signals
   4.2 Rare Variants
   4.3 Structural Variants
   4.4 Gene–Environment Interactions
5. Discussion
   5.1 Insight into Pathophysiology of AD
   5.2 Utility of Genetics for Risk Prediction
   5.3 New Drug Targets
   5.4 Conclusion
References

Advances in Genetics, Volume 87
ISSN 0065-2660
http://dx.doi.org/10.1016/B978-0-12-800149-3.00005-6
Copyright © 2014 Elsevier Inc. All rights reserved.

Abstract

Alzheimer’s disease (AD) represents the main form of dementia, and is a major public health problem. Despite intensive research efforts, current treatments have only marginal symptomatic benefits and there are no effective disease-modifying or preventive interventions. AD has a strong genetic component, so much research in AD has focused on identifying genetic causes and risk factors. This chapter will cover genetic discoveries in AD and their consequences in terms of improved knowledge regarding the disease and the identification of biomarkers and drug targets. First, we will discuss the study of the rare early-onset, autosomal dominant forms of AD that led to the discovery of mutations in three major genes, APP, PSEN1, and PSEN2. These discoveries have shaped our current understanding of the pathophysiology and natural history of AD as well as the development of therapeutic targets and the design of clinical trials.

Then, we will explore linkage analysis and candidate gene approaches, which identified variants in Apolipoprotein E (APOE) as the major genetic risk factor for late-onset, “sporadic” forms of AD (LOAD), but failed to robustly identify other genetic risk factors, with the exception of variants in SORL1. The main focus of this chapter will be on recent genome-wide association studies that have successfully identified common genetic variations at over 20 loci associated with LOAD outside of the APOE locus. These loci are in or near novel AD genes including BIN1, CR1, CLU, phosphatidylinositol-binding clathrin assembly protein (PICALM), CD33, EPHA1, MS4A4/MS4A6, ABCA7, CD2AP, SORL1, HLA-DRB5/DRB1, PTK2B, SLC24A4-RIN3, INPP5D, MEF2C, NME8, ZCWPW1, CELF1, FERMT2, CASS4, and TRIP4, and each has a small effect on the risk of AD (relative risks of 1.1–1.3). Finally, we will touch upon the ongoing effort to identify less frequent and rare variants through whole exome and whole genome sequencing. This effort has identified two novel genes, TREM2 and PLD3, and shown a role for APP in LOAD. The identification of these genes has implicated previously unsuspected biological pathways in the pathophysiology of AD.

1. INTRODUCTION

At the start of the twentieth century, less than 5% of the population was over 60 years of age, but this proportion has risen steadily, and by 2050 the population aged 60 years or more is expected to reach 2 billion worldwide (United Nations, 2010). As a consequence, the number of persons suffering from one or more age-related neurological diseases will increase dramatically. One of these neurological conditions, dementia, is defined in the International Classification of Diseases as “a syndrome due to disease of the brain, usually of a chronic or progressive nature, in which there is disturbance of multiple higher cortical functions, including memory, thinking, orientation, comprehension, calculation, learning capacity, language, and judgment. Consciousness is not clouded. The impairments of cognitive function are commonly accompanied, and occasionally preceded, by deterioration in emotional control, social behavior, or motivation. This syndrome occurs in a large number of conditions primarily or secondarily affecting the brain”.

In 2010, 35.6 million persons were living with dementia worldwide. The prevalence and incidence of dementia after 60 years of age increase exponentially, doubling in each successive 5-year age category. Based on these numbers, it has been projected that the number of persons with dementia will reach 65.7 million in 2030 and 115.4 million in 2050 (Prince, Jackson, & Alzheimer’s Disease International, 2009). As such, dementia represents a major public health problem, and research exploring new predictive, preventive, and curative avenues is urgently needed.


The neurodegenerative condition, Alzheimer’s disease (AD), is the leading cause of dementia and is thought to explain 50–70% of all dementia. Clinically, it is characterized by an insidious onset, progressive deterioration of cognition, prominently affecting the domain of episodic memory (memory for specific details of recent events) but also resulting in loss of insight, judgment, language, changes in perception, praxis (ability to do daily tasks), behavior, sleep, and in late stages, physical functioning. The diagnosis of AD is based on a characteristic clinical picture and the exclusion of other causes of dementia based on a clinical history and examination, and when indicated, additional blood tests, brain imaging, electroencephalographic examination (EEG), or lumbar puncture; the latter tests the cerebrospinal fluid for absence of infections and presence of an AD pattern of protein changes (low amyloid-β (Aβ) and elevated tau and phospho-tau levels). In most persons, clinical AD symptoms begin after 60 years of age, and diagnostic certainty for this late-onset AD (LOAD) remains probabilistic, based on the clinical profile and evolution (McKhann et al., 1984), although a specific profile of genetic and imaging biomarkers can raise the probability of AD (McKhann et al., 2011). However, in a few persons who belong to families suffering from an early-onset form of AD (EOAD), genetic testing for specific mutations can raise the probability close to 100%. A definitive diagnosis of AD is only possible postmortem and relies on a combined assessment of the clinical picture and results of a pathological examination of the brain. Two major pathological hallmarks are observed in the AD brain. The first, amyloid plaques, are extracellular deposits of Aβ peptides, whereas the second are intraneuronal neurofibrillary tangles, composed of hyperphosphorylated tau protein. 
In 2014, there is still no curative treatment for AD, and drugs to slow progression yield only marginal benefits. There is no effective preventive intervention, although some promising lifestyle and drug interventions are being studied. Management of patients with AD is expensive, both for the health system as a whole and for the affected individual and family. It is estimated that more money is spent on AD care each year than the annual operating budgets of Exxon Mobil or Walmart (Wimo, Prince, & Alzheimer’s Disease International, 2010). AD has a strong genetic component and it is hoped that agnostic searches for genes associated with AD will point to novel pathophysiological mechanisms and thus lead to novel predictive biomarkers and drug targets for AD. In this chapter, we will briefly review the development of AD genetics with an emphasis on more recent discoveries. We will survey how the disease tends to aggregate in families, and how the evolution of genetics and genetic epidemiology, from linkage analysis through genome-wide association studies to whole exome sequencing, has led to the identification of several causal mutations and of frequent and rare genetic susceptibility factors. We will further review how identification of these mutations has shaped emerging novel hypotheses regarding the mechanisms leading to clinical disease and the search for therapeutic drugs. Finally, we will survey the possible utility of these genetic variants to improve the differential diagnosis of AD from other causes of dementia, to better predict risk of developing AD, and to identify specific therapies that might be most effective or least toxic in a given patient.

2. HERITABILITY OF ALZHEIMER’S DISEASE

After age, a family history of AD is the most important AD risk factor. The risk of developing AD is more than doubled in first-degree relatives of patients with AD, compared to the general population (Lautenschlager et al., 1996), and it has been estimated that a 65-year-old has a roughly fivefold greater risk of developing AD by 87 years of age (49% vs 10%) if he or she has a first-degree relative with AD (Breitner, Silverman, Mohs, & Davis, 1988). This risk varies according to ethnicity, with some studies suggesting higher risks among African-American (Green et al., 2002) and Caribbean Hispanic (Devi et al., 2000) populations.

Studies of dizygotic and monozygotic twin pairs are useful to evaluate the relative contributions of genetic and environmental factors to a disease. In AD, heritability estimates range between 60% and 80% (Bergem, Engedal, & Kringlen, 1997; Gatz et al., 1997; Pedersen, Posner, & Gatz, 2001). Interestingly, the environmental component is not negligible, and as with other complex genetic disorders, gene–environment interactions are expected to influence disease risk.

Two main forms of AD are recognized, familial EOAD and “sporadic” LOAD. Familial forms represent less than 1% of all cases of AD. They are characterized by early onset (before 60 years of age), strong familial aggregation, and Mendelian transmission, mostly autosomal dominant. These forms are most often caused by rare mutations, with complete penetrance, located in three genes, APP, PSEN1, and PSEN2, coding for amyloid peptide precursor, and presenilin 1 and 2, respectively. Sporadic forms are the most frequent type of AD. The age of onset is usually after 60 years, without a well-defined mode of transmission, and with only modest familial aggregation. Given the strong genetic component documented in AD, these late-onset forms should be called “complex” or “non-Mendelian” rather than sporadic. Their late onset also implies interactions with environmental risk factors, although such interactions have so far proven challenging to identify (Traynor & Singleton, 2010).
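As a rough illustration of how twin data separate genetic from environmental contributions, the sketch below applies Falconer's classic approximation, which compares trait correlations in monozygotic (MZ) and dizygotic (DZ) pairs. The formula is brought in here as an assumption for illustration only; the cited twin studies fitted more elaborate liability models. The function name and the two correlations are hypothetical, chosen only so that the result falls within the 60–80% range quoted above.

```python
def falconer_decomposition(r_mz: float, r_dz: float) -> dict:
    """Approximate variance components from twin correlations (Falconer).

    r_mz: trait correlation between monozygotic twins
    r_dz: trait correlation between dizygotic twins
    """
    return {
        # MZ twins share about twice as much segregating genetic
        # material as DZ twins, so doubling the difference estimates
        # the additive genetic share.
        "heritability": 2.0 * (r_mz - r_dz),
        # Environment shared by both twins of a pair.
        "shared_environment": 2.0 * r_dz - r_mz,
        # Whatever makes even MZ twins differ.
        "unique_environment": 1.0 - r_mz,
    }


# Hypothetical correlations, not taken from any cited study.
estimates = falconer_decomposition(r_mz=0.75, r_dz=0.40)
for component, share in estimates.items():
    print(f"{component}: {share:.2f}")
```

With these made-up inputs, the nongenetic components account for the remaining variance, which matches the point that the environmental contribution is not negligible.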

3. GENETIC CAUSES OF EARLY-ONSET ALZHEIMER’S DISEASE

Linkage analyses have played an important role in the identification of genes causing AD. Linkage analysis exploits the joint segregation of the disease and genetic markers within families, in population isolates (descended from a few common founders several generations earlier), or in the general population. It uses data on disease or trait status along with genetic markers on one or all chromosomes (restriction fragment length polymorphisms, microsatellite markers, single nucleotide polymorphisms (SNPs), and structural variation have all been used as markers) to test whether the markers and the disease are transmitted independently. It seeks to identify loci where a marker is transmitted together with the disease more often than would be predicted by chance alone, with the strength of the evidence expressed as a LOD (logarithm of odds) score. If a genetic marker is close to the causal variant(s) for a particular disease, it will be preferentially transmitted with the disease within large families. Using familial data typically implies that we are able to study only a small number of recombination events, which in turn limits the resolution, with hundreds of genes being present at each identified locus. Thereafter, studies in other families and studies of the biology of putative genes are required to identify the causal gene at each identified locus. Moreover, this approach works best for diseases with Mendelian transmission, where rare mutations in a single gene or a small number of genes have a strong effect on the risk of disease. Linkage analyses also have some advantages over association analyses in identifying genes that harbor multiple rare and private variants causing a disease.
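To make the LOD score concrete, here is a minimal sketch of the simplest textbook case: a two-point LOD score for fully informative, phase-known meioses. It is a simplification under stated assumptions, not the pedigree likelihood used in the studies cited in this section; the function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    Compares the likelihood of the data at recombination fraction
    `theta` with the likelihood under free recombination (theta = 0.5),
    on a log10 scale.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Complete linkage is ruled out by any observed recombinant.
        if recombinants > 0:
            return float("-inf")
        log_lik_linked = 0.0
    else:
        log_lik_linked = (recombinants * math.log10(theta)
                          + non_recombinants * math.log10(1.0 - theta))
    log_lik_unlinked = meioses * math.log10(0.5)
    return log_lik_linked - log_lik_unlinked


def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Scan theta over [0, 0.5]; return (best LOD, theta at the best LOD)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max((lod_score(recombinants, meioses, t), t) for t in grid)


# Hypothetical family data: 1 recombinant among 10 informative meioses.
best_lod, best_theta = max_lod(recombinants=1, meioses=10)
print(f"max LOD = {best_lod:.2f} at theta = {best_theta:.2f}")
```

A LOD score of 3 or more (odds of 1000:1 in favor of linkage) is the conventional threshold for declaring linkage. This toy family alone would fall short of it, which is why scores are usually summed across families.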
The identification of the first mutations causing AD followed the discovery of Aβ peptides in the “senile” plaques and the observation that these peptides are present in the brains of both AD patients and persons with Down syndrome, a trisomy of chromosome 21 in which nearly all survivors develop a superimposed dementia beyond 40 years of age (Glenner & Wong, 1984a, 1984b). It was therefore hypothesized that mutations in a gene located on chromosome 21 might also cause AD in persons without Down syndrome (Glenner & Wong, 1984a) and indeed, such a gene was identified by linkage analysis.


In 1987, a linkage peak on chromosome 21 was identified in some familial forms of AD (St George-Hyslop et al., 1987) and the APP gene, coding for the Aβ peptide precursor, was identified as a promising candidate gene at this locus (Goldgaber, Lerman, McBride, Saffiotti, & Gajdusek, 1987; Kang et al., 1987; Robakis et al., 1987; Tanzi et al., 1987). This finding was confirmed when specific mutations in APP were associated with EOAD in families (Chartier-Harlin et al., 1991; Goate et al., 1991; Mullan et al., 1992; Murrell, Farlow, Ghetti, & Benson, 1991). Mutations in the APP gene proved insufficient to explain all forms of familial EOAD, and additional linkage analyses showed that EOAD was genetically heterogeneous (Schellenberg et al., 1988; St George-Hyslop et al., 1990). Additional linkage peaks were identified on chromosome 14 (Schellenberg, Bird, et al., 1992), and in families of Volga-Germans who had migrated to the Americas, on chromosome 1 (Levy-Lahad, Wijsman, et al., 1995). In 1995, corresponding disease-causing mutations were identified in the PSEN1 (Sherrington et al., 1995) and PSEN2 (Levy-Lahad, Wasco, et al., 1995) genes. PSEN1 accounts for the largest proportion of EOAD. These discoveries led to formulation of the “amyloid cascade” hypothesis, which remains the leading mechanistic hypothesis of AD and the focus of attempts at early diagnosis and preventive and therapeutic interventions (Hardy and Higgins, 1992; Hardy and Selkoe, 2002). APP is a transmembrane protein of poorly known function, which can be processed along one of two physiological pathways that compete with one another, and in which clathrin-mediated endocytosis and intracellular processing play an essential role (Haass, Kaether, Thinakaran, & Sisodia, 2012; Zhang, Ma, Zhang, & Xu, 2012). 
In the nonamyloidogenic pathway, APP is cleaved by proteases bearing an alpha-secretase activity, and then by a protein complex called gamma-secretase, of which presenilins 1 and 2 are an integral part. This pathway leads to the production of secreted alpha-APP, p3 peptide, and the intracellular domain of APP (AICD). In the amyloidogenic pathway, APP is cleaved by a beta-secretase, BACE1, and then by the gamma-secretase complex. This leads to the formation of secreted beta-APP, Aβ peptides, and AICD. Thus, whether or not an amyloidogenic Aβ peptide is generated is determined by whether the initial cleavage of APP was by an alpha- or a beta-secretase. Further, variability in the gamma-secretase cleavage site results in different species of Aβ peptides, with varying lengths and biochemical properties. The most frequent species are Aβ42 and Aβ40. The amyloid cascade hypothesis postulates that disease-causing mutations in APP, PSEN1, and PSEN2 result in a relative increase in aggregation-prone Aβ42 species, leading to the formation of neurotoxic oligomers of Aβ (Benilova, Karran, & De Strooper, 2012). This, in turn, is thought to trigger a cascade of events, leading to the hyperphosphorylation of tau, formation of neurofibrillary tangles, synaptic loss, neuronal death, and clinical onset of cognitive decline and dementia. In LOAD, reduced clearance and degradation of Aβ, rather than increased formation, is postulated to be the causal mechanism (Guénette, 2003).

There were, as of March 2014, 40, 197, and 25 mutations reported in the APP, PSEN1, and PSEN2 genes, respectively. A list of all reported mutations is maintained in the Alzheimer Disease and Frontotemporal Dementia Mutation Database (http://www.molgen.ua.ac.be/ADMutations) (Cruts, Theuns, & Van Broeckhoven, 2012). All these mutations show autosomal dominant transmission with complete penetrance, with the exception of a single recessive mutation in APP (Di Fede et al., 2009). Mutations in PSEN1 are the most frequent of these mutations, and their biological effects are being studied in detail by investigators who are prospectively characterizing mutation carriers and noncarriers within families with repeated cognitive, biomarker, and brain imaging studies (Bateman et al., 2012). No other gene has been associated with familial forms of AD. Moreover, until recently, none of these three genes had been implicated in LOAD. However, a recent study in the Icelandic population identified a mutation in the APP gene that protects from late-onset AD and is also associated with better cognitive performance in nondemented allele carriers compared to age-matched persons who lacked the protective allele (Jonsson et al., 2012). The study of the biological mechanisms underlying EOAD continues to teach us about the pathophysiological processes underlying AD.
Currently, this is spearheaded by studies within the Dominantly Inherited Alzheimer Network collaboration (http://www.dian-info.org/) and the Alzheimer Prevention Initiative, which is studying preventive interventions in a large Colombian kindred of Basque descent with a PSEN1 mutation (http://banneralz.org/research-plus-discovery/alzheimers-prevention-initiative.aspx).

4. GENETIC RISK FACTORS FOR LATE-ONSET ALZHEIMER’S DISEASE

4.1 Frequent Variants

4.1.1 Apolipoprotein E

Apolipoprotein E (APOE) is a component of several lipoproteins, such as high and very low density lipoproteins, and chylomicrons. The main role of APOE is to transport lipids and cholesterol throughout the body. It is also a ligand for low density lipoprotein (LDL) receptors and mediates the binding, internalization, and catabolism of lipoproteins in cells (Mahley, 1988). APOE is the major apolipoprotein expressed in the brain, where its rate of production is second only to the liver, its main site of production (Elshourbagy, Liao, Mahley, & Taylor, 1985; Utermann, Menzel, Langer, & Dieker, 1975). In addition to its function in cholesterol and lipid transport, APOE also has a role in mediating synaptogenesis, synaptic plasticity, and neuroinflammation (Holtzman, Herz, & Bu, 2012). The APOE receptors in the brain include LDL receptors on normal astrocytes and the LDL receptor-related protein, which is present on normal neurons and in the senile plaque and seems to mediate the effects of APOE in the brain (Rebeck, Reiter, Strickland, & Hyman, 1993). APOE has three major isoforms that are determined by cysteine-to-arginine substitutions at positions 112 and 158 of the amino acid sequence (Weisgraber, Rall, & Mahley, 1981). These isoforms correspond to specific genetic variations at two SNPs (rs429358 and rs7412, respectively), within exon 4 of the gene (Utermann, Hees, & Steinmetz, 1977; Utermann, Langenbeck, Beisiegel, & Weber, 1980; Zannis & Breslow, 1980; Zannis & Breslow, 1981; Zannis, Just, & Breslow, 1981) and are called APOE 2 (cys112, cys158), 3 (cys112, arg158), and 4 (arg112, arg158), with the corresponding alleles designated ε2, ε3, and ε4 (Zannis et al., 1982). APOE ε3 is the most common allele, whereas ε4 and ε2 have allele frequencies of 14% and 7%, respectively (Figure 5.1). Ignatius et al. (1986) and Ignatius, Gebicke-Haerter, Pitas, and Shooter (1987) showed that a 37 kDa protein previously shown to be expressed during neuron injury and repair (Müller, Gebicke-Härter, Hangen, & Shooter, 1985; Politis, Pellegrino, Oaklander, & Ritchie, 1983; Skene & Shooter, 1983) was likely to be APOE. Diedrich et al.
(1991) demonstrated that APOE was overexpressed both in scrapie, a prion disease, and in AD. In 1991, Pericak-Vance et al. (1991) described a linkage peak and association with AD on the long arm of chromosome 19 in families with LOAD. Previous suggestive linkage at the APOCII locus, which is close to APOE, had also been reported by Schellenberg et al. (1987) and Schellenberg, Boehnke, et al. (1992) in EOAD. Moreover, Namba, Tomonaga, Kawasaki, Otomo, and Ikeda (1991) and Wisniewski and Frangione (1992) established the presence of APOE in amyloid plaques. In 1993, three papers reported an association of the ε4 allele of APOE with both EOAD and LOAD. First, Strittmatter et al. (1993) showed that APOE could bind to Aβ with high avidity and that the frequency of the APOE ε4 allele was higher in familial LOAD. Saunders et al. (1993) extended and confirmed the association of the APOE ε4 allele to both familial and sporadic forms of LOAD. Polymorphisms in the APOCII gene were also studied but their association with AD was not found to be statistically significant. Finally, Corder et al. (1993) reported a gene dose association between the ε4 allele and risk of AD in families with AD; in these families, persons with an APOE ε4 allele had an earlier age at onset of clinical dementia. Poirier et al. (1993) further confirmed the association in a case–control study of sporadic AD. These results were followed by a series of reports confirming the results in other data sets (Amouyel, Brousseau, Fruchart, & Dallongeville, 1993; Myers et al., 1996; Noguchi, Murakami, & Yamada, 1993). Mayeux et al. (1993) also hinted at differences in strength of association between APOE ε4 and AD according to ethnicity. In addition to the association of APOE ε4 and risk of AD, Corder et al. (1994) and Royston et al. (1994) reported a decreased frequency of the ε2 allele in AD cases. These associations have since been consistently replicated, making the APOE ε4 allele the most important genetic risk factor for LOAD. It is a risk factor and, unlike the APP, PSEN1, and PSEN2 mutations, is not sufficient to cause the disease; neither is it necessary, since about 50% of all persons with AD do not carry an APOE ε4 allele. Carriers of one copy of the ε4 allele have a two- to fivefold increase in relative risk of AD compared to their peers, with this relative risk decreasing with increasing age, approaching one (no excess risk) in centenarians. This observation of “outliving risk” suggests that protective genetic factors might allow some persons with genetic susceptibility to AD to greatly delay or avoid clinical disease. Persons with two copies of the allele have 12–15 times the risk observed in noncarriers (Bertram, McQueen, Mullin, Blacker, & Tanzi, 2007; Farrer et al., 1997; Genin et al., 2011).

Figure 5.1 The APOE Locus. This figure shows the genetic location of the APOE and TOMM40 genes on chromosome 19, their structure in terms of exons/introns, and the linkage disequilibrium between the SNPs present in the CEU population of the 1000 Genomes Project. The two SNPs constituting the APOE ε allele and the variant recently reported in TOMM40 are also represented. This figure was generated on the ensembl.org Web site, exported as svg, and modified using Inkscape.

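The isoform definitions given earlier in this section (residues at positions 112 and 158) can be sketched as a small lookup. The residue-to-allele table follows the text; the mapping from SNP nucleotides to residues (T for cysteine, C for arginine, on the forward strand) is an assumption added for illustration and should be checked against the genotyping platform's strand convention. Function names are hypothetical, and the sketch assumes phased haplotypes.

```python
# Residues at positions 112 and 158 defining each allele (from the text).
ALLELE_BY_RESIDUES = {
    ("cys", "cys"): "e2",
    ("cys", "arg"): "e3",
    ("arg", "arg"): "e4",
}

# Assumption, not stated in the chapter: forward-strand SNP alleles
# and the residue each one encodes.
RESIDUE_BY_SNP_ALLELE = {
    "rs429358": {"T": "cys", "C": "arg"},  # position 112
    "rs7412": {"T": "cys", "C": "arg"},    # position 158
}


def apoe_allele(rs429358: str, rs7412: str) -> str:
    """Return the APOE allele carried on one phased haplotype."""
    residues = (RESIDUE_BY_SNP_ALLELE["rs429358"][rs429358],
                RESIDUE_BY_SNP_ALLELE["rs7412"][rs7412])
    allele = ALLELE_BY_RESIDUES.get(residues)
    if allele is None:
        # arg112/cys158 is not one of the three major isoforms.
        raise ValueError(f"no major isoform for residues {residues}")
    return allele


def apoe_genotype(haplotype_a, haplotype_b) -> str:
    """Combine two (rs429358, rs7412) haplotypes into a genotype label."""
    alleles = sorted(apoe_allele(*hap) for hap in (haplotype_a, haplotype_b))
    return "/".join(alleles)


def e4_copies(genotype: str) -> int:
    """Number of e4 alleles (0, 1, or 2), the dose discussed in the text."""
    return genotype.split("/").count("e4")


genotype = apoe_genotype(("C", "C"), ("T", "C"))
print(genotype, "carries", e4_copies(genotype), "copy of e4")
```

Counting ε4 copies rather than recording carrier status alone preserves the gene dose relationship reported by Corder et al. (1993).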
Residual lifetime risks (assuming current average life expectancies) at 85 years of age have been estimated, in men and women respectively, at 11% and 14% for APOE ε4 noncarriers, 23% and 30% for heterozygotes, and 51% and 60% for homozygotes (Genin et al., 2011). APOE is associated with hyperlipidemia, atherosclerosis, and a shorter life expectancy (Schächter et al., 1994; Wilson et al., 1994), but this does not account for its impact on risk of AD, since the ε2 allele, which also causes hypercholesterolemia, lowers risk of AD. Indeed, despite two decades of research, the pathophysiological pathways linking APOE to AD remain unclear. APOE seems to play a role in brain development and repair throughout life, and the ε4 allele has been associated with smaller gray matter volumes in infants, worse outcome after head trauma, and accelerated brain aging as manifested by greater amyloid deposition, poorer cognitive function, and greater cognitive decline in carriers (Davies et al., 2014; Dean et al., 2014; Jordan et al., 1997; Kutner, Erlanger, Tsai, Jordan, & Relkin, 2000). It is thought that APOE acts as a chaperone protein facilitating clearance of Aβ, with the ε4 isoform being less efficient in this role, likely because of structural changes of the protein due to the cysteine-to-arginine substitutions (Zhong & Weisgraber, 2009). Other direct receptor-mediated effects independent of the amyloid pathway might also be involved (Holtzman et al., 2012; Liu, Kanekiyo, Xu, & Bu, 2013; Mahley & Huang, 2012; Zlokovic, 2013).

The APOE locus also contains other genes that might represent good candidates for causing LOAD. Recently, an intronic poly-T repeat polymorphism (detected by variant rs10524523, also called “523”) in the translocase of outer mitochondrial membrane 40 homolog (TOMM40) gene was observed to be associated with age at onset of AD in ε3/3 homozygotes (Roses et al., 2010), with the longer polymorphism (>30 repeats) at the 523 locus associated with earlier disease onset, and with worse verbal memory and smaller brain volumes among adult ε3/3 children of AD patients. An attempt to replicate this observation of earlier age at onset of AD in a very large data set of over 10,000 cases and 10,000 controls was unsuccessful, but this was a case–control data set in which age at onset could not be reliably ascertained in a large proportion of the cases (Jun et al., 2012).
The APOE region has been shown to modulate brain expression of both APOE and TOMM40 (Bekris, Lutz, & Yu, 2012; Linnertz et al., 2014), and in an examination of pathological burden, genetic variation in TOMM40 was associated with parenchymal amyloid burden and AD pathology, although it was not associated with cerebral (vascular) amyloid angiopathy (Valant et al., 2012). There is strong linkage disequilibrium at this locus (Figure 5.1) and this, together with the large impact of the APOE locus on the risk of LOAD, has made it difficult to separate the effect of APOE ε4 from this variant (Chu et al., 2011; Cruchaga et al., 2011; Davies et al., 2014; Schiepers et al., 2012). Thus, further work is needed to definitively include or exclude a contribution of TOMM40 to AD pathogenesis. Other putative AD genes reported on chromosome 19 close to the APOE locus, such as EXOC3L2/MARK4, may also have represented false-positive associations (Seshadri et al., 2010).

4.1.2 Candidate Gene Approaches

Following the discovery of the association between APOE ε4 and LOAD, a number of studies sought associations between variants in additional biologically plausible candidate genes and risk of LOAD. The candidate genes studied were selected either based on their localization within a linkage peak for AD (for a meta-analysis of linkage studies in AD, see Butler et al. (2009)) or their possible role in the pathophysiology of AD. Simultaneously, since linkage analysis is less powerful to identify common genetic variants with low penetrance, the search for genetic risk factors underlying complex diseases shifted toward association studies. In genetic association studies, the allele frequencies of genetic markers are compared in a sample of persons affected with the disease, called cases, and a sample of unaffected persons, called controls. The detection of a significant association between the genetic marker and the disease suggests that this marker is either directly involved in the disease (causal variant), or is in linkage disequilibrium with a causal variant. SNPs are the most frequently used markers in association studies today, although restriction fragment length polymorphisms and microsatellite markers had been used in the past. Between 1996 and 2005, more than 1000 scientific articles describing over 500 candidate AD genes were published. Candidate gene studies in AD have described variants in genes from numerous pathways, including those involving tau phosphorylation, vacuolar sorting proteins, metalloproteins, glucose and insulin metabolism, nitric oxide synthesis, oxidative stress, growth factors, and inflammation- and lipid-related pathways. In 2007, Bertram et al. (2007) created the AlzGene database (www.alzgene.org) to collate all published results in an ongoing manner, assess the strength of evidence for each putative candidate gene according to reproducible criteria, and, for all alleles with data from at least four discrete studies, perform and present meta-analysis results. This effort confirmed that most genetic variations besides APOE had only small effects (∼20% increase or decrease in AD risk) and unfortunately could not be validated in replication studies.
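The case–control comparison of allele frequencies described above can be sketched as a simple allelic test on a 2 × 2 table of allele counts. This is a minimal illustration, not the analysis pipeline of any cited study: the function name and the counts are hypothetical, and the test shown (Pearson chi-square with one degree of freedom, Woolf confidence interval) is one common choice among several.

```python
import math


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int) -> dict:
    """Odds ratio and 1-df chi-square test on allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2 x 2 table (no continuity correction).
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 df, via the normal distribution.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    # Woolf (log-scale) 95% confidence interval for the odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_95 = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
             math.exp(math.log(odds_ratio) + 1.96 * se_log_or))
    return {"odds_ratio": odds_ratio, "chi_square": chi_square,
            "p_value": p_value, "ci_95": ci_95}


# Hypothetical counts: 1000 cases and 1000 controls, two alleles each.
result = allelic_association(case_risk=600, case_other=1400,
                             control_risk=500, control_other=1500)
print(f"OR = {result['odds_ratio']:.2f}, p = {result['p_value']:.1e}")
```

With an effect of this modest size, the confidence interval widens quickly as the sample shrinks, which is one way to see why small candidate gene studies so often failed to replicate.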
Although heterogeneity in phenotype, genotype, or environment could partly explain these inconsistencies (Ertekin-Taner, 2010), the dominant reason for these disappointing results was the small sample sizes (typically less than 1000 cases and controls), leading to initial false-positive results that failed to replicate. Other drawbacks of the candidate gene approach are that novel biological pathways were, by definition, less likely to be studied, and that associations were not pursued in the 98% of the genome that is noncoding but may have an important regulatory function.

However, in addition to APOE, the candidate gene approach robustly identified the sortilin-related receptor LDLR class A repeats containing (SORL1, earlier also called SORLA) gene as a major AD locus. The initial report by Rogaeva et al. (2007) was based on a list of candidate genes involved in endocytosis and intracellular trafficking, among which two clusters of SNPs in the SORL1 gene were found to be associated with the risk of AD. Genetic variation in SORL1 was also independently found to be associated with AD endophenotypes such as abstract thought, verbal memory, total brain volume, and white matter hyperintensities among persons free of AD (Seshadri et al., 2007). The association of SORL1 with AD was confirmed in a meta-analysis by Reitz et al. (2011) and more recently in genome-wide association analyses (see below) (Lambert et al., 2013; Miyashita et al., 2013). SORL1 might be involved in the pathophysiology of AD by regulating the trafficking of endocytic vesicles containing APP toward either the amyloidogenic late endosomal pathway or a more benign, APP-recycling pathway, thus limiting the production of Aβ peptides (Rogaeva et al., 2007; Willnow & Andersen, 2013).

4.1.3 Genome-Wide Association Study Signals

4.1.3.1 Principles of GWAS

Starting in 2005, genome-wide association studies (GWAS) became feasible, and discoveries of genes not previously suspected to be associated with age-related, complex diseases such as age-related macular degeneration gave rise to expectations of similar findings in the field of AD (Wellcome Trust Case Control Consortium, 2007; Klein et al., 2005). GWAS, like other genetic association studies, compare the frequency of alleles of a given variant in case and control populations. The main difference is the number and selection of tested variants, which, instead of being chosen based on an a priori hypothesis of a few candidate genes and variants, agnostically cover the entire genome. The main advantages are the absence of any a priori hypothesis regarding the region associated with the disease (leading to novel discoveries), a better resolution, and a greater statistical power to detect common variants than is available through linkage analysis. This makes GWAS well adapted for the discovery of common genetic variation associated with complex diseases.

These advantages are balanced by the great number of statistical tests, which increases the risk of false-positive results. To deal with this issue, stringent criteria for statistical significance need to be applied, and reproducibility of the results is essential to confirm the validity of an observed association. The best GWAS are therefore organized in two stages: a genome-wide discovery phase, where a p-value threshold of 5 × 10⁻⁸ (equivalent to a Bonferroni correction of a 0.05 significance level for one million independent tests) is used to establish which associations reach statistical significance, and a replication phase, where association with selected variants is examined via de novo genotyping in an independent population. It is ideal to include all samples with GWAS data available in the discovery phase, as this maximizes the chances of novel discovery, but sometimes GWAS data become available in new samples after the initial discovery analyses, in which case these data have been used for “in silico” replication of the initial findings.

4.1.3.2  Discovery of New Variants Associated with Risk of Late-Onset Alzheimer’s Disease
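As a brief aside before the individual discoveries, the case-control comparison and the multiple-testing threshold described in Section 4.1.3.1 can be made concrete with a minimal Python sketch. It assumes a simple allelic (2 × 2) chi-square test, which is only one of several tests used in practice; published GWAS typically rely on regression models adjusted for covariates. The function names and the allele counts are hypothetical and are not taken from any study cited here.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # genome-wide significance threshold quoted in the text


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 df) for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi_square_p_value(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical allele counts: 2000 cases and 5000 controls (two alleles each).
    chi2 = allelic_chi_square(case_alt=900, case_ref=3100,
                              ctrl_alt=2000, ctrl_ref=8000)
    p = chi_square_p_value(chi2)
    print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
    print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
    print("Bonferroni threshold for 1e6 tests:", bonferroni_threshold(0.05, 1_000_000))
```

With these made-up counts the difference in allele frequency (22.5% versus 20%) is nominally significant but falls far short of the genome-wide threshold, which illustrates why large discovery samples are needed for variants of modest effect.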

The first GWAS of AD were disappointing because they only confirmed the APOE ε4 allele (Abraham et al., 2008; Carrasquillo et al., 2009; Coon et al., 2007; Li et al., 2008; Reiman et al., 2007; Webster et al., 2008). New loci were also identified near GAB2 and PCDH11X, but these have been difficult to replicate so far (Beecham et al., 2010; Miar et al., 2011; Wu et al., 2010). This may reflect a “winner’s curse,” whereby effect sizes are typically smaller in replication studies than in the discovery analysis, or it may mean that these were false-positive findings. Replication attempts are still hampered by relatively small sample sizes, and the resulting lack of statistical power, given the expected effect size of these variants.

These initial disappointing results prompted the creation of several large international consortia that pooled data across several thousands of participants. At the Vienna meeting of the International Conference on Alzheimer Disease in 2009, two of these large consortia, the European Alzheimer’s Disease Initiative (EADI) and the Genetic and Environmental Risk in Alzheimer’s Disease (GERAD) consortium, reported for the first time the identification of novel variants outside APOE that were associated with AD (Harold et al., 2009; Lambert et al., 2009). The EADI used a discovery stage comprising 2032 AD cases and 7848 controls from France and identified a genome-wide significant signal in the CLU gene. This signal was replicated in 3978 AD cases and 3297 controls from all over Europe. Furthermore, two suggestive variants in CR1 reached genome-wide significance when the discovery and replication stages were meta-analyzed together (Lambert et al., 2009). In a companion paper, the GERAD studied 3941 AD cases and 7848 controls, independently identified one of the CLU variants reported by Lambert et al. (2009), and also identified variants in the PICALM gene. Both these associations were replicated in a second stage involving 2033 AD cases and 2340 controls (Harold et al., 2009). These two are landmark publications because they report the first robust identification of genetic risk factors for LOAD outside the APOE locus.

In 2010, the Cohorts for Heart and Aging in Genomic Epidemiology (CHARGE) consortium identified variants in the Bridge Integrator 1 (BIN1) gene using a three-stage approach combining (1) discovery in an independent set of GWAS (totaling 3006 AD cases and 14,642 controls), (2) a two-stage sequential in silico replication using previous GWAS results from the EADI and GERAD consortia, and (3) de novo replication, through genotyping in an independent Spanish population of 1140 AD cases and 1209 controls, of the results that reached genome-wide significance after the in silico replication (Seshadri et al., 2010). A second gene, EPHA1, reached genome-wide significance only in the first stage of the two-stage in silico replication. This study was notable because it was the first GWAS of AD to use genotype imputation (in addition to studying the directly genotyped variants) and to meta-analyze results from several individual data sets. Imputation is a statistical technique that estimates missing genotypes in a population by comparing haplotypes of this population with those of a denser reference panel (Marchini & Howie, 2010). Imputing missing genotypes is essential to obtain a common set of genetic variants and meta-analyze results when studies have used different genotyping platforms, for example, Affymetrix or Illumina arrays (Zeggini & Ioannidis, 2009).

In 2011, a collaborative effort across these three consortia (the EADI, GERAD, and CHARGE) identified new variants associated with AD in ABCA7 (Hollingworth et al., 2011). In a companion paper, the Alzheimer’s Disease Genetics Consortium (ADGC) reported genome-wide significant results in the MS4A gene cluster using a two-stage GWAS of discovery followed by in silico replication (of hits with p < 10⁻⁶) on a total of 11,840 AD cases and 10,931 controls (Naj et al., 2011). Furthermore, by combining their data, those two initiatives identified additional signals in or near EPHA1, CD33, and CD2AP (Hollingworth et al., 2011; Naj et al., 2011). Most of these associations have been replicated in European populations.
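The meta-analysis of results across data sets mentioned above can be sketched as follows. This is a minimal illustration assuming a fixed-effect, inverse-variance weighted model applied to per-study log odds ratios; the cited studies do not necessarily use exactly this model, and the function name and input values are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Combine per-study log odds ratios with inverse-variance weights.

    Returns the pooled log odds ratio, its standard error, and a
    two-sided p-value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


if __name__ == "__main__":
    # Hypothetical discovery and replication estimates for one variant.
    beta, se, p = fixed_effect_meta(betas=[0.15, 0.12],
                                    standard_errors=[0.03, 0.04])
    print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, p = {p:.2e}")
```

Because the pooled standard error shrinks as studies are added, a variant that is only suggestive in each stage can reach genome-wide significance once the stages are combined, which is the pattern described for CR1.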
In addition to cross-replications in the publications mentioned earlier, specific replication studies that undertook targeted genotyping or sequencing of these newly discovered loci have validated the findings (Biffi et al., 2010; Carrasquillo et al., 2010, 2011a, 2011b; Corneveaux et al., 2010; Gu et al., 2011; Jun et al., 2010; Kamboh, Minster, et al., 2012; Lambert et al., 2011; Omoumi et al., 2014; Piaceri et al., 2011). Other GWAS efforts have also replicated these results (Antúnez et al., 2011; Hu et al., 2011; Kamboh, Demirci, et al., 2012) and have identified additional putative GWAS signals near MTHFD1L (Naj et al., 2010), CUGBP2 (Wijsman et al., 2011), and ATP5H (Boada et al., 2014) that need independent verification in external samples. As is the case for GAB2 and PCDH11X, these new loci have not yet been consistently replicated.

To tackle the increasing overlap of individual study participants between these studies and facilitate further discoveries, a mega consortium called the International Genomics of Alzheimer’s Project (IGAP) was established. The first project of IGAP was to conduct a new GWAS on all subjects of European ancestry across the ADGC, EADI, CHARGE, and GERAD populations, representing 17,008 AD cases and 37,154 controls, and to genotype suggestive variants (with a p-value for association with risk of AD below 10⁻³) in a large replication stage of 8572 AD cases and 11,312 controls (Lambert et al., 2013). In addition to APOE, all the previous GWAS loci were confirmed, with the exception of CD33, which reached genome-wide significance in the discovery stage but was not replicated in the second stage. Furthermore, 11 new loci in or near HLA-DRB5/DRB1, PTK2B, SORL1, SLC24A4-RIN3, INPP5D, MEF2C, NME8, ZCWPW1, CELF1, FERMT2, and CASS4 were identified. SORL1 had been previously identified in a GWAS of Asian and European ancestry participants (Miyashita et al., 2013), and its involvement in AD is further confirmed by this study. The fact that this locus was not identified in earlier GWAS could be explained by its relatively low minor allele frequency, which made it hard to detect until a sufficiently large sample size became available. Attempts to replicate these novel signals are ongoing, and a recent follow-up study has already confirmed the signal in ZCWPW1 and strengthened one of the suggestive signals near TRIP4 (Ruiz, Heliman, et al., 2014). The effect sizes for the novel GWAS signals are modest and are presented in Table 5.1 and Figure 5.2.

4.1.3.3  Replication of GWAS Hits in Minorities

Replication of results in independent populations provides strong evidence that a statistical signal is real, and it is built into the design of GWAS, given the large number of statistical tests performed. The initial studies described above have been largely restricted to populations of European ancestry, and the findings still need to be explored in other racial and ethnic groups. Studies performed in Asian populations replicated signals in CLU (Komatsu et al., 2011; Liu, Wang, et al., 2014; Yu et al., 2013), CR1 (Jin, Li, Yuan, Xu, & Cheng, 2012), PICALM (Chung et al., 2013; Liu, Zhang, et al., 2013; Miyashita et al., 2013), BIN1 (Liu, Zhang, Li, et al., 2013; Miyashita et al., 2013), CD33 (Deng et al., 2012; Tan, Yu, et al., 2013), and MS4A (Deng et al., 2012; Tan, Yu, Zhang, et al., 2013). In a study combining data from Japanese, Korean, and European participants, Miyashita et al. (2013) were the first to report an association reaching genome-wide significance at the SORL1 locus. Associations with CLU, PICALM, and BIN1 have also been reported in a GWAS of AD in Caribbean Hispanic populations (Lee et al., 2011). Finally, a GWAS of AD in African-American subjects reported genome-wide significant associations in ABCA7, and nominal replication of the associations noted with CR1, BIN1, EPHA1, and CD33 using gene-based analyses (Reitz et al., 2013). Although the loci are deemed replicated, the SNPs involved, allele frequencies, direction of effects, and strength of association may differ in minorities compared to European populations because of differing linkage disequilibrium with the causal variant (see Table 5.2).

Table 5.1  Genetic Risk Factors for Late-onset Alzheimer’s Disease

| Locus | Chromosome | Variant | Position (hg19) | Effect/Other Alleles | Effect Allele Frequency | Odds Ratio (95% Confidence Interval) | References |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequent Variants | | | | | | | |
| APOE (a) | 19 | rs429358/C112R; rs7412/C158R | 45411941; 45412079 | ε4 | 0.14 | ∼2 to 5 | Strittmatter et al., 1993; Saunders et al., 1993; Corder et al., 1993 |
| CR1 | 1 | rs6656401 | 207692049 | A/G | 0.197 | 1.18 (1.14–1.22) | Lambert et al., 2009 |
| BIN1 | 2 | rs6733839 | 127892810 | T/C | 0.409 | 1.22 (1.18–1.25) | Seshadri et al., 2010 |
| INPP5D | 2 | rs35349669 | 234068476 | T/C | 0.488 | 1.08 (1.05–1.11) | Lambert et al., 2013 |
| MEF2C | 5 | rs190982 | 88223420 | G/A | 0.408 | 0.93 (0.90–0.95) | Lambert et al., 2013 |
| HLA-DRB1/5 | 6 | rs9271192 | 32578530 | C/A | 0.276 | 1.11 (1.08–1.15) | Lambert et al., 2013 |
| CD2AP | 6 | rs10948363 | 47487762 | G/A | 0.266 | 1.10 (1.07–1.13) | Naj et al., 2011; Hollingworth et al., 2011 |
| NME8 | 7 | rs2718058 | 37841534 | G/A | 0.373 | 0.93 (0.90–0.95) | Lambert et al., 2013 |
| ZCWPW1 | 7 | rs1476679 | 100004446 | C/T | 0.287 | 0.91 (0.89–0.94) | Lambert et al., 2013 |
| EPHA1 | 7 | rs11771145 | 143110762 | A/G | 0.338 | 0.90 (0.88–0.93) | Naj et al., 2011; Hollingworth et al., 2011 |
| PTK2B | 8 | rs28834970 | 27195121 | C/T | 0.366 | 1.10 (1.08–1.13) | Lambert et al., 2013 |
| CLU | 8 | rs9331896 | 27467686 | C/T | 0.379 | 0.86 (0.84–0.89) | Lambert et al., 2009; Harold et al., 2009 |
| CELF1 | 11 | rs10838725 | 47557871 | C/T | 0.316 | 1.08 (1.05–1.11) | Lambert et al., 2013 |
| MS4A | 11 | rs983392 | 59923508 | G/A | 0.403 | 0.90 (0.87–0.92) | Naj et al., 2011; Hollingworth et al., 2011 |
| PICALM | 11 | rs10792832 | 85867875 | A/G | 0.358 | 0.87 (0.85–0.89) | Harold et al., 2009 |
| SORL1 | 11 | rs11218343 | 121435587 | C/T | 0.039 | 0.77 (0.72–0.82) | Lambert et al., 2013 |
| FERMT2 | 14 | rs17125944 | 53400629 | C/T | 0.092 | 1.14 (1.09–1.19) | Lambert et al., 2013 |
| SLC24A4-RIN3 | 14 | rs10498633 | 92926952 | T/G | 0.217 | 0.91 (0.88–0.94) | Lambert et al., 2013 |
| TRIP4 (b) | 15 | rs74615166 | 64725490 | C/T | 0.02 | 1.31 (1.17–1.42) | Ruiz et al., 2014a |
| ABCA7 | 19 | rs4147929 | 1063443 | A/G | 0.19 | 1.15 (1.11–1.19) | Naj et al., 2011; Hollingworth et al., 2011 |
| CD33 | 19 | rs3865444 | 51727962 | A/C | 0.307 | 0.94 (0.91–0.96) | Naj et al., 2011; Hollingworth et al., 2011 |
| CASS4 | 20 | rs7274581 | 55018260 | C/T | 0.083 | 0.88 (0.84–0.92) | Lambert et al., 2013 |
| Rare Variants | | | | | | | |
| APP (c) | 21 | rs63750847/A673T | 27269932 | A/G | 0.0045 | 0.24 | Jonsson et al., 2012 |
| TREM2 (d) | 6 | rs75932628/R47H | 41129252 | T/C | 0.0063 | 2.26 (1.71–2.98) | Jonsson et al., 2013; Guerreiro et al., 2013 |

Note: Data extracted from Lambert et al., 2013 except (a) Alzgene, (b) Ruiz, Heliman, et al., 2014, (c) Jonsson et al., 2012 and (d) Jonsson et al., 2013.

Figure 5.2  Manhattan Plot of Known Genetic Risk Factors for Alzheimer’s Disease. This plot represents the p-values (y-axis), transformed as −log(p), testing the genetic associations between SNPs and risk of Alzheimer’s disease in the IGAP meta-analysis (Lambert et al., 2013) along the genome (x-axis; hg19). Linkage peaks reported by Butler et al. (2009) are represented as gray rectangles (darker gray = genome-wide suggestive; lighter gray = genome-wide nominal). The horizontal red line represents the genome-wide significance threshold (5 × 10⁻⁸). The black and green dots represent p-values obtained in the discovery stage; the red dots represent the p-values of the top hit in each locus, after combining discovery and replication stages. p-values for the APOE locus have been truncated for readability. Locations of the APP, PSEN1, PSEN2, and TREM2 genes are also represented. This figure was inspired by Bertram, Lill, and Tanzi (2010) and was generated using R and the qqman R package. Data were extracted from Lambert et al. (2013), Butler et al. (2009) and downloaded from http://www.pasteur-lille.fr/en/recherche/u744/Igap_stage1.zip.

4.1.3.4  Endophenotypes

Endophenotypes are biomarkers that are genetically correlated with disease liability, can be measured in all individuals (both affected and unaffected), and provide greater power to identify disease-related genes than does disease “yes/no” status alone (Gottesman & Gould, 2003; Glahn, Thompson, & Blangero, 2007). The endophenotype is usually less genetically complex than the disorder it underlies, owing both to the endophenotype’s relative proximity to gene expression in the chain of events leading from gene to disease, and to the increased probability that it reflects just one of likely several pathophysiological pathways that combine to result in clinical disease. Because the endophenotype is likely influenced by fewer genetic risk factors than the clinical disease as a whole, it can tell us something about the biological pathway through which a gene might act. Another advantage of the endophenotype strategy is the greater power to detect associations, since even asymptomatic carriers of the risk allele typically show changes in the endophenotype. One robust endophenotype of AD appears to be hippocampal volume, which is lower in PSEN1 mutation carriers (compared to noncarriers) over a decade prior to onset of clinical disease (Bateman et al., 2012). Brain amyloid burden on PET scan, cognitive changes, especially in verbal memory, and plaque and tangle burden are other well-studied AD endophenotypes. Thus, studying quantitative endophenotypes directly or indirectly related to AD might provide another layer of evidence toward the biological relevance of a putative association with AD. Further, the population significance of an association is greater if the variant is associated not just with a greater risk of AD but with lower function in the larger sample of all older adults.

Associations of the CLU, CR1, BIN1, PICALM, ABCA7, and CD2AP risk variants have been reported with an earlier age at onset (Thambisetty, An, & Tanaka, 2013; Thambisetty, Beason-Held, et al., 2013), greater burden of AD brain pathology (Biffi et al., 2012; Chibnik et al., 2011; Kok et al., 2011; Shulman et al., 2013), more abnormal levels of cerebrospinal fluid biomarkers (Elias-Sonnenschein et al., 2013; Kauwe et al., 2011; Schjeide et al., 2011; Schott & A. D. N. I. Investigators, 2012), changes in total brain volume and white matter hyperintensities on brain MRI (Biffi et al., 2010; Bralten et al., 2011; Braskie et al., 2011; Erk et al., 2011; Furney et al., 2011; Green et al., 2014; Lancaster et al., 2011; Melville et al., 2012), EEG (Ponomareva et al., 2013), and lower cognitive function (Barral et al., 2012; Chibnik et al., 2011; Engelman et al., 2013; Erk et al., 2011; Green et al., 2014; Lancaster et al., 2011; Mengel-From, Christensen, McGue, & Christiansen, 2011; Mengel-From et al., 2013; Pedraza et al., 2014; Schmidt, Wolff, Ahsen, & Zerr, 2012; Sweet et al., 2012; Thambisetty, Beason-Held, et al., 2013). A large GWAS of hippocampal volumes in over 20,000 persons identified several putative genes associated with apoptosis (HRK), transforming growth factor antagonism (LEMD3), neuronal migration (ASTN2), oxidative stress (MSRB3), brain development (WIF1), the ubiquitin pathway (FBXW8), and a gene (DPP4) encoding an enzyme that is the target of the incretin class of antidiabetic medications such as sitagliptin (Bis et al., 2012).

Table 5.2  Replication of GWAS Hits for Alzheimer’s Disease in Non-European Populations

| Locus | Chromosome | Variant | Position (hg19) | Effect/Other Alleles | Effect Allele Frequency | Odds Ratio (95% Confidence Interval) | Ethnicity | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CR1 | 1 | rs6656401 | 207692049 | A/G | 0.019 | 1.76 (1.19–2.60) | AS | Jin et al., 2012a |
| BIN1 | 2 | rs744373 | 127894615 | G/A | 0.314 | 1.14 (1.03–1.25) | AS | Liu et al., 2013c |
| BIN1 | 2 | rs744373 | 127894615 | G/A | 0.33 | 1.25 (1.11–1.40) | AS | Miyashita et al., 2013 |
| CLU | 8 | rs9331949 | 27454686 | C/T | 0.205 | 1.29 (1.09–1.52) | AS | Yu et al., 2013 |
| CLU | 8 | rs11136000 | 27464519 | T/C | 0.261 | 0.85 (0.78–0.93) | AS | Liu et al., 2014 |
| CLU | 8 | rs9331888 | 27468862 | G/C | 0.397 | 1.11 (0.77–1.61) | AS | Komatsu et al., 2011 |
| MS4A6A | 11 | rs610932 | 59939307 | G/T | 0.469 | 1.61 (1.21–2.14) | AS | Deng et al., 2012 |
| MS4A6A | 11 | rs610932 | 59939307 | T/G | 0.363 | 0.72 (0.59–0.88) | AS | Tan et al., 2013a |
| PICALM | 11 | rs677909 | 85757589 | C/T | 0.414 | 0.63 (0.49–0.81) | AS | Chung et al., 2013 |
| PICALM | 11 | rs3851179 | 85868640 | T/C | 0.382 | 0.88 (0.81–0.96) | AS | Liu et al., 2013b |
| PICALM | 11 | rs3851179 | 85868640 | T/C | 0.39 | 0.80 (0.73–0.89) | AS | Miyashita et al., 2013 |
| SORL1 | 11 | rs11218343 | 121435587 | C/T | 0.34 | 0.81 (0.75–0.87) | AS+EA | Miyashita et al., 2013 |
| ABCA7 | 19 | rs115550680 | 1050420 | G/A | 0.07 | 1.79 (1.47–2.12) | AA | Reitz et al., 2013 |
| CD33 | 19 | rs3865444 | 51727962 | A/C | 0.238 | 2.08 (1.53–2.85) | AS | Deng et al., 2012 |
| CD33 | 19 | rs3865444 | 51727962 | A/C | 0.173 | 1.49 (1.19–1.87) | AS | Tan et al., 2013a |

Note: AA = African American; AS = Asian; EA = European Ancestry; Data extracted from given references.

4.1.3.5  Identification of Functional Variants and Functional Pathways

The majority of these genetic loci have only been identified through statistical testing for association between a set of SNPs and the risk of disease. These SNPs are mostly proxies, in linkage disequilibrium with “true” functional variants. It has been usual to link the genetic variant reaching genome-wide significance with the lowest p-value to the closest gene (see Table 5.1) at a locus, but each locus typically includes several genes of interest. Thus, once a candidate locus has been identified, the next objective is to identify functional variants within this locus, explore their effects on a gene, or a set of genes, and study how these effects might relate to the development of AD. Identification of functional variants is based on sequencing or genetic imputation of as many variants as possible in the region of interest, identification of the specific variants associated with an altered risk of AD, and then a study of their impact on adjacent genes. This impact can be qualitative (alteration of the amino acid sequence due to a nonsynonymous coding SNP, or preferential expression of a particular splice variant) or quantitative (modulation of gene expression) (Bettens, Sleegers, & Van Broeckhoven, 2013).

Guerreiro et al. (2010) studied the CLU locus and found that none of the 24 common coding variants identified among 495 AD cases and 330 controls was associated with risk of AD or with total CLU gene expression in the brain, and suggested that either weak, hard-to-detect effects on resting gene expression, or variations in specific isoforms or in damage-induced expression might explain the association. Szymanski, Wang, Bassett, and Avramopoulos (2011) reported an association between rs9331888, one of the GWAS-identified SNPs at this locus, and preferential expression of one of the CLU isoforms, NM_203339 (different isoforms appear to be tightly regulated and to have varying effects). Bettens et al. (2012) reported the coexistence of rare nonsynonymous coding variants, insertion–deletions, and frequent variants in CLU, acting independently on the risk of AD.

PICALM is involved in clathrin-mediated endocytosis occurring at the plasma membrane. In the PICALM gene, Schnetz-Boutaud et al. (2012) failed to identify new variants after sequencing the gene in 48 cases and 48 controls, but noted that a previously described splice variant in LD with the GWAS hit could play a causal role. Ferrari et al. (2012) identified several rare coding variants in the PICALM region, none of which, however, was associated with risk of AD. In a study of predicted pathogenicity of nonsynonymous SNPs in PICALM, Masoodi, Al Shammari, Al-Muammar, Alhamdan, and Talluri (2013) reported one SNP, rs12800974 (T158P), that was predicted to be deleterious. Finally, in two yeast models, the PICALM ortholog was important for Aβ toxicity, but the direction of effect was different in the two studies (Treusch et al., 2011; D’Angelo et al., 2013). Levels of clathrin-mediated endocytosis proteins, including PICALM, were increased in the brain of an amyloid mouse model of AD compared to wild-type mice (Thomas, Lelos, Good, & Kidd, 2011). Modulation of PICALM expression in vitro and in vivo resulted in modulation of Aβ production (Xiao et al., 2012).
PICALM was expressed in neurons and colocalized with APP in endocytic vesicles (Xiao et al., 2012) as part of a complex that could be recognized by autophagosomes and target vesicles containing APP (Tian, Chang, Fan, Flajolet, & Greengard, 2013). This suggests a role for PICALM in Aβ clearance. A study of brain PICALM expression, however, reported that, whereas cleaved fragments of PICALM were increased in AD (LOAD and EOAD) brains compared to controls, expression was noted in neurons and microglia and colocalized with neurofibrillary tangles only; no colocalization with aggregated Aβ was observed (Ando et al., 2013).

The gene CR1 is located on chromosome 1q32 and encodes complement component (3b/4b) receptor 1. The gene is present as four codominant alleles of various sizes due to genetic duplication and deletions. The complement receptor 1 protein is widely expressed on the surface of blood cells, choroid plexus, microglia, and neurons and, as its name suggests, can bind C3b and C4b and moderate the activity of the complement system (see Crehan, Holtan, Wray, Pocock, Guerreiro, and Hardy (2012) for review). Brouwers et al. (2012) and Hazrati et al. (2012) have identified a subregion of CR1 containing two SNPs associated with risk of AD and with Aβ42 levels in the cerebrospinal fluid. Those signals were likely mediated by a copy number variation (CNV) associated with risk of AD and modulating levels of two particular isoforms of CR1, CR1-F and CR1-S, the latter containing an extra binding site for C3b/C4b. C3b and C4b are able to bind to Aβ and could participate in Aβ clearance. An association between a coding variant of CR1 and cognitive decline was reported by Keenan et al. (2012), but this finding could not be replicated in a second cohort (Van Cauwenberghe et al., 2013).

The BIN1 (Bridge Integrator 1 or Amphiphysin 2) gene is located on chromosome 2q14.3 and encodes several splice variants mostly expressed in the brain and the muscles. BIN1 isoforms are involved in clathrin-mediated endocytosis, intracellular trafficking, caspase-independent apoptosis, and interactions with the microtubule cytoskeleton (see Tan, Yu, and Tan (2013) for review). BIN1 is also a key regulator of endocytosis and membrane recycling, cytoskeleton regulation, DNA repair, cell cycle progression, and apoptosis; decreased expression has been associated with centronuclear myopathy, cardiomyopathy, and cancer (Prokic, Cowling, & Laporte, 2014), whereas increased expression is noted in AD. BIN1 has also been implicated in posterior cortical atrophy (Carrasquillo et al., 2014). With regard to LOAD, Chapuis et al. (2013) identified an insertion–deletion (rs59335482), in linkage disequilibrium with the variant reported by Seshadri et al. (2010), that was associated with both increased BIN1 brain expression and LOAD risk. Furthermore, Chapuis et al. (2013) also showed that BIN1 was expressed in neurons and that BIN1 and tau could physically interact. Glennon et al.
(2013) found decreased expression of BIN1 in frontal lobes of 24 sporadic AD patients compared to 24 controls, which contradicts the results of Chapuis et al. (2013); both reports therefore need to be confirmed. Finally, Masoodi et al. (2013) found two nonsynonymous SNPs in BIN1, rs11554585 (R397C) and rs11554585 (N106D), that were predicted to be deleterious using bioinformatics approaches. Altered expression of BIN1 has been demonstrated in aging mice, in transgenic mouse models of AD, and in persons with schizophrenia (English, Dicker, Focking, Dunn, & Cotter, 2009; Yang et al., 2008). Amphiphysin 1 (a related protein) knockout mice exhibit decreased synaptic vesicle recycling efficiency, seizures, and cognitive (memory) deficits (Di Paolo et al., 2002), and the protein appears to be the substrate for CDKL5, the gene for which can be mutated in patients with West syndrome and Rett syndrome, both severe neurodevelopmental disorders (Sekiguchi et al., 2013).

The CD33 gene is located on chromosome 19q13.33 and encodes a member of the sialic acid-binding immunoglobulin-like lectins (Siglec) family of receptors. CD33 is expressed on the surface of myeloid progenitor cells, mature monocytes, and macrophages, and is involved in inhibition of cell activity. Two isoforms have been described, one including the seven exons of the gene, and one without the second exon, which encodes the V-set immunoglobulin domain essential for the sialic acid-binding activity (see Jiang et al. (2014) for review). High CD33 brain expression has been associated with AD status (Griciuc et al., 2013; Karch et al., 2012) and with higher clinical dementia rating scale scores, that is, with greater severity of dementia (Karch et al., 2012). Bradshaw et al. (2013) reported that the GWAS SNP, rs3865444, was associated with the surface expression of CD33 on circulating monocytes. The C allele, which confers risk for AD, was associated with increased expression of CD33 and decreased uptake of Aβ42 by the monocytes. Furthermore, this allele was also associated with greater brain amyloid burden measured by PIB-PET, and a greater proportion of amyloid plaques measured at autopsy. Finally, in the brain, CD33 was expressed on the surface of cells that had attributes of microglia and macrophages, and expression was concentrated around amyloid plaques. Griciuc et al. (2013) reported similar results in an independent study and confirmed that CD33 was involved in Aβ42 uptake by microglial cells through in vitro CD33 inhibition and overexpression experiments and using a CD33 knockout mouse model. The authors also showed that the inhibition of Aβ uptake was mediated through the Ig V-set domain. Finally, Malik et al. (2013) and Raj et al.
(2014) reported an association between the AD risk allele of rs3865444 and greater expression of the CD33 isoform containing the Ig V-set domain, which could explain the association with AD, given the previous results. As rs3865444 is not in the sequence of the gene, the authors nominated rs12459419, which is in LD with rs3865444 and was able to modulate splicing activity, as the functional SNP.

The MS4A (membrane spanning four domains, subfamily A) gene cluster is located on chromosome 11q12 and consists of at least 12 genes with variable expression in several tissues and a suspected role in immune cell functions (Liang, Buckley, Tu, Langdon, & Tedder, 2001; Zuccolo et al., 2010). The GWAS signal spans MS4A4A and MS4A6A, and recent studies have reported an association of the GWAS SNPs with MS4A4A brain expression (Allen et al., 2012) and with MS4A6A blood and brain expression (Proitsi et al., 2014), which seems to confirm involvement of both genes in AD. Furthermore, Karch et al. (2012) reported an association between MS4A6A expression in the brain and Braak scores at autopsy as well as an association between rs670139, an SNP in MS4A6E, and Braak scores. Nevertheless, additional work is needed to understand the role of the gene cluster at this locus in the biology of AD.

EPHA1 is located on chromosome 7q34, belongs to the ephrin receptor subfamily of the protein tyrosine kinase family, and encodes the ephrin type-A receptor 1 protein. This class of proteins is evolutionarily conserved and expressed in the brain, and its members have been called key components of a “global positioning system” for developing cells in olfactory, cochlear, retinal, and thalamocortical pathways (Lackmann & Boyd, 2008). Thus, this family of proteins has been implicated in mediating brain development, particularly axonal guidance. Eph receptors seem to play a similar role in guiding neural plasticity in the adult brain (Gerlai, 2001). They also modulate the MAPK pathway and the response at glutamatergic synapses (Drescher, 2000; Miao et al., 2001; Kullander & Klein, 2002). In transgenic mouse models of AD, ephrin receptors were reduced in the hippocampus prior to the development of impaired object recognition and spatial memory, and a reduction in Eph receptor levels has been noted in postmortem hippocampal tissue from patients with incipient AD (Simón et al., 2009).

CD2AP is located on chromosome 6p12 and encodes CD2-associated protein, a scaffolding protein capable of direct interactions with proteins involved in cytoskeletal organization (Lehtonen, Zhao, & Lehtonen, 2002), resulting in roles in endocytosis (Cormont et al., 2003; Kobayashi, Sawano, Nojima, Shibuya, & Maru, 2004) and cell–cell interactions (Wolf & Stahl, 2003; Calabia-Linares et al., 2011).
The AD risk variant was found to be associated with greater neuritic plaque burden (Shulman et al., 2013), and a functional screening of AD candidate genes in Drosophila models identified the ortholog of the human CD2AP as a modulator of tau toxicity (Shulman et al., 2014).

ABCA7 (ATP-binding cassette, subfamily A (ABCA), member 7) is located on chromosome 19p13.3 and encodes a protein with suspected functions in lipid metabolism and the phagocytosis of apoptotic cells. Higher expression of ABCA7 in the brain has been associated with more severe dementia (Karch et al., 2012), and associations between SNPs near ABCA7 and ABCA7 expression and AD have been reported (Allen et al., 2012). A behavioral study of an ABCA7 knockout mouse model showed subtle alteration of memory (Logge et al., 2012). A study crossing these mice with J20 amyloidogenic mice found increased Aβ brain deposition and plaque load, although there was no significant cognitive decline compared to J20 mice alone (Kim et al., 2013). Bone marrow-derived macrophages from these mice showed impaired Aβ uptake in vitro, suggesting a role for ABCA7 in amyloid clearance.
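The fine-mapping efforts summarized in this section rest on the premise that GWAS SNPs are proxies in linkage disequilibrium (LD) with functional variants. As a minimal sketch, assuming that haplotype and allele frequencies for two biallelic variants are already known, the commonly used r² measure of LD can be computed as below. The function name and the frequencies are hypothetical and do not describe any locus discussed above.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A at the first
          variant and allele B at the second variant
    p_a:  frequency of allele A at the first variant
    p_b:  frequency of allele B at the second variant
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both variants must be polymorphic")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    # Hypothetical tag SNP and candidate functional variant.
    print("perfect proxy:", ld_r_squared(p_ab=0.30, p_a=0.30, p_b=0.30))
    print("partial proxy:", ld_r_squared(p_ab=0.20, p_a=0.30, p_b=0.25))
```

An r² close to 1 means the genotyped SNP carries nearly the same information as the untyped variant, so the two cannot be distinguished by association statistics alone; this is why sequencing and functional studies are needed to nominate the causal variant.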

4.2  Rare Variants

GWAS are designed to identify frequent variants (with a minor allele frequency >5%) that are associated with risk of complex diseases, including AD. Identification of less frequent (1–5% MAF) and rare (
