Gene 551 (2014) 236–242


Genome-wide detection of allelic gene expression in hepatocellular carcinoma cells using a human exome SNP chip

Yon Mi Park a, Hyun Sub Cheong b, Jong-Keuk Lee a,⁎

a Asan Institute for Life Sciences, University of Ulsan College of Medicine, Seoul, Republic of Korea
b SNP-Genetics Inc., Seoul, Republic of Korea

Article info

Article history: Received 8 April 2014; received in revised form 19 August 2014; accepted 1 September 2014; available online 2 September 2014

Keywords: Single nucleotide polymorphism (SNP); Allelic gene expression; Monoallelic expression; Allelic imbalance; Allele substitution; Hepatocellular carcinoma cells

Abstract

Allelic variations in gene expression influence many biological responses and cause phenotypic variations in humans. In this study, Illumina Human Exome BeadChips containing more than 240,000 single nucleotide polymorphisms (SNPs) were used to identify changes in allelic gene expression in hepatocellular carcinoma cells following lipopolysaccharide (LPS) stimulation. We found 17 monoallelically expressed genes, 58 allelically imbalanced genes, and 7 genes showing allele substitution. We also detected 33 genes that were differentially expressed following LPS treatment in vitro using these human exome SNP chips. However, alterations in allelic gene expression following LPS treatment were detected in only three genes (MLXIPL, TNC, and MX2), each in one cell line sample only, indicating that changes in allelic gene expression following LPS stimulation of liver cells are rare events. Among a total of 75 genes showing allelic expression in hepatocellular carcinoma cells, either monoallelic or imbalanced, 43 genes (57.33%) had expression quantitative trait loci (eQTL) data, indicating that high-density exome SNP chips are useful and reliable for studying allelic gene expression. Furthermore, most genes showing allelic expression were regulated by cis-acting mechanisms and were also significantly associated with several human diseases. Overall, our study provides a better understanding of allele-specific gene expression in hepatocellular carcinoma cells with and without LPS stimulation and potential clues to the causes of human disease arising from alterations in allelic gene expression. © 2014 Elsevier B.V. All rights reserved.

Abbreviations: AI, allelic imbalance; eQTL, expression quantitative trait loci; LPS, lipopolysaccharide; MAE, monoallelic expression; SNP, single nucleotide polymorphism.

⁎ Corresponding author at: Asan Institute for Life Sciences, University of Ulsan College of Medicine, 88 Olympic-Ro 43-Gil, Songpa-Gu, Seoul 138-736, Republic of Korea. E-mail address: [email protected] (J.-K. Lee).

http://dx.doi.org/10.1016/j.gene.2014.09.001 0378-1119/© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Allelic variation in gene expression is common in humans (Lo et al., 2003; Pant et al., 2006) and leads to phenotypic variation, including disease (Maia et al., 2009). Although the majority of genes are expressed equally from both alleles in a diploid genome, some genes are differentially expressed. Generally, four types of allelic gene expression can be observed: equal allelic gene expression, monoallelic gene expression, allelic imbalance, and allele substitution by RNA editing. Monoallelic gene expression has been well studied in mammalian X-chromosome inactivation, genomic imprinting, and allelic exclusion (Allen et al., 2003; Yang and Kuroda, 2007; Zakharova et al., 2009). Allelic imbalance is described as one allele having higher expression than the alternative allele (Mei et al., 2000). Allele substitution by RNA editing has also been frequently observed in many cell types. Allele substitution is mainly due to the conversion of adenosine to inosine (Bahn et al., 2012; Wulff et al., 2011), and this A-to-I RNA editing event is a post-transcriptional process (Bahn et al., 2012).

Gene expression is controlled by two types of regulatory mechanism, cis and trans (Stamatoyannopoulos, 2004). Many human genes are regulated by a cis-acting mechanism, in which the regulatory element lies on the same chromosome as the regulated gene (Bray et al., 2003; Buckland, 2004; Lee et al., 2006), whereas trans-acting regulatory mechanisms are relatively rare (Stranger et al., 2005; Zeller et al., 2010). Thus, it is generally believed that cis-regulatory polymorphisms are the primary source of phenotypic differences and are associated with many human diseases.

To date, although many studies of allelic gene expression have been performed, including our previous studies (Lee et al., 2013; Song et al., 2012), relatively few cell types and culture conditions have been used. In particular, limited information is available on allelic gene expression after cell stimulation. It is well known that lipopolysaccharide (LPS) stimulation induces a strong inflammatory response in various cell types, including liver cells (Liu et al., 2002; Sweet and Hume, 1996). Thus, in this study, we performed a high-density human exome SNP chip experiment to identify genome-wide allele-specific gene expression in hepatocellular carcinoma cells with and without LPS stimulation.


2. Materials and methods

2.1. Hepatocellular carcinoma cell lines and cell culture

A total of 11 hepatocellular carcinoma cell lines (SNU-182, SNU-354, SNU-368, SNU-387, SNU-398, SNU-449, SNU-475, SNU-739, SNU-761, SNU-878, and SNU-886) originating from Korean liver cancer patients (Lee et al., 1999; Park et al., 1995) were obtained from the Korean Cell Line Bank (http://cellbank.snu.ac.kr). In addition, genomic DNA from 200 healthy cohort samples was used as a control, which is useful for genotype clustering of rare variants in cancer cells. Hepatocellular carcinoma cell lines were maintained in RPMI-1640 medium with 2 mM L-glutamine and 25 mM HEPES, supplemented with 10% fetal bovine serum and penicillin (1000 U/ml)/streptomycin (100 μg/ml).

2.2. Genomic DNA and RNA isolation and cDNA synthesis

Genomic DNA was extracted from hepatocellular carcinoma cell lines using a Qiagen Blood and Cell Culture Mini kit (Qiagen, Hilden, Germany). Total RNA was isolated from hepatocellular carcinoma cells using TRIZOL reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's protocols. cDNA was synthesized from 10 μg of total RNA using the SuperScript III First-Strand cDNA Synthesis System (Invitrogen). cDNA was precipitated with 95% ethanol with 20 μg/μl glycogen and 3 M sodium acetate for use as a template for Illumina Human Exome BeadChip genotyping.

2.3. SNP genotyping

For SNP genotyping, 200 ng of genomic DNA and cDNA derived from 10 μg of total RNA were used. SNP genotyping was performed using Illumina Human Exome BeadChips (Illumina, San Diego, CA), according to the manufacturer's instructions. The Illumina Human Exome BeadChip contains more than 240,000 SNPs, including 219,621 nonsynonymous SNPs. Genotype clustering and calling were performed using GenomeStudio software (Illumina).

2.4. Analysis of allelic gene expression patterns using SNP genotyping data

Allelic gene expression was determined by calculating the fluorescence signal ratio between two alleles (allele 1/allele 2) in genomic DNA and cDNA for heterozygous SNPs, as described in our previous studies (Lee et al., 2013; Song et al., 2012). SNPs where the ratio of two alleles in cDNA was ≤0.1 or ≥0.9 were defined as showing monoallelic expression, whereas SNPs with a ratio of two alleles in cDNA of ≤0.3 or ≥0.7 were defined as showing allelic imbalance. Allele substitutions were identified when an allele of cDNA was different from the allele of genomic DNA from the same cell line sample. Genotype clusters of genes showing differential allelic expression were inspected manually in GenomeStudio software, and genes showing differential allelic expression were subjected to an eQTL database search (http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi). eQTL data were used to identify genomic loci that regulate gene expression. To predict the potential role of allele substitution in nonsynonymous SNPs, we performed in silico prediction of the functionality of the genetic variants using the PolyPhen program (http://genetics.bwh.harvard.edu/pph2/). All monoallelic gene expression and allele substitution sites were confirmed by capillary sequencing in genomic DNA and cDNA samples.
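The cut-offs in Section 2.4 can be illustrated with a minimal sketch. It assumes that the allelic ratio is expressed as the fraction allele 1 / (allele 1 + allele 2), which is one reading of why the thresholds lie between 0 and 1; the function names and the example intensities are hypothetical and do not come from GenomeStudio.

```python
def allele_fraction(signal_a1: float, signal_a2: float) -> float:
    """Fraction of the total fluorescence signal contributed by allele 1.

    Assumption: the ratio is normalised to the summed signal of both alleles.
    """
    total = signal_a1 + signal_a2
    if total <= 0:
        raise ValueError("total signal must be positive")
    return signal_a1 / total


def classify_allelic_expression(cdna_fraction: float) -> str:
    """Apply the cDNA cut-offs of Section 2.4 to one heterozygous SNP."""
    # The monoallelic range is tested first because it lies inside
    # the wider allelic-imbalance range.
    if cdna_fraction <= 0.1 or cdna_fraction >= 0.9:
        return "monoallelic"
    if cdna_fraction <= 0.3 or cdna_fraction >= 0.7:
        return "allelic imbalance"
    return "equal"


# Hypothetical cDNA signal intensities for three heterozygous SNPs.
for a1, a2 in [(950.0, 50.0), (800.0, 200.0), (500.0, 500.0)]:
    print(a1, a2, classify_allelic_expression(allele_fraction(a1, a2)))
```

Testing the narrower monoallelic range before the imbalance range keeps the two categories mutually exclusive, which matches the separate gene counts reported for each category.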


2.5. Analysis of allelic gene expression in hepatocellular carcinoma cells after LPS treatment

To identify quantitative and qualitative changes in gene expression after LPS stimulation, we treated hepatocellular carcinoma cells with LPS (100 ng/ml) for 12 h in vitro. We detected differential gene expression, either upregulation (>1.5-fold) or downregulation (<1.5-fold), by comparing the signal intensities of the probe before and after LPS treatment. To identify functional categories that were overrepresented within the genes differentially expressed after LPS treatment, the DAVID Bioinformatics Resources 6.7 program (david.abcc.ncifcrf.gov/) was used to classify these genes according to the published protocol for DAVID (Huang et al., 2009).

3. Results

3.1. Profiling of allelic gene expression in hepatocellular carcinoma cells

To identify changes in allelic gene expression caused by LPS treatment in liver cells, a total of 11 hepatocellular carcinoma cell lines were treated with LPS (100 ng/ml) for 12 h, and genome-wide allelic gene expression was observed using Illumina Human Exome BeadChips (250K). The genotype call rates (mean ± standard deviation) from the Human Exome BeadChips for genomic DNA, cDNA, and LPS-treated cDNA were 99.72 ± 0.14%, 73.97 ± 2.94%, and 72.99 ± 3.17%, respectively (Fig. 1). The high genotype call rates (~73%) in cDNA samples indicated that most genes were expressed in hepatocellular carcinoma cells. We initially studied allelic gene expression patterns in hepatocellular carcinoma cells without LPS treatment. Genotype plots from the 250K Human Exome BeadChips showed four distinct patterns of allelic gene expression: equal allelic expression, monoallelic expression, allelic imbalance, and allele substitution (Fig. 2). With the use of the signal ratio and genotype plots, we identified 17 genes showing monoallelic expression, 58 genes showing allelic imbalance, and 7 genes (HNRNPCL1, FAM58BP, PRPS1L1, TAF1L, CSNK1A1L, CDC27, and SIRPG) with allele substitution in the cDNA (Tables 1, 2, and 3). All variant sites for monoallelic gene expression and allele substitution were confirmed by capillary sequencing. The 17 validated monoallelically expressed genes (RHD, OBSCN, LOC100131465, TTN, FAM55C, GATA2, TRIM31, ROS1, PEG10, MGAM, ANKRD18A, PTCHD3, OAS1, MAP2K3, ZNF285, ZNF813, and ZNF71) are presented in Table 1 and Fig. 2. In addition, one monoallelic expression SNP (rs1918172) located in an unknown gene region was also identified (Table 1).

3.2. Changes in allelic gene expression in hepatocellular carcinoma cell lines following LPS treatment

To identify the changes in allelic gene expression in hepatocellular carcinoma cell lines after LPS treatment (100 ng/ml) for 12 h, we

Fig. 1. Genotype call rates of the human exome (250K) SNP chip in DNA samples prepared from hepatocellular carcinoma cell lines (n = 11). Genomic DNA (gDNA), cDNA, and cDNA prepared from LPS-treated mRNA (cDNA-LPS) were prepared from each cell line, as described in the Materials and Methods. Genomic DNA samples isolated from healthy individuals (n = 200) were also used as a cancer cell line control in the process of genotype call analysis.


Fig. 2. Detection of allelic gene expression in hepatocellular carcinoma cell lines (n = 11) using an exome SNP chip. Examples are shown of gene expression patterns, including equal allelic expression (A), monoallelic expression (B), allelic imbalance (C), and allele substitution (D). Each dot indicates the genotype of one sample. A blue dot represents genomic DNA (gDNA; n = 11) and a red dot represents cDNA (n = 11). The x-axis and y-axis represent the fluorescence signal intensities of SNP allele 1 (shown as A) and SNP allele 2 (shown as B), respectively. A total of 18 monoallelically expressed loci (17 genes) were confirmed by sequencing in both gDNA and cDNA (E).

Table 1 Monoallelically expressed genes in hepatocellular carcinoma cell lines (n = 11).

| Chr. | Gene | rs # | Amino acid change | No. of samples heterozygous in gDNA (A) | No. of samples with monoallelic expression in cDNA (B) | % monoallelic (B/A × 100%) | No. of allele(s) involved in monoallelic expression | Association (GENE at NCBI) |
|---|---|---|---|---|---|---|---|---|
| 1 | RHD | rs590787 | Y311S | 2 | 2 | 100 | 1 | RhD blood group |
| 1 | OBSCN | rs12035900 | – | 3 | 2 | 67 | 2 | Macular degeneration |
| 1 | LOC100131465 | rs3738443 | – | 2 | 2 | 100 | 1 | Alcoholism |
| 2 | – | rs1918172 | – | 2 | 2 | 100 | 1 | Attention deficit hyperactivity disorder |
| 2 | TTN | rs72648270 | I30154F | 2 | 2 | 100 | 1 | Cardiomyopathy |
| 3 | FAM55C | rs3796277 | T507I | 3 | 3 | 100 | 2 | – |
| 3 | GATA2 | rs78245253 | P250A | 2 | 2 | 100 | 1 | Leukemia, immune deficiency, myelodysplasia |
| 6 | TRIM31 | rs2023472 | – | 2 | 2 | 100 | 2 | – |
| 6 | ROS1 | rs9489124 | E1902K | 2 | 2 | 100 | 1 | Heart rate, lung cancer |
| 7 | PEG10 | rs3750105 | S605Y | 5 | 5 | 100 | 2 | – |
| 7 | MGAM | rs2272330 | Q404H | 3 | 3 | 100 | 2 | Alcohol drinking |
| 9 | ANKRD18A | rs1832313 | E130K | 4 | 3 | 75 | 2 | – |
| 10 | PTCHD3 | rs77473776 | Q186K | 3 | 3 | 100 | 1 | Body composition |
| 12 | OAS1 | rs10774671 | – | 6 | 5 | 83 | 1 | Glucose and adiponectin level |
| 17 | MAP2K3 | rs58609466 | T222M | 3 | 2 | 67 | 1 | – |
| 19 | ZNF285 | rs12610859 | R536G | 4 | 3 | 75 | 2 | Myopia |
| 19 | ZNF813 | rs10422163 | Y439F | 3 | 3 | 100 | 2 | Coronary artery disease |
| 19 | ZNF71 | rs2072501 | V105I | 5 | 3 | 60 | 1 | – |
| Total | 17 genes | | | | | | | |

Monoallelic expression was determined from the ratio of the two alleles in cDNA (≤0.1 or ≥0.9). Genotype plots were also inspected manually.


analyzed the differentially expressed genes, either upregulated or downregulated, from high-density human exome SNP chip data. We detected a total of 33 genes that were differentially expressed after LPS treatment, including 31 upregulated genes and 2 downregulated genes (>1.5-fold change) (Supplementary Table 1). According to DAVID program analysis, as expected, the differentially expressed genes were significantly enriched in the biological pathways of inflammatory response (10 genes; p = 1.57 × 10⁻⁸) and defense response (12 genes; p = 2.57 × 10⁻⁸) (Supplementary Table 2). To investigate the changes in allelic gene expression following LPS treatment in hepatocellular carcinoma cells, we examined all genes with a significant change in relative allele frequency in cDNA samples after LPS treatment. We identified significant changes in relative allele frequency in cDNA after LPS treatment in only three genes (MLXIPL, TNC, and MX2) (Fig. 3C and D). However, the allelic change

Table 2 Genes with allelically imbalanced expression in hepatocellular carcinoma cell lines (n = 11).

| Chr. | Gene | rs # | Amino acid change | No. of samples heterozygous in gDNA (A) | No. of samples with allelic imbalance in cDNA (B) | % allelic imbalance (B/A × 100%) | No. of allele(s) involved in allelic imbalance | Association (GENE at NCBI) |
|---|---|---|---|---|---|---|---|---|
| 1 | OSCP1 | rs61308377 | Y209H | 3 | 3 | 100 | 1 | Neuropsychological tests, diabetic retinopathy |
| 1 | CRYZ | rs3819946 | I39V | 3 | 2 | 67 | 1 | – |
| 1 | CNN3 | rs3789699 | – | 2 | 2 | 100 | 1 | Stearic acid plasma level |
| 1 | RGS5 | rs3806366 | – | 4 | 3 | 75 | 1 | Bipolar disorder, tunica media, hypertension |
| 1 | RNF2 | rs1046592 | – | 3 | 3 | 100 | 1 | – |
| 2 | TTC27 | rs2273664 | R525H | 2 | 2 | 100 | 1 | Hippocampal atrophy |
| 2 | LTBP1 | rs2290427 | P1009Q | 2 | 2 | 100 | 1 | Body height, lipoproteins, VLDL |
| 2 | ASB3 | rs17521008 | – | 5 | 3 | 60 | 1 | Childhood obesity |
| 2 | C2orf74 | rs1729674 | Y37D | 3 | 3 | 100 | 1 | Crohn's disease |
| 2 | ANKRD53 | rs3796098 | L326L | 3 | 2 | 67 | 1 | – |
| 2 | TGOLN2 | rs4240199 | L441P | 2 | 2 | 100 | 1 | – |
| 2 | NCAPH | rs2305935 | V539A | 2 | 2 | 100 | 1 | – |
| 2 | LRP1B | rs35546150 | Q3734K | 2 | 2 | 100 | 1 | Body mass index, aging |
| 2 | HIBCH | rs291466 | M1T | 2 | 2 | 100 | 1 | 3-Hydroxyisobutyryl-CoA hydrolase deficiency |
| 3 | TOP2B | rs61751634 | T1486M | 2 | 2 | 100 | 1 | Electrocardiography |
| 3 | SLC4A7 | rs75615379 | N756S | 2 | 2 | 100 | 1 | Breast neoplasm |
| 3 | CCDC13 | rs17238798 | R25W | 3 | 3 | 100 | 1 | – |
| 3 | CCDC66 | rs139806311 | Q434R | 2 | 2 | 100 | 1 | Hip |
| 3 | COMMD2 | rs11549572 | – | 3 | 3 | 100 | 1 | – |
| 4 | UGT2B7 | rs12233719 | A71S | 2 | 2 | 100 | 1 | – |
| 5 | MTRR | rs162036 | K350R | 3 | 2 | 67 | 1 | – |
| 5 | ITGA1 | rs12520591 | I961M | 2 | 2 | 100 | 1 | γ-Glutamyltransferase, respiratory function |
| 5 | TMEM173 | rs7380824 | R293Q | 5 | 3 | 60 | 1 | Cytokine response |
| 5 | SLC25A2 | rs10075302 | G159C | 2 | 2 | 100 | 1 | – |
| 5 | PCDHGA1 | rs3749767 | – | 2 | 2 | 100 | 1 | Asthma |
| 5 | TMED9 | rs68036126 | – | 3 | 3 | 100 | 1 | – |
| 6 | ECI2 | rs7166 | A344V | 3 | 2 | 67 | 1 | – |
| 6 | MICD | rs2256902 | – | 2 | 2 | 100 | 1 | – |
| 6 | HLA-E | rs1264457 | G128R | 3 | 2 | 67 | 1 | Psoriasis, Bechet syndrome |
| 6 | SKIV2L | rs419788 | – | 3 | 2 | 67 | 1 | Macular degeneration |
| 6 | TAP2 | rs241448 | X687Q | 3 | 2 | 67 | 1 | Type 1 diabetes mellitus, multiple sclerosis |
| 6 | PEX6 | rs1129187 | P939Q | 2 | 2 | 100 | 1 | Peroxisome biogenesis disorders |
| 6 | MDN1 | rs41273327 | R4266G | 2 | 2 | 100 | 1 | Body weight |
| 7 | SFRP4 | rs1802073 | P320T | 3 | 2 | 67 | 2 | – |
| 8 | ASAH1 | rs1071645 | V72M | 2 | 2 | 100 | 1 | Farber's lipogranulomatosis, obesity |
| 8 | PYCRL | rs2242089 | V117M | 4 | 3 | 75 | 1 | – |
| 9 | CNTLN | rs3808794 | E291D | 2 | 2 | 100 | 1 | Gout |
| 9 | SVEP1 | rs17204533 | T3559M | 3 | 3 | 100 | 2 | Asthma, longevity |
| 9 | WDR5 | rs11556390 | – | 2 | 2 | 100 | 1 | – |
| 10 | ANKRD26 | rs12359281 | I425V | 2 | 2 | 100 | 1 | Thrombocytopenia |
| 10 | ZFAND4 | rs41301625 | – | 2 | 2 | 100 | 1 | – |
| 10 | FRA10AC1 | rs726817 | R16H | 3 | 2 | 67 | 1 | – |
| 10 | GSTO1 | rs4925 | A112D | 3 | 2 | 67 | 1 | – |
| 11 | ZNF215 | rs11041115 | S263F | 3 | 2 | 67 | 2 | – |
| 11 | TPCN2 | rs2376558 | L564P | 3 | 2 | 67 | 1 | Hair color |
| 12 | KIF21A | rs75223821 | E1237D | 3 | 3 | 100 | 1 | Congenital fibrosis of the extraocular muscles |
| 12 | MON2 | rs11174549 | I1385V | 2 | 2 | 100 | 1 | – |
| 12 | ZNF268 | rs36127550 | L679F | 3 | 2 | 67 | 1 | – |
| 13 | MTMR6 | rs7995033 | I319V | 4 | 4 | 100 | 1 | – |
| 13 | UTP14C | rs3742289 | G85V | 2 | 2 | 100 | 1 | – |
| 13 | CUL4A | rs2302757 | K544R | 4 | 3 | 75 | 1 | – |
| 14 | DHRS4L2 | rs2273946 | Q2H | 2 | 2 | 100 | 1 | – |
| 14 | DHRS4L2 | rs2273947 | M19L | 2 | 2 | 100 | 1 | – |
| 14 | NIN | rs10140023 | – | 4 | 3 | 75 | 1 | Seckel syndrome |
| 15 | TTLL13 | rs2063743 | T262I | 5 | 3 | 60 | 1 | – |
| 18 | OSBPL1A | rs9635963 | – | 3 | 2 | 67 | 2 | Sex hormone-binding globulin, osteoporosis |
| 18 | SLC14A1 | rs1058396 | D280N | 2 | 2 | 100 | 1 | Urinary bladder neoplasm |
| 19 | B3GNT3 | rs36686 | R328H | 2 | 2 | 100 | 1 | – |
| 20 | PYGB | rs2228976 | A303S | 3 | 3 | 100 | 1 | – |
| Total | 58 genes | | | | | | | |

One additional allelically imbalanced SNP (rs7550381) was also detected in an unknown gene region. Allelic imbalance gene expression was determined from the ratio of the two alleles in cDNA (≤0.3 or ≥0.7). Genotype plots were also inspected manually. Fifty-four genes showed unidirectional allelic imbalance; four genes showed bidirectional allelic imbalance.
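The distinction between unidirectional and bidirectional allelic imbalance noted under Table 2 can be sketched as follows. This is an illustrative reading only: it assumes each heterozygous sample is summarised by the fraction of cDNA signal from allele 1 and reuses the ≤0.3 / ≥0.7 cut-offs; the function name and example values are hypothetical.

```python
from typing import Iterable


def imbalance_direction(cdna_fractions: Iterable[float]) -> str:
    """Classify a gene from the allele-1 fractions of its heterozygous samples.

    Samples with a fraction >= 0.7 favour allele 1, samples with a
    fraction <= 0.3 favour allele 2 (assumed cut-offs from the table note).
    """
    fractions = list(cdna_fractions)
    favours_a1 = any(f >= 0.7 for f in fractions)
    favours_a2 = any(f <= 0.3 for f in fractions)
    if favours_a1 and favours_a2:
        # Different alleles are over-expressed in different heterozygotes.
        return "bidirectional"
    if favours_a1 or favours_a2:
        # The same allele is over-expressed in every imbalanced heterozygote.
        return "unidirectional"
    return "no imbalance"


print(imbalance_direction([0.80, 0.75]))  # same allele favoured in both samples
print(imbalance_direction([0.80, 0.20]))  # opposite alleles favoured
```

Under this reading, a gene listed with one allele involved in Table 2 corresponds to the unidirectional case and a gene listed with two alleles to the bidirectional case.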


Table 3 Genes with allele substitution in hepatocellular carcinoma cell lines (n = 11).

| Chr. | Gene | rs # | Type of substitution | Functional consequence (amino acid change) | Position in cDNA sequence | Association (GENE at NCBI) |
|---|---|---|---|---|---|---|
| 1 | HNRNPCL1 | rs2359484 | C to G | Coding variant (L148V) | NM_001013631.1: c.442G>C | – |
| 1 | FAM58BP | rs10919847 | C to A | Coding variant (T236N) | NM_001105517.1: c.707C>A | – |
| 7 | PRPS1L1 | rs3800962 | C to A | Coding variant (E279D) | NM_175886.2: c.837G>T | – |
| 9 | TAF1L | rs146082915 | T to C | Coding variant (M615V) | NM_153809.2: c.1843A>G | – |
| 13 | CSNK1A1L | rs186444083 | G to A | Coding variant (R135X) | NM_145203.5: c.403C>T | – |
| 17 | CDC27 | rs75990396 | C to G | Coding variant (S194C) | NM_001256.3: c.581C>G | – |
| 20 | SIRPG | rs62641735 | C to T | Coding variant (S108N) | NM_018556.3: c.323G>A | T1DM |
| Total | 7 genes | 7 SNPs | | | | |

PolyPhen prediction (row alignment not recoverable): Benign, Benign, Benign, (Stop codon), Benign, Benign

Allele substitution was identified by comparing the genotypes of genomic DNA and cDNA from hepatocellular carcinoma cells. Genotype plots were also inspected manually.
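The comparison described in the note to Table 3 can be sketched in a few lines. This is a minimal illustration, assuming genotypes are available as two-letter strings such as "CC" or "CG"; the function name and the example genotypes are hypothetical and are not GenomeStudio output.

```python
def substituted_alleles(gdna_genotype: str, cdna_genotype: str) -> set:
    """Return the cDNA alleles that are absent from the genomic genotype.

    An empty set means the cDNA carries only alleles already present in the
    genomic DNA (including the case where only one of them is expressed).
    """
    return set(cdna_genotype) - set(gdna_genotype)


# A "C to G" substitution: the genome is homozygous C, the cDNA shows G.
print(substituted_alleles("CC", "CG"))
# Monoallelic expression is not a substitution: no new allele appears.
print(substituted_alleles("AG", "AA"))
```

Because chip calls of this kind included false positives in the study, any hit from such a comparison would still need confirmation by capillary sequencing, as the authors did.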

in gene expression after LPS treatment was observed in one sample only. These data indicate that changes in allelic gene expression caused by LPS treatment in liver cells are rare events. Interestingly, LPS treatment of hepatocellular carcinoma cells induced slightly higher gene expression (detected as higher signal intensity), although the difference did not reach statistical significance (paired t-test, p = 0.217). LPS treatment shifted expression from allelic imbalance toward equal allelic expression (Fig. 3C), indicating that the less actively expressed alleles were predominantly upregulated by LPS stimulation.
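The differential-expression call of Section 2.5 can be sketched as a fold-change test on probe signal intensities. The sketch assumes that upregulation means an after/before ratio above 1.5 and that downregulation means a ratio below 1/1.5, which is consistent with the 0.62-fold example in Fig. 3 but is not stated explicitly in the text; the function name and the intensities are hypothetical.

```python
def expression_change(before: float, after: float, fold: float = 1.5) -> str:
    """Classify a probe by the ratio of signal after LPS to signal before LPS."""
    if before <= 0 or after <= 0:
        raise ValueError("signal intensities must be positive")
    ratio = after / before
    if ratio > fold:
        return "upregulated"
    # Assumption: a 1.5-fold decrease corresponds to a ratio below 1/1.5.
    if ratio < 1.0 / fold:
        return "downregulated"
    return "unchanged"


# Hypothetical intensities chosen to reproduce the fold changes in Fig. 3.
print(expression_change(100.0, 385.0))  # 3.85-fold, as reported for SAA2
print(expression_change(100.0, 62.0))   # 0.62-fold, as reported for COL4A4
print(expression_change(100.0, 120.0))  # within the 1.5-fold window
```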

3.3. Potential regulatory mechanism of allelic gene expression in hepatocellular carcinoma cells

Among the 75 genes showing allelic expression in hepatocellular carcinoma cells, either monoallelic or imbalanced, 43 genes (57.33%) had an eQTL in various cells or tissues (Table 4), indicating that human exome SNP chip-based screening is a reliable and efficient method for studying allelic gene expression. Of these 43 genes, 35 showed cis-eQTL regulation (81.40%), 3 showed trans-eQTL regulation (6.98%), and 5 showed both cis- and trans-eQTL regulation in a tissue-specific manner (11.63%) (Table 4). These data indicate that cis-acting regulation of gene expression is the major regulatory mechanism of allelic gene expression in liver cells. In addition, many genes showing allelic gene expression have been associated with diseases (Table 4). For example, the TAP2 gene, which showed allelic imbalance, is strongly associated with type 1 diabetes mellitus (rs1015166; p = 8.680 × 10⁻¹⁰⁷). The TPCN2 gene, which showed allelic imbalance, is also significantly associated with hair color (rs35264875; p = 4.00 × 10⁻³⁰) and prostate cancer (rs7130881; p = 8.00 × 10⁻¹³) (Table 4). Furthermore, the monoallelically expressed OAS1 gene is significantly associated with gamma-glutamyltransferase level (rs11066453; p = 6.00 × 10⁻⁴⁴). This strong association of genes showing allelic expression with certain human diseases suggests that allelic gene expression may play an important role in the development and progression of human diseases.
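The proportions reported in Section 3.3 follow directly from the gene counts, as the short check below shows. The counts are taken from the text; the helper name is hypothetical.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


genes_with_allelic_expression = 17 + 58  # monoallelic + allelic imbalance
genes_with_eqtl = 43
cis_only, trans_only, cis_and_trans = 35, 3, 5

print(percent(genes_with_eqtl, genes_with_allelic_expression))  # share with eQTL data
print(percent(cis_only, genes_with_eqtl))       # cis-eQTL only
print(percent(trans_only, genes_with_eqtl))     # trans-eQTL only
print(percent(cis_and_trans, genes_with_eqtl))  # both cis and trans
```

The three regulatory categories sum to the 43 genes with eQTL data, so the percentages partition that set.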

4. Discussion

Allelic gene expression plays an important role in phenotypic variation, including human disease. Approximately 18% of human genes have shown allele-specific gene expression (Dimas et al., 2008; Djebali et al., 2012). To date, however, most allelic gene expression studies have

Fig. 3. Changes in allelic gene expression following LPS treatment in hepatocellular carcinoma cell lines. Exome SNP chip-based analysis detected a total of 33 differentially expressed genes, including a 3.85-fold upregulation of the SAA2 gene (A) and a 0.62-fold downregulation of the COL4A4 gene (B). A list of all 33 differentially expressed genes is shown in Supplementary Table 1. Three genes (TNC, MX2, and MLXIPL) showed a change in allelic gene expression after LPS treatment in hepatocellular carcinoma cells (C and D).


Table 4 eQTL associations of genes showing differential allelic expression. Chr. Type of gene expressiona

Gene (coding SNP; rs #)

eQTL associated with genes showing allelic expression (data from GTEx eQTL database)

Phenotypic association of genes showing allelic expression (NCBI GENE)

rs #

p-Value

SNP associated with disease (rs #)

1 1

MAE AI

RHD (rs590787) OSCP1 (rs61308377)

rs11802413 Upstream 104 kb rs3738836 Within gene +2 kb

Liver Lymphoblastoid

2.22 × 10−21 2.14 × 10−17

– rs10493074

1 1 1

AI AI AI

CRYZ (rs3819946) CNN3 (rs3789699) RGS5 (rs3806366)

rs10890142 Within gene +8 kb rs6541374 Downstream 1 kb rs10812599 Chr.9: 27,453,326

Lymphoblastoid Liver Brain cerebellum Brain temporal cortex Liver Lymphoblastoid Liver

4.33 × 10−15 1.32 × 10−19 6.95 × 10−9

rs10493074 – – rs12566267

1.74 × 10−8

rs937925

2.76 × 10−13 6.24 × 10−5 6.26 × 10−8



Distance from allelically Tissue type expressed gene

rs2886666

Chr.3: 159,695,932 Upstream 65 kb Upstream 568 kb Within gene +78 kb

1 2 2

AI AI AI

RNF2 (rs1046592) TTC27 (rs2273664) LTBP1 (rs2290427)

rs6674490 rs642472 rs4952336

2

AI

ASB3 (rs17521008)

rs10190578 Within gene +37 kb

Liver

8.46 × 10−7

2 2 2

AI AI AI

C2orf74 (rs1729674) TGOLN2 (rs4240199) LRP1B (rs35546150)

2 3 3

AI MAE AI

HIBCH (rs291466) FAM55C (rs3796277) TOP2B (rs61751634)

3 5

AI AI

SLC4A7 (rs75615379) MTRR (rs162036)

rs1729660 rs1053561 rs7565586 rs6879074 rs291472 rs10901194 rs6799331 rs1828591 rs9310843 rs161869

Downstream 6 kb Within gene +910 bp Upstream 26 kb Chr.5: 17,338,371 Upstream 6 kb Chr.9: 135,484,384 Within gene +64 kb Chr.4: 145,480,779 Downstream 43 kb Within gene +9 kb

Lymphoblastoid Liver Liver Liver Liver Liver Lymphoblastoid Liver Liver Lymphoblastoid

3.52 9.14 8.28 2.70 3.30 2.64 9.47 8.89 2.34 1.55

5

AI

ITGA1 (rs12520591)

rs4074793

Within gene +109 kb

Liver

3.20 × 10−9

5 5

AI AI

PCDHGA1 (rs3749767) TMED9 (rs68036126)

6 6

AI AI

ECI2 (rs7166) HLA-E (rs1264457)

rs6863411 rs2713589 rs11746443 rs9405254 rs3095341

Upstream 621 kb Chr.3: 128,290,207 Downstream 221 kb Upstream 929 kb Upstream 265 kb

Liver Liver Lymphoblastoid Lymphoblastoid Liver

3.47 1.76 6.25 1.74 2.21

6

AI

TAP2 (rs241448)

rs2071473

Downstream 7 kb

9.68 × 10−13

6 6 8 9

AI AI AI AI

PEX6 (rs1129187) MDN1 (rs41273327) ASAH1 (rs1071645) CNTLN (rs3808794)

rs2296804 rs368873 rs7828904 rs10963072

Downstream 351 bp Downstream 460 kb Upstream 231 kb Within gene +244 kb

Brain temporal cortex Brain pons Lymphoblastoid Lymphoblastoid Liver

9

AI

SVEP1 (rs17204533)

rs7149441

Chr.14: 75,818,546

Liver

3.12 × 10−6

10 10 10 11 11

AI AI AI AI AI

ANKRD26 (rs12359281) ZFAND4 (rs41301625) GSTO1 (rs4925) ZNF215 (rs11041115) TPCN2 (rs2376558)

rs3781099 rs4418752 rs703333 rs10839663 rs11228490

Within gene +26 kb Downstream 8 kb Downstream 104 kb Within gene +15 kb Upstream 3 kb

Liver Liver Liver Liver Liver

1.43 1.11 2.49 6.55 7.63

12

MAE

OAS1 (rs10774671)

rs4767030

Upstream 2 kb

Liver

7.91 × 10−40

12 12

AI AI

MON2 (rs11174549) ZNF268 (rs36127550)

13 13 14

AI AI AI

UTP14C (rs3742289) CUL4A (rs2302757) NIN (rs10140023)

18

AI

OSBPL1A (rs9635963)

rs11174604 rs991811 rs10870511 rs9596457 rs7985666 rs10942662 rs6572681 rs1860040 rs17203115

Upstream 92 kb Chr.4: 100,510,858 Downstream 396 kb Downstream 906 kb Downstream 402 kb Chr.5: 91,473,725 Downstream 69 kb Upstream 369 kb Within gene +41 kb

Lymphoblastoid Liver Lymphoblastoid Lymphoblastoid Lymphoblastoid Brain pons Liver Lymphoblastoid Liver

9.01 1.47 2.09 1.73 3.21 1.23 4.59 1.20 6.64

19 19 20

MAE MAE AI

ZNF285 (rs12610859) ZNF813 (rs10422163) PYGB (rs2228976)

rs344781 rs6509765 rs2257991

Downstream 715 kb Downstream 28 kb Within gene +42 kb

Liver Liver Lymphoblastoid

1.16 × 10−3 3.96 × 10−13 3.44 × 10−10

× × × × × × × × × ×

10−44 10−18 10−12 10−6 10−23 10−6 10−16 10−7 10−8 10−23

Association

p-Value (<1 × 10−5)

Neuropsychological tests Blood vessels

1.68 × 10−6

Bipolar disorder

8.82 × 10−6

Tunica media

9.41 × 10−6

rs6714546 rs2290447 rs10496009 rs10496009 – – rs657322 rs12474609 rs291465 – rs10510569

Body height Lipoproteins, VLDL Cholesterol, HDL Body height

2.00 8.89 4.84 4.29

Body mass index Aging Body mass index

8.61 × 10−10 6.00 × 10−9 9.56 × 10−8

Electrocardiography

5.21 × 10−7

rs4973768 rs10512966 rs10512966 rs4074793

Breast neoplasms Body height Cholesterol, LDL γGlutamyltransferase Respiratory function

2.00 1.72 5.26 3.00

Uric acid Psoriasis Bechet syndrome Diabetes mellitus, type 1

1.00 7.49 4.51 8.68

Body weight

6.53 × 10−6

Gout Asthma Asthma Longevity

4.56 9.00 7.00 1.00

Hair color Prostatic neoplasm γglutamyltransferase

4.00 × 10−30 8.00 × 10−13 6.00 × 10−44

Erythrocytes

9.19 × 10−6

Sex hormonebinding globulin Osteoporosis

2.00 × 10−7

rs1551943

5.72 8.60 1.40 2.01

× × × × ×

× × × ×

× × × × ×

× × × × × × × × ×

10−4 10−6 10−5 10−5 10−3

10−31 10−6 10−5 10−6

10−8 10−10 10−3 10−23 10−49

10−5 10−7 10−4 10−4 10−6 10−9 10−6 10−4 10−6

1.97 × 10−6

× × × ×

× × × ×

10−9 10−7 10−10 10−9

10−8 10−8 10−8 10−10

2.00 × 10−6

– rs6942328 rs10484552 rs1264459 rs1015166 – rs764108 – rs10511634 rs2383024 rs1889321 rs1327533 – – – rs35264875 rs7130881 rs11066453 – rs7953302

× × × ×

× × × ×

10−6 10−19 10−6 10−107

10−11 10−6 10−7 10−6

– –

rs9635963 rs7227401 –

4.00 × 10−7

The eQTL (expression quantitative trait loci) database (http://www.ncbi.nlm.nih.gov/gtex/GTEX2/gtex.cgi) was searched to find highly associated regulatory SNPs of genes showing differential allelic expression. SNPs with the highest p-value were selected for each corresponding gene. A total of 35 cis-eQTL-, 3 trans-eQTL-, and 5 both cis- and trans-eQTL-regulated genes are shown. a AI = allelic imbalance; MAE = monoallelic expression.


focused on unstimulated lymphoblastoid cell lines (Cheung et al., 2003; Gimelbrant et al., 2007; Pant et al., 2006). In this study, we used a highdensity human exome SNP chip containing more than 240,000 SNPs to analyze allelic gene expression in liver cancer cells with and without LPS stimulation. We detected a total of 18 monoallelically expressed loci containing 17 genes (Fig. 2 and Table 1). In our previous studies on allelic gene expression profiling using an Illumina NS12K SNP chip containing more than 11,000 SNPs, we found five monoallelically expressed genes in B cells and two monoallelically expressed genes in colon cancer cells (Lee et al., 2013; Song et al., 2012). These data indicate that highdensity exome SNP chips (N 20-fold higher SNP density) are useful for identifying a large number of genes showing allelic expression. Allelic gene expression assays can also be used to identify potential regulatory mechanisms of gene expression, either cis-acting or transacting (Lee et al., 2006; Sun et al., 2010). In addition, allelic imbalance can be expressed as two directional patterns. Unidirectional allelic imbalance is differential expression with the same allele being expressed at higher levels in each heterozygote, whereas bidirectional allelic imbalance is differential expression with the different allele being expressed at higher levels in different heterozygotes. In our study, of the 58 allelically imbalanced genes in hepatocellular carcinoma cells, 54 genes showed unidirectional allelic imbalance, and only 4 genes showed bidirectional allelic imbalance (Table 2). The high frequency of unidirectional allelic gene expression suggests that most allelic gene expressions are regulated by a cis-acting regulation mechanism. 
Furthermore, among 75 genes showing allelic expression in hepatocellular carcinoma cells, either monoallelic or imbalanced, a total of 43 genes (57.33%) were identified as having eQTL data, indicating that human exome SNP chip-based allelic gene expression profiling is a reliable and effective method for studying allelic gene expression regulation. Allele substitution occurs due to RNA editing. Adenosine-to-inosine editing is modified by adenosine deaminase, RNA-specific (ADAR) enzymes (Amariglio and Rechavi, 2007). ADAR enzymes are expressed in various tissues and are active in mammals (Ohman, 2007). In this study, we initially identified 141 loci showing allele substitutions, with a particularly high occurrence at splice sites (72.7%). However, all allele substitutions occurring at splice sites were false-positive cases when we validated the results using capillary sequencing (data not shown). Our previous studies using NS12K SNP chips did not detect allele substitutions at splice sites because NS12K SNP chips do not contain SNPs at splice sites. However, Illumina Human Exome BeadChips contain 10,675 SNPs selected from splice sites. These data indicate that the exon–intron boundary region of splice sites in cDNA templates cannot be genotyped accurately in cDNA samples. We also validated allele substitution present in gene coding regions by capillary sequencing. Out of 33 loci showing allele substitutions at coding regions, only 7 genes (21.2%) showed true allele substitution. Our results indicate that all allele substitutions detected by SNP chip should be validated by capillary sequencing. In our previous studies, we identified several genes showing allelic expression in CEPH family B cells (Song et al., 2012) and colorectal cancer cells (Lee et al., 2013). 
To identify cell-type-specific allelic gene expression, we compared allelic gene expression in hepatocellular carcinoma cells with our data from CEPH family B cells and colorectal cancer cells. All genes showing differential allelic expression in hepatocellular carcinoma cells differed from those seen in B cells and colon cancer cells, except for the UTP14C gene, which was allelically imbalanced in all three cell types. These data suggest that the three cell types have distinct sets of genes showing allelic expression. In the case of allele substitution, one gene (LOC344382) also showed allele substitution in B cells.

In conclusion, the high-density human exome SNP chip is very useful for the screening and large-scale identification of allelic gene expression in hepatocellular carcinoma cells. The genes showing differential allelic gene expression detected with or without LPS stimulation in this study will provide new information on the regulation of allelic gene expression. In addition, the integrated analysis of allelic gene expression and disease association may aid understanding of gene expression-mediated causal mechanisms of human disease.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation (NRF) of Korea, funded by the Ministry of Education, Science and Technology (2012R1A1A2006638).

Appendix A. Supplementary material

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2014.09.001.

References

Allen, E., et al., 2003. High concentrations of long interspersed nuclear element sequence distinguish monoallelically expressed genes. Proc. Natl. Acad. Sci. U. S. A. 100, 9940–9945.
Amariglio, N., Rechavi, G., 2007. A-to-I RNA editing: a new regulatory mechanism of global gene expression. Blood Cells Mol. Dis. 39, 151–155.
Bahn, J.H., Lee, J.H., Li, G., Greer, C., Peng, G., Xiao, X., 2012. Accurate identification of A-to-I RNA editing in human by transcriptome sequencing. Genome Res. 22, 142–150.
Bray, N.J., Buckland, P.R., Owen, M.J., O'Donovan, M.C., 2003. Cis-acting variation in the expression of a high proportion of genes in human brain. Hum. Genet. 113, 149–153.
Buckland, P.R., 2004. Allele-specific gene expression differences in humans. Hum. Mol. Genet. 13 (Spec No 2), R225–R260.
Cheung, V.G., et al., 2003. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat. Genet. 33, 422–425.
Dimas, A.S., et al., 2008. Modifier effects between regulatory and protein-coding variation. PLoS Genet. 4 (10), e1000244.
Djebali, S., et al., 2012. Landscape of transcription in human cells. Nature 489, 101–108.
Gimelbrant, A., Hutchinson, J.N., Thompson, B.R., Chess, A., 2007. Widespread monoallelic expression on human autosomes. Science 318, 1136–1140.
Huang, D.W., Sherman, B.T., Lempicki, R.A., 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57.
Lee, J.H., Ku, J.L., Park, Y.J., Lee, K.U., Kim, W.H., Park, J.G., 1999. Establishment and characterization of four human hepatocellular carcinoma cell lines containing hepatitis B virus DNA. World J. Gastroenterol. 5, 289–295.
Lee, P.D., et al., 2006. Mapping cis-acting regulatory variation in recombinant congenic strains. Physiol. Genomics 25, 294–302.
Lee, R.D., Song, M.Y., Lee, J.K., 2013. Large-scale profiling and identification of potential regulatory mechanisms for allelic gene expression in colorectal cancer cells. Gene 512, 16–22.
Liu, S., et al., 2002. Role of toll-like receptors in changes in gene expression and NF-κB activation in mouse hepatocytes stimulated with lipopolysaccharide. Infect. Immun. 70, 3433–3442.
Lo, H.S., et al., 2003. Allelic variation in gene expression is common in the human genome. Genome Res. 13, 1855–1862.
Maia, A.T., et al., 2009. Extent of differential allelic expression of candidate breast cancer genes is similar in blood and breast. Breast Cancer Res. 11, R88.
Mei, R., et al., 2000. Genome-wide detection of allelic imbalance using human SNPs and high-density DNA arrays. Genome Res. 10, 1126–1137.
Ohman, M., 2007. A-to-I editing challenger or ally to the microRNA process. Biochimie 89, 1171–1176.
Pant, P.V., Tao, H., Beilharz, E.J., Ballinger, D.G., Cox, D.R., Frazer, K.A., 2006. Analysis of allelic differential expression in human white blood cells. Genome Res. 16, 331–339.
Park, J.G., Lee, J.H., Kang, M.S., Park, K.J., Jeon, Y.M., Lee, H.J., Kwon, H.S., Yeo, K.S., Lee, K.U., 1995. Characterization of cell lines established from human hepatocellular carcinoma. Int. J. Cancer 62, 276–282.
Song, M.Y., Kim, H.E., Kim, S., Choi, I.H., Lee, J.K., 2012. SNP-based large-scale identification of allele-specific gene expression in human B cells. Gene 493, 211–218.
Stamatoyannopoulos, J.A., 2004. The genomics of gene expression. Genomics 84, 449–457.
Stranger, B.E., et al., 2005. Genome-wide associations of gene expression variation in humans. PLoS Genet. 1 (6), e78.
Sun, C., Southard, C., Witonsky, D.B., Olopade, O.I., Di Rienzo, A., 2010. Allelic imbalance (AI) identifies novel tissue-specific cis-regulatory variation for human UGT2B15. Hum. Mutat. 31, 99–107.
Sweet, M.J., Hume, D.A., 1996. Endotoxin signal transduction in macrophages. J. Leukoc. Biol. 60, 8–26.
Wulff, B.E., Sakurai, M., Nishikura, K., 2011. Elucidating the inosinome: global approaches to adenosine-to-inosine RNA editing. Nat. Rev. Genet. 12, 81–85.
Yang, P.K., Kuroda, M.I., 2007. Noncoding RNAs and intranuclear positioning in monoallelic gene expression. Cell 128, 777–786.
Zakharova, I.S., Shevchenko, A.I., Zakian, S.M., 2009. Monoallelic gene expression in mammals. Chromosoma 118, 279–290.
Zeller, T., et al., 2010. Genetics and beyond—the transcriptome of human monocytes and disease susceptibility. PLoS One 5 (5), e10693.
