J. Anim. Breed. Genet. ISSN 0931-2668

ORIGINAL ARTICLE

Genome-wide association study for feedlot average daily gain in Nellore cattle (Bos indicus) M.H.A. Santana1, Y.T. Utsunomiya2, H.H.R. Neves2, R.C. Gomes3, J.F. Garcia2,4, H. Fukumasu1, S.L. Silva1, P.R. Leme1, L.L. Coutinho5, J.P. Eler1 & J.B.S. Ferraz1 1 2 3 4 5

Faculdade de Zootecnia e Engenharia de Alimentos, USP – Univ. de S~ao Paulo, Pirassununga, Brazil. ^ncias Agrarias e Veterinarias, UNESP - Univ. Estadual Paulista, Jaboticabal, Brazil Faculdade de Cie Empresa Brasileira de Pesquisa Agropecuaria, CNPGC/EMBRAPA , Campo Grande, Brazil Faculdade de Medicina Veterinaria de Aracßatuba, UNESP – Univ. Estadual Paulista, Aracßatuba, Brazil Escola Superior de Agricultura Luiz de Queiroz, USP – Univ. de S~ao Paulo, Piracicaba, Brazil

Keywords Bos indicus; genomic; growth; genome-wide association study; weight gain; zebu. Correspondence M.H.A. Santana, Faculdade de Zootecnia e Engenharia de Alimentos – USP, Departamento de Medicina Veterinaria, GMAB, Av. Duque de Caxias Norte, 225, Pirassununga, SP, 13635-900, Brazil. Tel: 55 19 3565 4105; Fax: 55 19 3565 4107; E-mail: [email protected] Received: 13 November 2013; accepted: 23 January 2014

Summary The genome-wide association study (GWAS) results are presented for average daily gain (ADG) in Nellore cattle. Phenotype of 720 male Bos indicus animals with information of ADG in feedlots and 354 147 singlenucleotide polymorphisms (SNPs) obtained from a database added by information from Illumina Bovine HD (777 962 SNPs) and Illumina BovineSNP50 (54 609) by imputation were used. After quality control and imputation, 290 620 SNPs remained in the association analysis, using R package Genome-wide Rapid Association using Mixed Model and Regression method GRAMMAR-Gamma. A genomic region with six significant SNPs, at Bonferroni-corrected significance, was found on chromosome 3. The most significant SNP (rs42518459, BTA3: 85849977, p = 9.49 9 108) explained 5.62% of the phenotypic variance and had the allele substitution effect of 0.269 kg/day. Important genes such as PDE4B, LEPR, CYP2J2 and FGGY are located near this region, which is overlapped by 12 quantitative trait locus (QTLs) described for several production traits. Other regions with markers with suggestive effects were identified in BTA6 and BTA10. This study showed regions with major effects on ADG in Bos indicus in feedlots. This information may be useful to increase the efficiency of selecting this trait and to understand the physiological processes involved in its regulation.

Introduction Animal growth rate has been a subject of study for many years in beef production systems and, possibly, it is one of the most valued traits in beef cattle. The average daily weight gain (ADG) plays an essential role as a selecting criterion in beef cattle breeding programs. The ADG tends to be higher in animals raised in feedlot systems, primarily due to the higher-concentrated food diet (Owens et al. 1995). Raising and terminating steers in feedlot systems seem a promising solution to the constraints of doi:10.1111/jbg.12084

agricultural practices, because feedlots do not require large areas. There are also concerns with environmental issues (methane emission, among others). Therefore, raising cattle in feedlots is a trend in several beef production systems. The assessment of the predictive value of chromosomal segments on weight gain traits has been the subject of recent studies (Casas et al. 2003; GutierrezGil et al. 2009). The use of genomic data to characterize loci that explain substantial proportions of phenotypic variance in ADG may contribute to accurate predictions of genetic parameters in weight traits. © 2014 Blackwell Verlag GmbH

• J. Anim. Breed. Genet. 131 (2014) 210–216

GWAS of average daily gain in Nellore

M. H. A. Santana et al.

Also, uncovering the molecular basis of differences in ADG may provide insights into the mechanisms controlling the relationship between weight gain and other important traits in livestock, like feed intake and body composition. There are increasing numbers of reports of associations of genome-wide single-nucleotide polymorphism (SNP) markers with weight gain in the bovine species (Snelling et al. 2010; Pausch et al. 2011; Lu et al. 2013). However, rarer are the association studies of weight trait in Bos indicus cattle (Bolormaa et al. 2011; Fortes et al. 2013; Utsunomiya et al. 2013). The objectives of this study were: (i) to identify SNPs associated with ADG in Nellore cattle (B. indicus) and (ii) to explore genes and QTLs nearby these SNPs in order to assess their potential roles in determining differences in ADG.

Materials and methods Animals and phenotypes

Phenotype data were collected from a total of 720 Nellore animals (B. indicus), including steers and young bulls. These data were obtained from eleven different experiments to evaluate weight gain and feed efficiency in three distinct regions of Brazil: South (Santana et al. 2012), Southeast (Gomes et al. 2013) and Central-West. These experiments were carried out in three different facility types (individual pens, Calan Gates and GrowSafe), with a length of 84 days with regular weighing on 21 days. At the beginning of the experiments, animals were on average with 550  115 days of age and 380  51 kg of live weight. The weighing was carried out without food or water fasting, and the ADG was estimated as the slope of a linear regression of weights on testing days. Phenotypic data were tested for normality (Shapiro–Wilk test) and outlier observations departing  3 standard deviations from the average ADG were excluded from further analyses. Each experiment was considered a contemporary group (animals from the same test had similar weights and ages, in addition to being subjected to the same diet during their respective tests). Genotyping and imputation

The 720 animals were genotyped in two different microarrays: 384 bulls were profiled in the Illuminaâ BovineHD Genotyping BeadChip assay (HD), and genotypes for 336 bulls and steers were obtained © 2014 Blackwell Verlag GmbH

• J. Anim. Breed. Genet. 131 (2014) 210–216

using the Illuminaâ BovineSNP50 v2 (50k), according to the manufacturer’s protocol (Illumina Inc., San Diego, CA, USA). We performed imputation to combine genotypes from the two microarrays in a single data set. The HD information was used to assess the accuracy of imputation, and only SNPs located in autosomal chromosomes were considered for analysis. Briefly, we randomly sampled 290 animals with HD genotypes, which were considered as the reference sample. The remaining 94 animals were used as a validation sample. To evaluate the performance of imputation from 50k to HD, we extracted the subset of 50k markers that are embedded in the HD panel and masked the remaining SNPs in the validation sample. Based on the reference sample, we used FImpute v2.2 (Sargolzaei et al. 2012) to predict the masked genotypes. The accuracy of imputation was evaluated as the proportion of genotypes correctly imputed. Only genotyped markers with accuracy >95% were considered for further analyses. Afterwards, the imputation was carried out for all 336 animals genotyped in the 50k, including all HD samples as reference. In total, 354,147 SNPs presented imputation accuracy larger than 95%. Data filtering and association analysis

After imputation, only samples with call rate >90% were considered for analysis. Markers were removed from the data set if they did not present: minor allele frequency greater than or equal to 2%; call rate of at least 95%; or Fisher exact test for Hardy–Weinberg equilibrium >1 9 105. To account for relatedness and population substructure in the association analysis, we used a two-steps score test that is an extension of the Genome-wide Rapid Association using Mixed Model and Regression method (GRAMMAR) (Aulchenko et al. 2007a), namely GRAMMAR-Gamma (Svishcheva et al. 2012). An appealing feature of this method is the ability to yield unbiased test statistics that approximate the gold standard likelihood ratio test, but with the computational complexity of an analysis that ignores genetic structure (Svishcheva et al. 2012). In the first step, a variance-components model is fitted to the data, and the Gamma factor is calculated from the estimated heritability and variance-covariance matrix as genomic relationship matrix to correct for population substructure. Here, we included average live weight, contemporary group and sex as fixed effects. In the second step, Gamma-corrected least squares equations are used to obtain allele substitution effect, variance and test statistic for every marker. Under the null 211

GWAS of average daily gain in Nellore

M. H. A. Santana et al.

hypothesis, the statistics are assumed to asymptotically follow a v² distribution with one degree of freedom. The validity of this assumption was checked by assessing the goodness of fit of the observed quantiles to the theoretical distribution via calculation of the inflation/deflation factor (k). Values of k below 1.1 were considered acceptable, and the test statistics was divided by k to adjust for the remaining inflation (Svishcheva et al. 2012). A Bonferroni-corrected threshold was adopted to declare significant SNPs (a = 0.05/numbers of SNPs). The proportion of phenotypic variance explained by significant SNPs was estimated as follows: VARð%Þ ¼

2pqb2  100 S2

Where p and q are the allele frequencies, b is the estimated allele substitution effect, and S2 is the sample phenotypic variance. All the analyses described previously were carried out in R v2.15.2, using the GenABEL v1.7-6 package (Aulchenko et al. 2007b). Genomic regions surrounding significant SNPs

The 1-Mb regions surrounding significant SNPs were investigated to determine whether they mapped against any previously described quantitative trait locus (QTL) deposited in the cattle QTLdb database (Hu et al. 2013). Gene coordinates in the UMD v3.1 assembly (Zimin et al. 2009) were obtained from the Ensembl genes 73 database (Kinsella et al. 2011) to annotate nearby genes.

Results Phenotype and genotype filtering

Based on the results for the Shapiro–Wilk test, we found no evidence that ADG deviated from normality (p = 0.15), and five outlier animals were excluded from the analysis. The mean, minimum and maximum were 1.49  0.42, 0.79 and 2.41, respectively. These results show that the dependent variable did not violate the assumptions of the association analysis used in this study. After genotype filtering and imputation, the final data set included 672 individuals and 290,620 SNPs. Association analysis

The calculated test statistics presented acceptable inflation (Figure 1) and were adjusted for k (Figure 2). 212

Figure 1 Quantile–quantile plot of the statistical tests used for the association analysis.

The most significant SNP was detected at the 85849977 position on chromosome 3 (rs42518459, p = 9.49 9 108) and presented an estimated allele substitution effect of -0.27 kg/day. The favorable allele is T, which each C in genotype causes this negative effect on ADG. The percentage of phenotypic variance explained by this SNP was 5.62%, and the frequency of allele C = 86% and frequency of allele T = 14%, in this population. Ten SNPs with the lowest p-value in the association analysis are shown in Table 1. The SNPs mentioned previously altogether could explain a large portion of the variation in ADG of Nellore cattle in feedlots. However, six of these SNPs are the same genomic region in BTA3 (85–86 Mb) and three in BTA10 (79–80 Mb) must therefore be at high linkage disequilibrium (LD) and its effects should be accounted for as one. The estimate of the variance explained in ADG, taking into account only SNPs with the lowest p-value in BTA3, BTA6 and BTA10 is 15%. We identified four genes in 1-Mb region surrounding the most significant SNP (Table 2). Other genes are also found in this genomic region, namely PDE4B, LEPR, LEPROT, UBE2U, PGM1, ITGB3BP, ALG6, ANGPTL3, PRKAA2 and PPAP2B, which also overlapped the 12 QTLs described in the literature (Table 3). Although this region shows greater relationship with ADG, of the 10 SNPs with the lowest p-value in the GWAS, four SNPs are not located in BTA3. Three of these SNPs are located in BTA10, in a region nearby several genes, among which are GPHN, MPP5, TMEM229B, PLEKHH1, PIGH, ARG2, RDH12 and © 2014 Blackwell Verlag GmbH

• J. Anim. Breed. Genet. 131 (2014) 210–216

GWAS of average daily gain in Nellore

M. H. A. Santana et al.

Figure 2 Manhattan plot of -log10 (P-values) test for genome-wide association of average daily gain in Nellore cattle. The horizontal line represents the Bonferroni threshold (a = 1.72 x 10-7).

Table 1 Statistics and parameters estimated for the 10 most significant single-nucleotide polymorphisms (SNPs) SNP ID

Chromosome

Position(bp)

MAF

ba

SE

Var(%)b

p-value

rs42518459 rs42518463 rs137011979 rs109982289 rs42516428 rs42517631 rs135472159 rs110899667 rs136057886 rs133639017

3 3 10 3 3 3 10 6 10 3

85849977 85854233 79814549 85883091 85940753 85945421 80095314 63160626 79793695 85972521

0.14 0.14 0.05 0.12 0.15 0.13 0.22 0.18 0.11 0.16

0.269 0.259 0.435 0.285 0.273 0.257 0.374 0.316 0.366 0.220

0.05 0.05 0.08 0.05 0.05 0.05 0.08 0.07 0.08 0.05

5.62 5.66 6.28 3.50 5.72 1.52 2.08 3.14 4.63 2.49

9.49 9.49 1.21 1.28 1.93 6.79 2.56 2.63 3.13 4.63

9 9 9 9 9 9 9 9 9 9

108 108 107 107 107 107 106 106 106 106

a

Estimated allele substitution effect Percentage of phenotypic variance explained

b

Table 2 Genes close to the most significant single-nucleotide polymorphism (SNP) (rs42518459)

Gene

Gene ID

Distance (kb)

C3H1orf87

504762

410.3

NFIA CYP2J2 FGGY

615371 510406 534011

682.1 759.6 938.0

Description Chromosome 3 open reading frame Nuclear factor I/A Cytochrome P450,polypeptide 2 FGGY carbohydrate kinase domain

RAD51B, all

Genome-wide association study for feedlot average daily gain in Nellore cattle (Bos indicus).

The genome-wide association study (GWAS) results are presented for average daily gain (ADG) in Nellore cattle. Phenotype of 720 male Bos indicus anima...
163KB Sizes 0 Downloads 3 Views