J. Anim. Breed. Genet. ISSN 0931-2668

ORIGINAL ARTICLE

Genomic predictions across Nordic Holstein and Nordic Red using the genomic best linear unbiased prediction model with different genomic relationship matrices

L. Zhou1, M.S. Lund1, Y. Wang2 & G. Su1

1 Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Tjele, Denmark
2 Key Laboratory of Agricultural Animal Genetics and Breeding, National Engineering Laboratory for Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, China

Keywords
Genomic selection; multibreed; marker-based relationship matrix; linkage disequilibrium phase consistence.

Correspondence
G. Su, Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, DK-8830 Tjele, Denmark. Tel: +45 8715 7985; Fax: +45 87154994; E-mail: [email protected]

Received: 21 October 2013; accepted: 18 March 2014

Summary

This study investigated genomic predictions across Nordic Holstein and Nordic Red using various genomic relationship matrices. Different sources of information, such as consistencies of linkage disequilibrium (LD) phase and marker effects, were used to construct the genomic relationship matrices (G-matrices) across these two breeds. A single-trait genomic best linear unbiased prediction (GBLUP) model and a two-trait GBLUP model were used for single-breed and two-breed genomic predictions, respectively. The data included 5215 Nordic Holstein bulls and 4361 Nordic Red bulls, the latter comprising three populations: Danish Red, Swedish Red and Finnish Ayrshire. The bulls were genotyped with a 50 000 SNP chip. Using the two-breed predictions with a joint Nordic Holstein and Nordic Red reference population, accuracies increased slightly for all traits in Nordic Red, but only for some traits in Nordic Holstein. Among the three subpopulations of Nordic Red, accuracies increased more for Danish Red than for Swedish Red and Finnish Ayrshire. This is because closer genetic relationships exist between Danish Red and Nordic Holstein. Among Danish Red, individuals with higher genomic relationship coefficients with Nordic Holstein showed larger increases in accuracy in the two-breed predictions. Weighting the two-breed G-matrices by LD phase consistencies, marker effects or both did not further improve accuracies of the two-breed predictions.

doi:10.1111/jbg.12089
© 2014 Blackwell Verlag GmbH • J. Anim. Breed. Genet. (2014) 1–9

Introduction

The methodology of calculating breeding values by molecular markers was first proposed in 1997 (Nejati-Javaremi et al. 1997), and it was later called genomic selection (Meuwissen et al. 2001). Genomic selection has been successfully implemented in single-breed genetic evaluations (VanRaden & Sullivan 2010; Lund et al. 2011). Accuracies of genomic breeding values were significantly greater than those of conventional parent average breeding values (VanRaden et al. 2009; Su et al. 2010).

In the field of genomic predictions across different breeds, simulation studies suggested that more accurate genomic prediction could be achieved when reference data of different populations or breeds were combined into a joint training set (De Roos et al. 2009; Toosi et al. 2010). Many studies on field data also showed that merging populations of the same breed or of closely related breeds would increase accuracies of genomic predictions (Brøndum et al. 2011; Jorjani et al. 2011; VanRaden et al. 2012). However, results of field data analysis have not shown a clear advantage of combining reference populations of distantly related breeds in genomic predictions. For example, using marker effects estimated from one breed to predict genomic breeding values for individuals in another breed was not accurate in general (Harris et al. 2009). Combining Australian Holstein and Jersey cattle in a joint reference population only led to increased accuracies for some traits (like protein yield) in Jersey, while no significant increases were observed in Holstein (Erbe et al. 2012). Non-significant improvements of accuracies for Australian Jersey were achieved by using Holstein information as prior in a BayesRS model (Brøndum et al. 2012). Another study on Australian dairy cattle also showed no significant increase in accuracies by combining Holstein, Jersey and Fleckvieh populations (Pryce et al. 2011). Accuracies of genomic prediction did not increase when combining several small dairy breeds (Meszaros et al. 2012). Genomic prediction combining French Holstein, Montbéliarde and Normande resulted in small or no increase in accuracies by multiple-trait models (Karoui et al. 2012).

Several factors challenge the implementation of genomic predictions across breeds. Firstly, linkage phases between markers and quantitative trait loci (QTL) may be different for different breeds. Genome regions with high linkage disequilibrium (LD) may also harbour genes that relate to complex traits. If the LD phase consistencies between breeds are high, there are also higher chances that the LD phases between markers and QTL are the same across breeds. If QTL are located in intervals where there are high LD phase consistencies between breeds, putting more weight on these intervals in the multibreed genomic relationship matrices may result in higher accuracies in genomic predictions across breeds. Secondly, it is also possible that QTL only segregate in one breed, or that QTL effects vary across breeds, possibly owing to complicated interactions between the QTL and the background genes of different breeds. Weighting markers by their effects in different breeds in the G-matrices may also increase accuracies of genomic predictions across breeds.

One reason that across-breed predictions have not been successful could be that these factors were not well considered in models for across-breed predictions. In this study, we hypothesize that accuracies of two-breed predictions can be improved by building the two-breed G-matrices for better description of the variance and covariance structure using information on (i) consistencies of LD phase between two breeds or (ii) pre-estimated marker effects of two breeds. Data of Nordic Holstein and Nordic Red cattle, which were genotyped by 50k SNP chips, were used in this study to test these hypotheses.

Materials and methods

Data

The data set included 5215 Nordic Holstein bulls and 4361 Nordic Red bulls. Nordic Red is a composite breed with three Red cattle populations: Danish Red, Finnish Ayrshire and Swedish Red. All the bulls were genotyped using the 50k Illumina Bovine SNP50 BeadChip (Matukumalli et al. 2009). Markers were retained if they had average GenCall scores (http://res.illumina.com/documents/products/technotes/technote_gencall_data_analysis_software.pdf) higher than 0.6 and minor allele frequencies higher than 0.01 in both breeds. After data editing, a total of 42 408 SNPs were retained for the analysis. The response variable was the deregressed proof (DRP; Sigurdsson & Banos 1995), which was derived from the estimated breeding values (EBV) of the Nordic cattle genetic evaluation (www.nordicebv.info). The Nordic Holstein bulls were born from 1974 to 2008, and the Nordic Red bulls were born from 1971 to 2006. Among the Nordic Red bulls, there were 838 Danish Red, 1326 Swedish Red, 2159 Finnish Ayrshire, and 38 bulls from other countries. Bulls born before 1 January 2002 were considered the reference population, while bulls born later were set as the validation population. The numbers of bulls in the reference and validation populations for each trait are shown in Table 1.

Five traits were analysed in this study: milk yield, fat yield, protein yield, fertility and mastitis. Fertility was an index that combined interval from calving to first insemination, interval from first to last insemination and number of inseminations. Mastitis was a binary trait, and its DRP was derived from the EBV estimated using a multi-trait model including somatic cell count and udder information as correlated traits.
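The marker editing and the reference/validation split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0/1/2 genotype coding and the toy data are hypothetical, the GenCall score filter is omitted, and bulls born exactly on 1 January 2002 are assigned to the validation set, which the text does not specify.

```python
from datetime import date

# Thresholds taken from the text; all names here are hypothetical.
MAF_THRESHOLD = 0.01
CUTOFF = date(2002, 1, 1)


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0/1/2 copies of one allele."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1.0 - p)


def keep_marker(genos_breed1, genos_breed2):
    """Retain a SNP only if its MAF exceeds the threshold in both breeds."""
    return (minor_allele_frequency(genos_breed1) > MAF_THRESHOLD
            and minor_allele_frequency(genos_breed2) > MAF_THRESHOLD)


def split_by_birth_date(birth_dates):
    """Split bull IDs into (reference, validation) lists by birth date.

    Assumption: a bull born exactly on the cutoff date goes to validation.
    """
    reference = [bull for bull, born in birth_dates.items() if born < CUTOFF]
    validation = [bull for bull, born in birth_dates.items() if born >= CUTOFF]
    return reference, validation


# Toy data for one SNP in each breed and three bulls.
holstein_snp = [0, 1, 2, 1, 0, 1]
red_snp = [0, 0, 0, 0, 0, 0]          # monomorphic in the second breed
print(keep_marker(holstein_snp, red_snp))

bulls = {"A": date(1998, 5, 3), "B": date(2004, 2, 11), "C": date(2001, 12, 31)}
print(split_by_birth_date(bulls))
```

Requiring the MAF threshold in both breeds means a marker that is informative in only one breed is dropped from the joint analysis.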
Nordic Red bulls were further split into three populations to compare accuracies of each population. There were 173 Danish Red, 247 Swedish Red and 435 Finnish Ayrshire in the validation population for milk, fat and protein yield; 173 Danish Red, 255 Swedish Red and 444 Finnish Ayrshire for fertility; and 183 Danish Red, 262 Swedish Red and 465 Finnish Ayrshire for mastitis. The numbers of bulls differed between traits because some bulls lacked DRP for some traits.

Table 1 Numbers of bulls in reference and validation populations for each trait in the single-breed and two-breed genomic predictions

                 Nordic Holstein      Nordic Red
Traits           Ref.      Val.       Ref.      Val.
Milk yield       3081      1316       3437      861
Fat yield        3081      1316       3437      861
Protein yield    3081      1316       3437      861
Fertility        3115      1299       3394      878
Mastitis         3084      1412       3437      916

Statistical models

GBLUP model

A genomic best linear unbiased prediction (GBLUP) method (VanRaden 2008) was used in this study. The basic GBLUP model was:

$$y = \mu + Za + e \qquad (1)$$

where $y$ was the vector of DRP, $\mu$ was the population mean, $a$ was the vector of genomic breeding values, $e$ was the vector of residuals, and $Z$ was an identity design matrix allocating $a$ to $y$. It was assumed that $a \sim N(0, G\sigma_g^2)$ and $e \sim N(0, D\sigma_e^2)$, where $G$ was the genomic relationship matrix (G-matrix), $\sigma_g^2$ was the additive genetic variance, $D$ was a diagonal matrix with weights on the residual variance (Su et al. 2012), and $\sigma_e^2$ was the residual variance. The G-matrix was constructed using the method (method one) presented by VanRaden (2008) and Hayes and Goddard (2008). The genomic relationship coefficient ($g_{ij}$) of individuals i and j was calculated as:

$$g_{ij} = \sum_{k=1}^{m} M_{i,k} M_{j,k} \Big/ \sum_{k=1}^{m} 2p_k(1 - p_k) \qquad (2)$$

where $M$ was the marker genotype matrix with elements $0 - 2p_k$, $1 - 2p_k$ and $2 - 2p_k$ for genotypes A1A1, A1A2 and A2A2, respectively, $p_k$ was the observed allele frequency of A2 at locus k, and $m$ was the total number of markers.

This GBLUP model was used for two different scenarios: firstly, for single-breed predictions of Nordic Holstein and Nordic Red individually; secondly, for mutual predictions in which one breed was treated as the reference population to predict the other breed.

Two-trait GBLUP model

A two-trait GBLUP model was used for genomic predictions across Nordic Holstein and Nordic Red. One single biological trait in these two breeds was regarded as two different traits. The model was:

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1\mu_1 \\ 1\mu_2 \end{bmatrix} + \begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} \qquad (3)$$

where $y_1$ was the vector of DRP in Nordic Holstein, $y_2$ was the vector of DRP in Nordic Red, $\mu_1$ and $\mu_2$ were means of the two breeds, $a_1$ and $a_2$ were the vectors of genomic breeding values, $e_1$ and $e_2$ were the vectors of residual effects, and $Z_1$ and $Z_2$ were incidence matrices associating genomic breeding values with $y_1$ and $y_2$, respectively. It was assumed that

$$\mathrm{Var}\begin{bmatrix} a_1 \\ a_2 \\ e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} \sigma_{11}^2 G & \sigma_{12} G & 0 & 0 \\ \sigma_{12} G & \sigma_{22}^2 G & 0 & 0 \\ 0 & 0 & R_1 & 0 \\ 0 & 0 & 0 & R_2 \end{bmatrix} \qquad (4)$$

where $G$ was the two-breed genomic relationship matrix, $\sigma_{11}^2$ and $\sigma_{22}^2$ were the additive genetic variances of Nordic Holstein and Nordic Red, and $\sigma_{12}$ was the genetic covariance between Nordic Holstein and Nordic Red. $R_1$ and $R_2$ were covariance matrices of residual effects for Nordic Holstein and Nordic Red, which were defined as $R_1 = D_1\sigma_{e1}^2$ and $R_2 = D_2\sigma_{e2}^2$, and $D$ was a diagonal matrix of weighting factors (Su et al. 2012). The two-breed G-matrices were constructed by different strategies as described below.

Two-breed G-matrices

For genomic predictions across two breeds, four alternative two-breed G-matrices were built. They were: (i) unweighted G-matrix (GUNW), in which no weighting strategy was used in building the G-matrix, (ii) LD phase consistencies weighted G-matrices (GLD), in which LD phase consistencies were used as weights of genotypes, (iii) marker effects weighted G-matrices (GEff), in which squared marker effects and products of marker effects of each breed for each trait were used as weights, and (iv) LD phase consistencies and marker effects weighted G-matrices (GLD_Eff), in which marker effects and LD phase consistencies were used together to determine weights. Each of these two-breed G-matrices contained within-breed blocks and between-breed blocks. These two types of blocks were constructed separately and then combined into the full G-matrices.

Unweighted G-matrix (GUNW)
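As a rough illustration of the block structure used in this subsection, the sketch below assembles an unweighted two-breed G-matrix from toy genotypes: within-breed blocks follow Equation (2) with breed-specific allele frequencies, and the between-breed block divides the genotype cross-products by the geometric mean of the two breeds' sums of 2p(1 - p), as in Equation (5). The function names and the pure-Python list representation are assumptions for readability; the authors built their matrices with their own Fortran 90 programs, which are not shown in the paper.

```python
from math import sqrt


def centre(genotypes):
    """Return (M, p) for one breed.

    genotypes[i][k] is the number of A2 alleles (0, 1 or 2) carried by
    individual i at marker k; p[k] is the breed-specific A2 frequency and
    M[i][k] = genotypes[i][k] - 2 * p[k].
    """
    n, m = len(genotypes), len(genotypes[0])
    p = [sum(ind[k] for ind in genotypes) / (2 * n) for k in range(m)]
    M = [[ind[k] - 2 * p[k] for k in range(m)] for ind in genotypes]
    return M, p


def het_sum(p):
    """Sum over markers of 2p(1 - p)."""
    return sum(2 * pk * (1 - pk) for pk in p)


def cross_products(Ma, Mb, denominator):
    """Scaled cross-products between every row of Ma and every row of Mb."""
    return [[sum(x * y for x, y in zip(ra, rb)) / denominator for rb in Mb]
            for ra in Ma]


def two_breed_unweighted_g(genos1, genos2):
    """Full two-breed matrix with breed 1 individuals first, then breed 2."""
    M1, p1 = centre(genos1)
    M2, p2 = centre(genos2)
    s1, s2 = het_sum(p1), het_sum(p2)
    g11 = cross_products(M1, M1, s1)             # within breed 1, Eq. (2)
    g22 = cross_products(M2, M2, s2)             # within breed 2, Eq. (2)
    g12 = cross_products(M1, M2, sqrt(s1 * s2))  # between breeds, Eq. (5)
    g21 = [list(col) for col in zip(*g12)]       # transpose of g12
    top = [a + b for a, b in zip(g11, g12)]
    bottom = [a + b for a, b in zip(g21, g22)]
    return top + bottom


# Toy example: 3 individuals of breed 1, 2 of breed 2, 3 markers.
G = two_breed_unweighted_g([[0, 1, 2], [2, 1, 0], [1, 1, 1]],
                           [[0, 2, 1], [2, 0, 1]])
for row in G:
    print([round(v, 3) for v in row])
```

The sketch assumes at least one marker is polymorphic in each breed, so that neither denominator is zero.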

Within-breed blocks of the GUNW matrix were built in the same way as the single-breed G-matrix, but allele frequencies were calculated for each breed separately. Between-breed blocks were constructed by multiplying the genotypes of two breeds and dividing by the geometric mean of the sums of 2p(1 – p) of the two breeds. Thus, the genomic relationship coefficient ($g_{ij}$) between individual i in breed 1 and individual j in breed 2 was calculated as:

$$g_{ij} = \frac{\sum_{k=1}^{m} M_{1(i,k)} M_{2(j,k)}}{\sqrt{\sum_{k=1}^{m} 2p_{1,k}(1 - p_{1,k}) \sum_{k=1}^{m} 2p_{2,k}(1 - p_{2,k})}} \qquad (5)$$

where $M_1$ and $M_2$ were genotype matrices of breed 1 and breed 2, and $p_{1,k}$ and $p_{2,k}$ were the observed allele frequencies of A2 at locus k for breed 1 and breed 2, respectively.

LD phase consistence weighted G-matrices (GLD)
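The LD phase consistency used as a weight in this subsection can be illustrated with a small sketch: for one marker interval, the signed LD correlation rLD of Equation (7) is computed for every marker pair in each breed, and the interval weight corLD is the correlation of those values between the two breeds. The phased 0/1 haplotype input, the function names and the toy data are assumptions made for the example; the sketch also assumes every marker is polymorphic in both breeds, so that no denominator is zero.

```python
from itertools import combinations
from math import sqrt


def r_ld(haplotypes, a, b):
    """Signed LD correlation between markers a and b, as in Equation (7).

    haplotypes is a list of phased haplotypes, each a sequence of 0/1 alleles.
    """
    n = len(haplotypes)
    f_a = sum(h[a] for h in haplotypes) / n
    f_b = sum(h[b] for h in haplotypes) / n
    f_ab = sum(h[a] * h[b] for h in haplotypes) / n
    return (f_ab - f_a * f_b) / sqrt(f_a * (1 - f_a) * f_b * (1 - f_b))


def pearson(x, y):
    """Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def cor_ld(haps_breed1, haps_breed2, interval):
    """LD phase consistency of one interval: correlation between breeds of
    all pairwise r_ld values among the markers listed in `interval`."""
    pairs = list(combinations(interval, 2))
    r1 = [r_ld(haps_breed1, a, b) for a, b in pairs]
    r2 = [r_ld(haps_breed2, a, b) for a, b in pairs]
    return pearson(r1, r2)


# Toy haplotypes for three markers; breed 2 has the phase of marker 2 reversed.
breed1 = [(0, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 1), (1, 0, 1), (0, 1, 0)]
breed2 = [(h[0], h[1], 1 - h[2]) for h in breed1]
print(cor_ld(breed1, breed1, [0, 1, 2]))
print(cor_ld(breed1, breed2, [0, 1, 2]))
```

In the study the intervals contained 5, 10 or 15 SNPs, and the resulting value was applied as the weight for every SNP in the interval.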

As the consistency of LD phase between marker and QTL in different breeds may differ across the genome, it is not appropriate to assume that the covariance of SNP effects in two breeds is the same for all SNPs in the genome. Therefore, we accounted for LD phase consistencies in constructing the two-breed G-matrices. As the phase consistency was defined to be 1 for two animals that belonged to the same breed, within-breed blocks of the GLD matrices were constructed without weighting, or equivalently by weighting all markers by one (the same as the GUNW matrix). Hence, differential weighting of markers was only applied to the between-breed blocks. For example, $g_{ij}$ of individual i in breed 1 and individual j in breed 2 was calculated as:

$$g_{ij} = \frac{\sum_{k=1}^{m} M_{1(i,k)} M_{2(j,k)} w_k}{\sqrt{\sum_{k=1}^{m} 2p_{1,k}(1 - p_{1,k}) \sum_{k=1}^{m} 2p_{2,k}(1 - p_{2,k})}} \qquad (6)$$

where $M_1$, $M_2$, $m$, $p_1$ and $p_2$ were defined as above, and $w_k$ was the weight on marker k. The weights ($w$) were LD phase consistencies (The Bovine HapMap Consortium 2009), which were measured as correlations of all pairwise rLD between two breeds (indicated as corLD). The rLD was the measurement of LD (Hill & Robertson 1968) between any pair of markers within a marker interval, which was calculated as:

$$r_{LD} = \frac{f(AB) - f(A)f(B)}{\sqrt{f(A)f(a)f(B)f(b)}} \qquad (7)$$

where f(AB), f(A), f(a), f(B) and f(b) were observed frequencies of haplotype AB and alleles A, a, B and b, respectively. Three sizes of marker intervals were used, which contained 5, 10 or 15 SNPs in each interval. All the pairwise rLD in each marker interval were used to calculate LD phase consistencies (corLD). The value of corLD of each marker interval was used as the weight for all the SNPs in that interval.

Marker effect weighted G-matrices (GEff)
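The weights used for the GEff matrices in this subsection can be sketched as follows: marker effects are scaled so that the squared scaled effects average one while each effect keeps its original sign; the within-breed weight is the squared scaled effect, and the between-breed weight is the product of the scaled effects of the two breeds. The function names and the toy effect values are hypothetical, and the sketch starts from effects that are assumed to have been estimated already (the BLUP estimation itself is not shown).

```python
from math import copysign, sqrt


def scale_effects(effects):
    """Scale marker effects so that the mean squared scaled effect is 1,
    keeping the sign of each original effect."""
    mean_square = sum(a * a for a in effects) / len(effects)
    return [copysign(sqrt(a * a / mean_square), a) for a in effects]


def within_breed_weights(effects):
    """Weight of marker k within a breed: squared scaled effect."""
    return [s * s for s in scale_effects(effects)]


def between_breed_weights(effects1, effects2):
    """Weight of marker k between breeds: product of the scaled effects."""
    s1, s2 = scale_effects(effects1), scale_effects(effects2)
    return [x * y for x, y in zip(s1, s2)]


# Toy marker effects for the same three markers in two breeds.
effects_breed1 = [1.0, -2.0, 3.0]
effects_breed2 = [2.0, 2.0, -1.0]
print(within_breed_weights(effects_breed1))
print(between_breed_weights(effects_breed1, effects_breed2))
```

Under this definition a between-breed weight is negative whenever the estimated effects of a marker have opposite signs in the two breeds.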

Variances of SNP effects in each breed and covariances between breeds may differ among SNPs. The marker effect weighted G-matrix is an approach to account for heterogeneous variances and covariances among SNPs. To build the GEff matrices, marker effects on each trait of each breed were first estimated by a BLUP model with random regression on SNP genotypes (Meuwissen et al. 2001), which assumed that marker effects were normally distributed with equal variance. The genomic relationship coefficient of individuals i and j of the same breed (within-breed blocks) was calculated as:

$$g_{ij} = \sum_{k=1}^{m} M_{i,k} M_{j,k} w_k \Big/ \sum_{k=1}^{m} 2p_k(1 - p_k) \qquad (8)$$

where $M$, $m$ and $p_k$ were described as above. For the within-breed blocks of GEff, the weight was calculated as $w_k = a_k'^2$, where $a_k'$ was the scaled marker effect for marker k of each trait. To achieve an average weight of one, marker effects were scaled by $a_k' = \pm\sqrt{a_k^2 / \overline{a^2}}$, and the sign of $a_k'$ was assigned according to the original sign of marker effect $a_k$. The between-breed blocks of the GEff matrices were constructed as in Equation (6), except that the weight of marker k was calculated as $w_k = a_{1,k}' a_{2,k}'$, where $a_{1,k}'$ and $a_{2,k}'$ were scaled marker effects of marker k on the trait of breed 1 and breed 2.

Marker effects and LD phase consistencies weighted G-matrices (GLD_Eff)

For the GLD_Eff matrices, the within-breed blocks were weighted only by marker effects as in the GEff matrices. For between-breed blocks, the weights were calculated as $\mathrm{cor}_{LD} \times a_{1,k}' \times a_{2,k}'$, where corLD were LD phase consistencies (only the scenario of five SNPs in each interval was used here), and $a_{1,k}'$ and $a_{2,k}'$ were the scaled marker effects as described above.

The G-matrices of one breed and two breeds and their inverses were built by Fortran 90 programs written by the authors. The DMU package (http://dmu.agrsci.dk/DMU/Doc/Current/dmuv6_guide.5.2.pdf) was used for estimation of variance and covariance components and calculation of direct genomic values (DGV).

Results

Genetic correlations of Nordic Holstein and Nordic Red were evaluated in this study. Accuracies (correlations of DGV and DRP) of DGV were compared among single-breed and two-breed predictions using different weighted G-matrices. Genetic correlations between Nordic Holstein and Nordic Red estimated by the two-trait GBLUP model were in the range of 0.37–0.58 (Table 2). Estimates of heritability were highly consistent with reliability of DRP of the traits. Heritabilities estimated from the model with weighted G-matrices were similar to those estimated from the model with the unweighted G-matrix.

Table 2 Genetic correlations and their standard errors of Nordic Holstein and Nordic Red estimated using the two-trait genomic best linear unbiased prediction (GBLUP) model with the two-breed unweighted G-matrix

Traits           Genetic correlations    Standard errors
Milk yield       0.46                    0.07
Fat yield        0.58                    0.07
Protein yield    0.37                    0.07
Fertility        0.46                    0.10
Mastitis         0.46                    0.08

Accuracies of single-breed and two-breed predictions are shown in Table 3. When using one breed as the reference population to predict the other breed, accuracies varied from 0 to 0.2, which was much lower than for predictions using the breed's own reference population. Accuracies of two-breed genomic predictions using the two-trait GBLUP model with the GUNW matrix were higher than single-breed genomic predictions in milk yield and protein yield for Nordic Holstein and in all three production traits for Nordic Red. Increased accuracies were achieved mainly in the production traits (milk, fat and protein yield) and were less obvious in fertility and mastitis. Compared with Nordic Holstein, Nordic Red had more gains in accuracies from the two-breed genomic predictions. However, the GLD matrices did not further increase accuracies in the two-breed predictions. The GEff and GLD_Eff matrices even resulted in lower accuracies of DGV.

Table 3 Correlations between direct genomic values (DGV) and deregressed proof (DRP) for Nordic Holstein and Nordic Red in the single-breed(a) and two-breed genomic predictions

                                  Single-breed predictions   Two-breed predictions
Breeds            Traits          Hol-ref.   Red-ref.        GUNW(b)  GLD5   GLD10  GLD15  GEff   GLD_Eff
Nordic Holstein   Milk yield      0.62       0.15            0.63     0.63   0.63   0.63   NA(c)  0.63
                  Fat yield       0.64       0.20            0.64     0.64   0.64   NA(c)  0.60   0.60
                  Protein yield   0.62       0.07            0.63     0.63   0.63   0.63   0.60   0.60
                  Fertility       0.47       0.00            0.47     0.47   0.47   0.47   0.45   0.45
                  Mastitis        0.51       0.13            0.51     0.51   0.51   0.52   0.49   0.49
Nordic Red        Milk yield      0.13       0.56            0.58     0.58   0.58   0.58   NA(c)  0.57
                  Fat yield       0.18       0.62            0.63     0.63   0.63   NA(c)  0.62   0.63
                  Protein yield   0.11       0.55            0.56     0.56   0.56   0.56   0.53   0.54
                  Fertility       0.14       0.43            0.43     0.43   0.43   0.43   0.40   0.41
                  Mastitis        0.02       0.44            0.45     0.45   0.45   0.45   0.44   0.44

(a) In the single-breed predictions, each breed was used as the reference population to predict itself and the other breed using a genomic best linear unbiased prediction (GBLUP) model. In the two-breed predictions, both breeds were used in the reference population using the two-trait GBLUP models.
(b) GUNW: predictions by the two-trait GBLUP model using the unweighted G-matrix (GUNW); GLD5: predictions using LD consistencies weighted G-matrices with each interval of five SNPs for calculation of LD phase consistencies; GLD10 and GLD15 indicate 10 and 15 SNPs in each interval; GEff: predictions using the marker effects weighted G-matrices (GEff); GLD_Eff: predictions using G-matrices weighted by marker effects and LD phase consistencies together.
(c) AI-REML iterative processes of the two-trait model did not converge.

To further investigate the reasons for the increased accuracies of Nordic Red from the two-breed predictions, we split the Nordic Red validation data set into Danish Red, Swedish Red and Finnish Ayrshire according to the bulls' country of origin. The results in Table 4 showed that Danish Red had much higher increases in accuracies than Swedish Red and Finnish Ayrshire from the Nordic Red single-breed predictions to the two-breed predictions. Averaged across the five traits evaluated, the increases were 5.54, 2.03 and 0.36% for Danish Red, Swedish Red and Finnish Ayrshire, respectively. The accuracy for fat yield of Danish Red from the Nordic Holstein reference prediction was the highest among the five traits, at 0.35 (Table 4). The larger increases for Danish Red are probably because a larger proportion of Danish Red bulls have higher genomic relationship coefficients with Nordic Holstein than Swedish Red and Finnish Ayrshire bulls have, as shown in Figure 1. Between individuals from the Nordic Holstein reference population and the Nordic Red validation population, there were 104 Nordic Red bulls which had genomic relationship coefficients higher than 0.05 with Nordic Holstein bulls according to the GUNW matrix. Among these Nordic Red bulls, 103 bulls were Danish Red, one was Swedish Red and none were Finnish Ayrshire.

Table 4 Correlations of direct genomic values (DGV) and deregressed proof (DRP) for Danish Red, Swedish Red and Finnish Ayrshire in different scenarios of genomic predictions of Nordic Red

                 Danish Red                           Swedish Red                     Finnish Ayrshire
Traits           Hol-ref(a)  Red-ref(a)  Com-ref(a)   Hol-ref   Red-ref   Com-ref     Hol-ref   Red-ref   Com-ref
Milk yield       0.23        0.48        0.53         0.18      0.59      0.61        0.08      0.56      0.57
Fat yield        0.35        0.58        0.61         0.08      0.59      0.61        0.07      0.65      0.65
Protein yield    0.18        0.53        0.55         0.13      0.59      0.61        0.05      0.53      0.53
Fertility        0.05        0.32        0.34         0.00      0.47      0.47        0.08      0.46      0.46
Mastitis         0.17        0.48        0.49         0.05      0.41      0.41        0.08      0.44      0.44

(a) Hol-ref: predictions using Nordic Holstein as the reference population to predict Nordic Red; Red-ref: predictions using Nordic Red as the reference population to predict Nordic Red; Com-ref: two-breed combined predictions using the two-trait GBLUP model with the unweighted G-matrix (GUNW).

[Figure 1: density plot of genomic relationship coefficients; x-axis: genomic relationship coefficients (−0.15 to 0.15); y-axis: density (0 to 60); legend: DNK, FIN, SWE]

Figure 1 Genomic relationship coefficients of Danish Red (DNK), Finnish Ayrshire (FIN) and Swedish Red (SWE) with Nordic Holstein in the two-breed unweighted G-matrix. The coefficients were calculated based on the separate base populations for Nordic Holstein and Nordic Red. In the figure, Nordic Holstein included 3126 individuals in the reference population (born before 1 January 2002), and Danish Red, Finnish Ayrshire and Swedish Red included 184, 469 and 265 individuals in the validation population (born after 1 January 2002).

We also divided the Danish Red validation set into two sets: one set with higher and one set with lower genomic relationship coefficients with the Nordic Holstein reference population. Accuracies of these two sets for single Nordic Red predictions (Red-ref) and two-breed combined predictions (Com-ref) using the GUNW matrix are shown in Table 5. The set with higher genomic relationship with Nordic Holstein showed increased accuracies when moving from the Nordic Red predictions to the two-breed combined predictions. There was no increase for the set with lower genomic relationship with Nordic Holstein. These results indicated that the closer the relationship with Nordic Holstein, the greater the increase in accuracies in the two-breed predictions. However, this was only observed in the three production traits, not in fertility or mastitis.

Discussion

It was reported that accuracies of predictions using one breed to predict another breed were around zero in a study of different US beef cattle breeds (Kachman et al. 2013). In this study, accuracies of genomic predictions using a one-breed reference population to predict another breed were much lower than within-breed predictions, but they were clearly higher than zero except for fertility in Nordic Holstein and mastitis in Nordic Red. This indicated that Nordic Holstein and Nordic Red were useful to some extent in predicting each other. Moreover, accuracies increased more in Danish Red than in Swedish Red and Finnish Ayrshire when using the two-breed predictions. This was because a larger proportion of Danish Red bulls had a closer relationship with Nordic Holstein than Swedish Red and Finnish Ayrshire bulls had (Figure 1). It has been reported that Holstein cattle were used in the breeding programme of Danish Red (Makgahlela et al. 2012). Within Danish Red, the set with closer genomic relationship with Nordic Holstein (set 2) showed larger increases in accuracies from the Nordic Red predictions to the two-breed predictions (Table 5). However, this set had lower accuracies than the other set (set 1), which had less close genomic relationship with Nordic Holstein. This could be explained by set 1 being closer to the Nordic Red reference population, which included Danish Red, Swedish Red and Finnish Ayrshire. At the same time, accuracies of Swedish Red for the three production traits (milk, fat and protein yield) were higher than those of Finnish Ayrshire in predictions using Nordic Holstein as the reference population (Table 4). This is perhaps because Holstein genes were transmitted to Swedish Red by Danish Red bulls, as Danish Red had been used in the breeding programme of Swedish Red (Bett et al. 2010). These results support the view that accuracies of genomic predictions were more likely due to population structure or close relationships than to LD between QTL and markers (Daetwyler et al. 2012).

Table 5 Correlations of direct genomic values (DGV) and deregressed proof (DRP) for two sets of Danish Red in Nordic Red reference predictions and combined reference predictions

                 Set 1(a)                  Set 2(a)
Traits           Red-ref(b)  Com-ref(b)    Red-ref   Com-ref
Milk yield       0.62        0.63          0.34      0.41
Fat yield        0.70        0.68          0.42      0.50
Protein yield    0.70        0.70          0.34      0.37
Fertility        0.36        0.37          0.24      0.25
Mastitis         0.59        0.59          0.40      0.41

(a) Set 1 is a validation set of Danish Red (81 bulls) in lower genomic relationship with Nordic Holstein, with an average genomic relationship with Nordic Holstein of 0.036. Set 2 is a validation set of Danish Red (103 bulls) in higher genomic relationship with Nordic Holstein, with an average genomic relationship with Nordic Holstein of 0.075. The average genomic relationship with Nordic Holstein is calculated by averaging the maximum genomic relationship of each Danish Red bull with all Nordic Holstein reference bulls.
(b) Red-ref: predictions using Nordic Red as the reference population; Com-ref: two-breed combined predictions using the two-trait genomic best linear unbiased prediction (GBLUP) model with the unweighted G-matrix (GUNW).

Accuracies of predictions using the combined reference population of two breeds were slightly higher than the single-breed reference population predictions for both Nordic Holstein and Nordic Red. For Nordic Red, accuracies improved by around 2% for the three production traits and around 1% for fertility and mastitis. For Nordic Holstein, accuracies increased slightly only for milk yield, protein yield and mastitis (Table 3). These results are in line with some previous studies (Makgahlela et al. 2012; Olson et al. 2012), which also achieved small increases in accuracies.

Increased accuracies from the combined two-breed predictions were observed mainly in the production traits and less obviously in fertility and mastitis. One possible reason is that these health and reproduction traits are governed by more complex genetic mechanisms and by more QTL with smaller effects. Another possible reason is the definitions of these traits. For example, fertility is a combined index of interval from calving to first insemination, interval from first to last insemination and number of inseminations. The QTL affecting a single trait (like calving to first insemination) may have no effects on the other traits (like interval from first to last insemination). Consequently, QTL with smaller effects have a greater influence on these combined index traits, which could explain why fertility and mastitis showed smaller increases in accuracies from the combined two-breed predictions.

Using the GLD matrices in the two-breed predictions did not increase accuracies compared with using the GUNW matrix. It was observed that genomic relationship coefficients between Nordic Holstein and Nordic Red individuals in the GLD matrices were very similar to those in the unweighted matrix (GUNW). LD phase consistencies generally scaled down genomic relationship coefficients between individuals of these two breeds, as the weights used to construct the G-matrix were less than one. The means of LD phase consistencies for marker intervals with 5, 10 and 15 SNPs were 0.59, 0.52 and 0.46. For example, there were 104 Nordic Red validation bulls which had genomic relationships above 0.05 with the Nordic Holstein reference bulls in the GUNW matrix. This number reduced to 32 pairs in the GLD matrix (in the case of five SNPs in each interval). When we tried to scale the LD phase consistencies to a higher mean (from 0.59 to 0.8), the accuracies were almost the same as without scaling (results not shown).

Accuracies of two-breed predictions using the GEff matrices or GLD_Eff matrices were even lower than using the GUNW matrix. Weighting G-matrices with marker effects did not perform better than the GUNW matrix in the two-breed predictions, nor in the single-breed predictions (results not shown). We also tested using blocks of marker effects as weights, where marker effects were summed in each block. However, accuracies also decreased compared with using the GUNW matrix or the GLD matrices in the two-breed genomic predictions (results not shown). The unsuccessful results of weighting by marker effects suggest that using effects estimated from data to construct the covariance structure for fitting the same data in the present way may be problematic. The utilization of marker effects to construct the G-matrix for genomic prediction needs further exploration.

In our study, within-breed blocks of the G-matrices were constructed as separate populations for Nordic Holstein and Nordic Red, where allele frequencies were calculated using all genotyped bulls of each breed. Erbe et al. (2012) explored a GBLUP method using a common base population with allele frequencies $p = a\,p_{hol} + (1 - a)\,p_{jer}$ in genomic predictions of Australian Holstein and Jersey, where $a$ was the relative inbreeding coefficient for Jersey. Using this common base population for the G-matrix, accuracies increased for Jersey but not for Holstein cattle in the joint Australian Holstein and Jersey predictions, compared with the single-breed predictions. How to choose the base population for calculating genomic relationship matrices of multiple breeds is an important topic for research.

In general, accuracies of genomic predictions using a combined reference population were only marginally higher than those of the single-breed predictions. The increase in accuracies of the across-breed genomic predictions in the present study can be explained by the relatively close relationship between these two breeds. One possible reason for the small increase in accuracies from the across-breed genomic prediction could be that the marker density in this study is not high enough to preserve strong LD between markers and QTL across breeds. It has been reported that between Bos taurus cattle breeds, LD phase is persistent only for marker pairs
