Gene 543 (2014) 22–27

Contents lists available at ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Phylogenetic positions of RH blood group-related genes in cyclostomes

Akinori Suzuki, Kouhei Endo, Takashi Kitano ⁎

Department of Biomolecular Functional Engineering, College of Engineering, Ibaraki University, 4-12-1 Nakanarusawa-cho, Hitachi 316-8511, Japan

Article info

Article history: Received 29 January 2013; Received in revised form 24 March 2014; Accepted 6 April 2014; Available online 8 April 2014

Keywords: RH; Gene family; Duplication; Lamprey; Phylogeny

Abstract

The RH gene family in vertebrates consists of four major genes (RH, RHAG, RHBG, and RHCG). They are thought to have emerged in the common ancestor of vertebrates after two rounds of whole genome duplication (2R-WGD). To analyze the detailed phylogenetic relationships within the RH gene family, we determined three types of cDNA sequence that belong to the RH gene family in lamprey (Lethenteron reissneri) and designated them as RHBG-like, RHCG-like1, and RHCG-like2. Phylogenetic analyses clearly showed that the RHCG-like1 and RHCG-like2 genes, which were probably duplicated in the lamprey lineage, are orthologs of gnathostome RHCG. In contrast, the phylogenetic position of the RHBG-like gene could not be clearly resolved. Convergent events in cyclostome RHBG-like genes probably prevented the accurate identification of their phylogenetic positions. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Human RH blood group (also known as RH30) genes encode membrane proteins composed of twelve transmembrane domains (Avent et al., 1990, 1992) and are expressed only on erythrocytes (Chérif-Zahar et al., 1990). In humans, two tandem-duplicated RHD and RHCE loci are located on chromosome 1p34–p36 (Chérif-Zahar et al., 1991; Ruddle et al., 1972), and human individuals are divided into RH-positive and RH-negative categories according to the presence or absence of the RHD antigen. Three distantly related homologous genes, RHAG (RH-associated glycoprotein, also known as RH50, 6p21.1-p11), RHBG (RH family, B glycoprotein, 1q21.3), and RHCG (RH family, C glycoprotein, 15q25) have been identified in the human genome. RHAG is an erythrocyte-specific protein and is known to interact with RH. The RHBG and RHCG proteins are non-erythroid members of the RH blood group gene family and are primarily expressed in the kidney, acting as ammonium transporters (Marini et al., 2000). These proteins are also known to act as ammonium transporters on fish gills (Nakada et al., 2007). The four genes of the RH gene family are thought to have duplicated in the common ancestor of vertebrates (Huang and Peng, 2005; Kitano et al., 2010; Peng and Huang, 2006), probably by two rounds of whole genome duplication (2R-WGD) (Ohno, 1970). These two genome duplications should have occurred between the Vertebrata and Tunicata

Abbreviations: RH, Rhesus blood group; RHD, Rhesus blood group D antigen; RHCE, Rhesus blood group CcEe antigens; RHAG, RH-associated glycoprotein; RHBG, RH family B glycoprotein; RHCG, RH family C glycoprotein; WGD, whole genome duplications. ⁎ Corresponding author. E-mail address: [email protected] (T. Kitano).

http://dx.doi.org/10.1016/j.gene.2014.04.015 0378-1119/© 2014 Elsevier B.V. All rights reserved.

divergence (ca. 722.5 million years ago (MYA)) and the Tetrapoda and fish divergence (ca. 400.1 MYA) (divergence times were taken from the TimeTree database, a public resource for species relationships and divergences (Hedges et al., 2006)). However, detailed duplication timings and aspects of the differentiation of genes of the RH gene family are still open to discussion. In particular, no phylogenetic analysis of the RH gene family has included cyclostome species. In this study, therefore, we examined the duplication timings of genes of the RH gene family with newly determined cDNA sequences for the Far Eastern brook lamprey (FEB; Lethenteron reissneri), which belongs to the Cyclostomata, a primitive Vertebrata group. Members of Cyclostomata (hagfish and lamprey) are the living representatives of Agnatha (jawless vertebrates). Agnatha diverged from Gnathostomata (jawed vertebrates) after the divergence between Tunicata (e.g., Ciona) and Vertebrata.

2. Materials and methods

2.1. cDNA sequencing

Two Far Eastern brook (FEB) lamprey individuals (designated w2 and G2) were captured in the village of Sakegawa, Yamagata Prefecture, Japan, in May 2006. Total RNA was extracted from the whole body using TRIzol Reagent (Invitrogen). Reverse transcription was performed using SuperScript II reverse transcriptase and oligo dT-adaptor primer (Invitrogen). Typical PCR was performed in a 20 μl reaction volume containing 0.5–1.0 μl of first-strand cDNA, 1 × Ex Taq buffer, 0.2 mM dNTP mixture, 10 pmol each of forward and reverse primers, and 1 unit of TaKaRa Ex Taq (TaKaRa Bio). A list of primers used in this study is shown in Supplementary Table S1. We also performed 5′ rapid amplification of cDNA ends (5′RACE) using the 5′RACE System (Invitrogen) to determine the 5′ sequence region of the cDNA. PCR products were

A. Suzuki et al. / Gene 543 (2014) 22–27

purified using the QIAquick PCR Purification Kit (QIAGEN). DNA sequencing was performed on PCR products using the BigDye Terminator v3.1 Cycle Sequencing Kit and ABI PRISM 310 Genetic Analyzer (Applied Biosystems). The Phred/Phrap software program (Ewing et al., 1998) was used for base-calling and assembly and for obtaining quality scores for assembled data. Editing was performed using the Consed package (Gordon et al., 1998) to identify all low-quality bases and to check that the assembly was correct, based on linking information.
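The quality scores mentioned above follow the conventional Phred definition, Q = −10 log10(P), where P is the estimated probability that a base call is wrong. The following is a minimal sketch of that relationship only; the function names and the Q20 cutoff are illustrative assumptions, not settings reported in this study.

```python
import math

def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P into a Phred quality score Q = -10 log10(P)."""
    return -10 * math.log10(p)

def low_quality_positions(quals, cutoff=20):
    """Return 1-based positions whose quality is below `cutoff`.

    The default cutoff of 20 (1% error) is a hypothetical choice for illustration.
    """
    return [i for i, q in enumerate(quals, start=1) if q < cutoff]
```

For example, `low_quality_positions([40, 12, 35, 8])` flags the second and fourth bases as candidates for manual inspection.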


positions of the codon and were viewed as synonymous changes. RHCG-like1 and RHCG-like2 probably occupied distinct loci from each other rather than being alleles at a single locus, because both the partial RHCG-like1 and RHCG-like2 sequences were also observed in the other individual, G2 (data not shown). The codon starting position of each sequence was predicted by making comparisons with other vertebrate gene sequences of the RH gene family. The DDBJ/EMBL/GenBank International Nucleotide Sequence Database accession numbers are AB777254–AB777256.
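As a quick check of the statement that the seven ambiguous sites reported for RHBG-like (234Y, 258M, 285R, 396Y, 417Y, 537M, and 747Y) all fall on 3rd codon positions, the sketch below maps each site to its codon position and expands its IUPAC ambiguity code. It assumes that site numbers count from the first nucleotide of the start codon (position 1); that numbering convention and the helper names are assumptions made for illustration.

```python
# IUPAC two-base ambiguity codes that occur at the reported sites.
IUPAC = {"Y": ("C", "T"), "R": ("A", "G"), "M": ("A", "C")}

# Ambiguous sites reported for RHBG-like (coordinate followed by IUPAC code).
SITES = ["234Y", "258M", "285R", "396Y", "417Y", "537M", "747Y"]

def codon_position(site: int) -> int:
    """Return 1, 2, or 3 for a 1-based coding-sequence coordinate."""
    return (site - 1) % 3 + 1

def describe(site_label: str):
    """Split a label such as '234Y' into (coordinate, codon position, possible bases)."""
    pos, code = int(site_label[:-1]), site_label[-1]
    return pos, codon_position(pos), IUPAC[code]

for label in SITES:
    pos, cpos, bases = describe(label)
    print(f"site {pos}: codon position {cpos}, bases {'/'.join(bases)}")
```

Whether each ambiguity is synonymous still depends on the full codon, which this sketch does not inspect.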

2.2. Phylogenetic analyses

Sequence data retrieved from GenBank and Ensembl are listed in Supplementary Table S2. Multiple alignments were performed using ClustalW2 (Larkin et al., 2007), MAFFT (Katoh and Standley, 2013), MUSCLE (Edgar, 2004), and T-Coffee (Notredame et al., 2000) software. Amino acid sequences from exon 2 to exon 7 (exon numbering follows human RHD) were used, because these regions are conserved well enough to construct reliable multiple alignments (Kitano et al., 2010). A multiple alignment was made for each exon, and the per-exon alignments were then concatenated into one multiple alignment. trimAl (Capella-Gutiérrez et al., 2009) software with the ‘strict’ option was used to remove poorly aligned regions from the four multiple alignments produced by ClustalW2, MAFFT, MUSCLE, and T-Coffee. Regions that were flagged as poorly aligned in even one of the four multiple alignments were excluded from the phylogenetic inferences. Nucleotide sequence data were back-translated from the trimmed amino acid multiple alignments. Moreover, to avoid the effects of saturation, homoplasy, and GC content bias, we used only nucleotide sequence data from the 1st and 2nd codon positions for the phylogenetic analyses. Qiu et al. (2011) showed extremely high GC content in the 3rd codon positions in various protein-coding nucleotide sequences of the sea lamprey (Petromyzon marinus). In the data set for the RH gene family, higher GC contents in the 3rd codon positions of lamprey species were also observed (Supplementary Fig. S1). The Bayesian approach (Huelsenbeck et al., 2001) implemented in the MrBayes version 3.2 software package (Ronquist et al., 2012) with 10,000,000 generations was used for phylogenetic inference. The mixed model and the GTR model were used for amino acid and nucleotide data, respectively.
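The restriction to 1st and 2nd codon positions, and the GC content at 3rd codon positions that motivates it, can be sketched as follows. This is a minimal illustration that assumes an in-frame coding sequence starting at the first codon position; the function names are hypothetical and this is not the pipeline used in the study.

```python
def first_second_positions(cds: str) -> str:
    """Keep only 1st and 2nd codon positions of an in-frame coding sequence.

    Gap characters are kept as they are, so aligned sequences stay aligned.
    """
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)

def gc3(cds: str) -> float:
    """GC fraction at 3rd codon positions, ignoring gaps and ambiguous bases."""
    third = [b for b in cds[2::3].upper() if b in "ACGT"]
    if not third:
        return 0.0
    return sum(b in "GC" for b in third) / len(third)
```

For the toy sequence `ATGGCCAAA`, `first_second_positions` returns `ATGCAA`, and `gc3` is 2/3 because the 3rd positions are G, C, and A.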
To select substitution models, ModelGenerator (Keane et al., 2006) and the Find Best DNA/Protein Models (ML) option implemented in the MEGA5 software package (Tamura et al., 2011) were used for amino acid and nucleotide sequence data. The PhyML 3.0 software package (Guindon et al., 2010) was used to compare the three topologies for genes of the RH gene family. JTT (Jones et al., 1992), WAG (Whelan and Goldman, 2001), and LG (Le and Gascuel, 2008) models were used for the amino acid data. The approximately unbiased test (Shimodaira, 2002) was performed using the CONSEL software package (Shimodaira and Hasegawa, 2001). Phylogenetic networks of the RH gene family were constructed using the neighbor-net method (Bryant and Moulton, 2004) with the JTT model (Jones et al., 1992) for amino acid data, and the GTR model (Tavaré, 1986) for nucleotide data, implemented using the SplitsTree4 software package (Huson and Bryant, 2006).

3. Results and discussion

3.1. RH gene family sequences in the Far Eastern brook lamprey

We sequenced three types of genes of the RH gene family using two Far Eastern brook lamprey individuals, designated w2 and G2 (Supplementary Table S3). We tentatively named these sequences ‘RHBG-like’, ‘RHCG-like1’, and ‘RHCG-like2’. The full cDNA sequence for RHBG-like was determined from individual G2, and those for RHCG-like1 and RHCG-like2 were determined from individual w2. In RHBG-like, the following seven ambiguous nucleotide sites were observed: 234Y, 258M, 285R, 396Y, 417Y, 537M, and 747Y; all of them were on the 3rd

3.2. Phylogenetic positions of genes of the RH gene family in cyclostomes

Fig. 1 shows a phylogenetic tree of genes of the RH gene family constructed from amino acid data using the Bayesian method outlined above. In total, 225 amino acid sites (Supplementary Fig. S2) were used for the phylogenetic analysis. There were four groups: RH, RHAG, RHBG, and RHCG. Three Ciona and two amphioxus sequences were used as out-groups. The three cyclostome sequences determined in this study (FEB_Lamprey_RHBG-like, FEB_Lamprey_RHCG-like1, and FEB_Lamprey_RHCG-like2) and three from databases (S_Lamprey_RHCG, Hagfish_RHBG, and Hagfish_RHCG) were included in the phylogenetic tree. Because the four genes (RH, RHAG, RHBG, and RHCG) of the RH gene family in the human genome are not clustered in a particular region but are spread across different chromosomal locations, and because almost all vertebrates have the four genes of the RH gene family, it is reasonable to assume that the four genes were duplicated by 2R-WGD in the common ancestor of vertebrates, not by a series of tandem duplications. Under this assumption, the four genes are expected to show a symmetrical topology such as ((RH, RHAG), (RHBG, RHCG)). In the phylogenetic tree, however, although the RHBG cluster and the RHCG cluster grouped together, the RH cluster and the RHAG cluster did not clearly group together. FEB_Lamprey_RHCG-like1, FEB_Lamprey_RHCG-like2, and Hagfish_RHCG were located at the common ancestral position of gnathostome RHCG. Therefore, it is reasonable to assume that these three genes are orthologs of gnathostome RHCG. In contrast, the phylogenetic positions of FEB_Lamprey_RHBG-like, Hagfish_RHBG, and S_Lamprey_RHCG were unclear. Although the gene (S_Lamprey_RHCG) from the genome data of sea lamprey (Petromyzon_marinus_7.0) in Ensembl is annotated as RHCG, it does not appear to be an ortholog of gnathostome RHCG, because it did not form a cluster with other RHCG genes.
When a phylogenetic tree was constructed using nucleotide sequence data from the 1st and 2nd codon positions, similar results were obtained (Supplementary Fig. S3). Although cyclostome RHBG-like genes were located at the common ancestral position of gnathostome RHBG and RHCG genes, the probability of this placement is not high (Supplementary Fig. S3). Since the phylogenetic position of cyclostome RHBG-like genes was unclear in the phylogenetic trees (Fig. 1 and Supplementary Fig. S3), it is possible that incompatible informative sites, generated by convergent events, exist in the sequence data. Although the concept of a phylogenetic tree is very simple and has proved to be extremely robust across many studies, phylogenetic trees are less suited to modeling mechanisms of reticulate evolution such as horizontal gene transfer, hybridization, recombination, convergence, and reassortment. Moreover, mechanisms such as incomplete lineage sorting, or complicated patterns of gene duplication and loss, can lead to incompatibilities that cannot be represented in a tree. A phylogenetic network is a generalization of the concept of the phylogenetic tree. In contrast to the phylogenetic tree, a phylogenetic network can show several incompatible tree topologies on a single diagram through the use of reticulations (Huson et al., 2010; Kitano, 2012). To extend our analysis, we constructed a phylogenetic network of genes of the RH gene family (Fig. 2) from amino acid data using the neighbor-net method with a JTT matrix model that included the gamma model (α = 1.486). This model was selected as the most suitable one by the Bayesian approach above.


[Figure 1: phylogenetic tree image. Cluster labels: RHCG, Cyclostome RHCG-like, RHBG, RHAG, RH, and Cyclostome RHBG-like; out-groups: Ciona RHA, RHB, RHC and Amphioxus RHR1, RHR2. Scale bar: 0.1.]

Fig. 1. A phylogenetic tree of RH blood group-related genes in vertebrates (RH, RHAG, RHBG, and RHCG) constructed from amino acid sequence data using the Bayesian method with a JTT matrix-based model that included the gamma model (α = 1.486). Posterior probabilities are shown on each branch. Two amphioxus and three Ciona genes are used as out-groups. FEB_Lamprey: Far Eastern brook lamprey (Lethenteron reissneri); S_Lamprey: Sea lamprey (Petromyzon marinus).

The constructed phylogenetic network (Fig. 2) clearly shows the four gene clusters (RH, RHAG, RHBG, and RHCG) of the RH gene family in vertebrates in the same way as the phylogenetic tree (Fig. 1). In addition, the phylogenetic network suggests a grouping of the RH and RHAG clusters, which was not evident in the phylogenetic tree. Moreover, the phylogenetic network implies that three genes (FEB_Lamprey_RHBG-like, Hagfish_RHBG, and S_Lamprey_RHCG) of the RH gene family from cyclostomes are closer to gnathostome RHAG than to the others. On the other hand, a phylogenetic network constructed from nucleotide sequence data from the 1st and 2nd codon positions implies that those three genes (FEB_Lamprey_RHBG-like, Hagfish_RHBG, and S_Lamprey_RHCG) from cyclostomes are in fact closer to gnathostome RH (Supplementary Fig. S4). To examine the phylogenetic positions of genes of the RH gene family in cyclostomes in more detail, we carried out a maximum likelihood analysis for three possible topologies (Fig. 3). It should be noted that these three topologies satisfy the following conditions: (1) In gnathostomes, the RH gene cluster forms a cluster with the RHAG gene cluster, and the RHBG gene cluster forms a cluster with the RHCG gene cluster (Huang and Peng, 2005; Kitano et al., 2010). (2) Cyclostomes are monophyletic (Blair and Hedges, 2005; Delarbre et al., 2002; Delsuc et al., 2006; Heimberg et al., 2010; Kuraku et al., 1999; Mallatt and Sullivan, 1998; Mallatt and Winchell, 2007; Meyer and Zardoya, 2003; Stock and Whitt, 1992; Takezaki et al., 2003). (3) The well-established phylogeny for gnathostomes (((((((Human, Macaque), (Mouse, Rat)), Opossum), Platypus), Chicken), Xenopus), Takifugu) is used for each gene cluster. (4) Three genes (FEB_Lamprey_RHCG-like1, FEB_Lamprey_RHCG-like2, and Hagfish_RHCG) of cyclostomes are orthologs of gnathostome RHCG. (5) 2R-WGD occurred before the

divergence between cyclostomes and gnathostomes (Kuraku et al., 2009). Table 1 shows the maximum likelihood analyses using the three models. The JTT, LG, and WAG models were chosen as the best model from MrBayes, ModelGenerator, and MEGA, respectively. The result showed that topology 2, wherein cyclostome RHBG-like genes are orthologs of gnathostome RHAG, was the most probable topology, although not to a statistically significant degree (Table 1). Thus, we cannot reject the remaining two possibilities that cyclostome RHBG-like genes are orthologs of gnathostome RHBG or RH. In order to test whether there were any particular regions that supported particular topologies, a window analysis among the three topologies (Fig. 3) was carried out. Fig. 4 shows its results. To make this graph, we compared maximum likelihood values among the three topologies (Fig. 3) for each 50-amino-acid-residue region, excluding gap sites, and plotted log-likelihood differences from the maximum likelihood values. For example, in the region from 1 to 50, the log-likelihood values of topologies 1, 2, and 3 were −1366.39563, −1363.38764, and −1363.38763, respectively. Thus, 0 for topology 3, −0.00001 for topology 2, and −3.00800 for topology 1 were plotted. Then, after sliding over 10 residues, to the region from 11 to 60, the same calculations were performed. This step was repeated until the region from 171 to 220. As a result, although a middle segment of the data, which corresponds to the exons 4–5 regions, supports the cluster of cyclostome RHBG-like genes and gnathostome RHBG genes, the rest of the segment shows lower likelihood values for this clustering than those of other possible clusterings. In contrast, topology 2, which indicates the clustering of cyclostome RHBG-like genes and gnathostome RHAG genes, has predominantly higher likelihood values than others throughout the

[Figure 2: phylogenetic network image. Cluster labels: RH, RHAG, RHBG, RHCG, Cyclostome RHCG-like, and Cyclostome RHBG-like. Scale bar: 0.01.]

Fig. 2. A phylogenetic network of RH blood group-related genes in vertebrates (RH, RHAG, RHBG, and RHCG) constructed by the neighbor-net method with a JTT matrix-based model that included the gamma model (α = 1.486), using amino acid sequence data.

data, except for the middle segment. Therefore, it is possible to assume that cyclostome RHBG-like genes have experienced some convergent events in the region around exons 4–5, and that the outcomes from those events have hindered the construction of the true topology. Alternatively, it might be possible to explain the observed pattern by exon shuffling. However, because the four genes (RH, RHAG, RHBG, and RHCG) of the RH gene family in vertebrates are not clustered in a particular region but are spread across different chromosomal locations, it is unlikely that exon shuffling occurred among these four genes. However, because the genomic sequences of the RH gene family of the lamprey have not been determined yet, further genomic sequencing analyses will be necessary to consider whether exon shuffling affects the RH gene family of the lamprey or not. When we compared amino acid sequences between the RHBG-like and RHCG-like genes of cyclostomes (Fig. 5), we were able to observe relatively high homology between a region made of part of exon 2 and

[Figure 3, Topology 1, leaf order: RH, RHAG, RHBG, cycRHBGL, RHCG, cycRHCGL]

a region from the end of exon 4 to the beginning of exon 5, which correspond to transmembrane domains 1 and 6, respectively. It has been suggested that some amino acid residues on these regions participate in channel activity (Callebaut et al., 2006). Thus, we can assume the following scenario. (1) Cyclostome RHBG-like genes were intrinsically orthologs of gnathostome RH or RHAG. (2) The original RHBG genes of cyclostomes have been lost from the genome of the common ancestor of cyclostomes. (3) Cyclostome RHBG-like proteins needed to have RHBG-like or RHCG-like transport activities instead of the activity of the original cyclostome RHBG. (4) Some convergent amino acid changes, which caused genes to resemble RHBG or RHCG, have occurred in parts of cyclostome RHBG-like genes. (5) As a result, the phylogenetic position of cyclostome RHBG-like genes moved away from being neighbors of gnathostome RH or RHAG in the phylogenetic tree. RH and RHAG are mainly expressed on erythrocytes, whereas RHBG and RHCG are mainly expressed on non-erythroid organs. Moreover, RHBG and

[Figure 3, Topology 2, leaf order: RH, RHAG, cycRHBGL, RHBG, RHCG, cycRHCGL]

[Figure 3, Topology 3, leaf order: RH, cycRHBGL, RHAG, RHBG, RHCG, cycRHCGL]

Fig. 3. Three possible topologies for the maximum likelihood analysis. RH, RHAG, RHBG, and RHCG indicate gnathostome genes. cycRHBGL: cyclostome RHBG-like genes; cycRHCGL: cyclostome RHCG-like genes. White and black diamonds indicate positions of 1R-WGD and 2R-WGD, respectively.
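The three candidate topologies can be written down compactly as nested pairs, which makes it easy to read off which gnathostome gene is placed as sister to the cyclostome RHBG-like genes in each case. The encoding below is an assumed simplification (one leaf per gene cluster, branching order inferred from the clustering described in the text and Table 1), and the helper names are hypothetical.

```python
# Assumed cluster-level encoding of the three topologies in Fig. 3.
TOPOLOGIES = {
    1: (("RH", "RHAG"), (("RHBG", "cycRHBGL"), ("RHCG", "cycRHCGL"))),
    2: (("RH", ("RHAG", "cycRHBGL")), ("RHBG", ("RHCG", "cycRHCGL"))),
    3: ((("RH", "cycRHBGL"), "RHAG"), ("RHBG", ("RHCG", "cycRHCGL"))),
}

def to_newick(node) -> str:
    """Serialize a nested-pair tree as a Newick string (without the final ';')."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"

def sister_of(node, target):
    """Return the sibling subtree of leaf `target`, or None if it is absent."""
    if isinstance(node, str):
        return None
    left, right = node
    if left == target:
        return right
    if right == target:
        return left
    return sister_of(left, target) or sister_of(right, target)

for number, tree in TOPOLOGIES.items():
    print(number, to_newick(tree) + ";", "sister:", sister_of(tree, "cycRHBGL"))
```

Under this encoding the sister of cycRHBGL is RHBG, RHAG, and RH in topologies 1, 2, and 3, respectively, while cycRHCGL stays with RHCG throughout.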


In conclusion, although our results suggested that cyclostome RHCG-like genes are orthologs of gnathostome RHCG, the true phylogenetic position of cyclostome RHBG-like genes could not be conclusively defined. To elucidate the orthology of cyclostome RHBG-like genes, further work should include analyses of a greater number of RHBG-like genes from other cyclostome species, synteny information for the RH gene family obtained from genome sequence data of cyclostome species, and integrated phylogenetic analyses that include neighboring genes.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2014.04.015.

Table 1
Results of the approximately unbiased test.

Topology(a)   cycRHCGL(b)   cycRHBGL(b)   JTT      LG       WAG
1             RHCG          RHBG          0.073    0.094    0.078
2             RHCG          RHAG          0.678*   0.822*   0.793*
3             RHCG          RH            0.546    0.349    0.397

P-values of the approximately unbiased test (Shimodaira, 2002) are shown. The maximum likelihood estimate for each model is shown by an asterisk (*).
(a) See Fig. 3.
(b) cycRHCGL and cycRHBGL indicate the clustering of cyclostome RHCG-like and RHBG-like genes, respectively, with orthologous gnathostome genes.
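The reading of Table 1 given in the text, namely that topology 2 is favored but no topology can be rejected, can be checked with a few lines of code. The sketch below uses the P-values from Table 1; the 0.05 significance level is the conventional threshold and is an assumption here, since the text does not state the level used.

```python
# AU-test P-values from Table 1, keyed by topology number and substitution model.
AU_P = {
    1: {"JTT": 0.073, "LG": 0.094, "WAG": 0.078},
    2: {"JTT": 0.678, "LG": 0.822, "WAG": 0.793},
    3: {"JTT": 0.546, "LG": 0.349, "WAG": 0.397},
}

ALPHA = 0.05  # assumed significance level, not stated in the text

def rejected(p_values, alpha=ALPHA):
    """Topologies whose P-value is below alpha under every model."""
    return [t for t, by_model in p_values.items()
            if all(p < alpha for p in by_model.values())]

def best_supported(p_values, model):
    """Topology with the highest AU P-value under the given model."""
    return max(p_values, key=lambda t: p_values[t][model])

print("rejected:", rejected(AU_P))
print({m: best_supported(AU_P, m) for m in ("JTT", "LG", "WAG")})
```

With these values no topology is rejected at the assumed level, and topology 2 has the highest P-value under all three models, which coincides with the asterisked maximum likelihood topology.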

RHCG are mainly expressed on gills in fish species (Nakada et al., 2007; Wright and Wood, 2009). Thus, it might be assumed that there is some functional differentiation between the RH–RHAG and RHBG–RHCG pairs.

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

[Figure 4: line plot. x-axis: amino acid sites of each 50-residue window (1–50, 11–60, ..., 171–220); y-axis: lnL difference from ML (0 to −8); series: Topology 1, Topology 2, Topology 3.]

Fig. 4. Distribution of log-likelihood differences from the maximum likelihood values for each topology (see Fig. 3) by the sliding window analyses. The window size of amino acid data is 50 residues and the sliding range is 10 residues. Log-likelihood values were estimated by the JTT model.
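The bookkeeping behind the sliding window analysis can be sketched as follows: enumerate 50-residue windows that advance by 10 residues, then, within each window, subtract the best log-likelihood from the value for each topology. The per-window likelihoods themselves would come from a maximum likelihood program; the sketch covers only the windowing and the differences, and the function names and the default of 220 sites (taken from the last window, 171–220) are assumptions.

```python
def window_bounds(n_sites=220, size=50, step=10):
    """1-based inclusive (start, end) pairs for a sliding window over the sites."""
    return [(start, start + size - 1)
            for start in range(1, n_sites - size + 2, step)]

def lnl_differences(lnl_by_topology):
    """Log-likelihood difference of each topology from the best one in a window."""
    best = max(lnl_by_topology.values())
    return {t: round(lnl - best, 5) for t, lnl in lnl_by_topology.items()}

windows = window_bounds()
print(len(windows), windows[0], windows[-1])

# Values reported in the text for the first window (sites 1 to 50).
first_window = {1: -1366.39563, 2: -1363.38764, 3: -1363.38763}
print(lnl_differences(first_window))
```

For the first window this reproduces the plotted values quoted in the text: 0 for topology 3, −0.00001 for topology 2, and −3.00800 for topology 1.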

Block 1 (TM1–TM4; exon 2 | exon 3)
Hagfish_RHBG            FQDVHVMIFIGFGFLMTFLKRYSFSSVGFNFLIAAFGLQWAVLLQGWLHHFDSSTMKIHIGMEG|MINADFCTAAVLISFGAVLGKTSPLQLLVMTVLEEVFFALN
S_Lamprey_RHCG          FQDVHVMVFVGFGFLMTFLQRYGYSSVGFNFMIAGFGIQWAILVQGWVQHFDSNTGTIKVNLES|MINADFCSASVLISFGAVLGKTSPVQLLIMTVIEELIFSVN
FEB_Lamprey_RHBG-Like   FQDVHVMVFVGFGFLMTFLQRYGYSSVGFNFMIAGFGIQWAILVQGWVQHFDSVTGTIKVGLES|MINADFCSAAVLISFGAVLGKTSPVQLLIMTVIEELIFAVN
Hagfish_RHCG            FQDVHVMVFMGFGFLMTFLQRYGFSAVGFNFLVASFSLQWATLMQGWFHHFQ--DGKILVGVES|LINADFCAASMLIAFGAVLGRTRPVQLLIMAFFQVTLFSVN
FEB_Lamprey_RHCG-Like1  FQDVHVMIFIGFGFLMTFLKRYGFTSVGATFFLAAFGIQWATLMQGWFWHLGP-DGKIRVGVIN|MINADFCVGSVLIAFGALLGKTTPVQLLFMALPQITLYAVN
FEB_Lamprey_RHCG-Like2  FQDVHVMIFIGFGFLMTFLKRYGFSSVGFNFMLAAFGIQWATLMQGWFEHLGP-DGKIRVGVIN|MINADFCTASVLIAFGAVLGKTTPVQMLFMAIFQVTLFAVN

Block 2 (TM5–TM7; end of exon 3 | exon 4 | exon 5)
Hagfish_RHBG            EHIAIGILQ|VNDAGGSMVIHLFGAYFGLAVSRVMFRDGLHGETHEKEGSVYHSDVFAMI|GTIFLWMFWPSFNSAIAGHGDDQMRAAMNTYFSLAASLVATFAIS
S_Lamprey_RHCG          EFVGIYILR|VKDAGGSMIIHIFGAYFGLTVTRMLYRKGL-EEGHPKEGAVYHSDLFAMI|GTLFLWMFWPSFNSAIALHGDDQMRAVMHTYFSLASCVLTTIVIS
FEB_Lamprey_RHBG-Like   EFVGIYILR|VKDAGGSMIIHIFGAYFGLTVTRMLYRKGL-EEGHPKEGAVYHSDLFAMI|GTLFLWMFWPSFNSAIALHGDDQMRAVMHTYFSLASCVLTTIVIS
Hagfish_RHCG            EYILLNLLE|VIDARGSMTIHCFGGFFGLAVSRVLYRPGL-KEPHRKASSVYHSDLFAMI|GTLFLWIYWPSFNSAISEKGVNQTRAVINTYYTLASCTVTTCILS
FEB_Lamprey_RHCG-Like1  EYIVLHLLR|CNDAGGSMTIHTFGAYFGLAVSRVLYRPGL-KNGHPKNGSVYHSDVFAMI|GTIFLFLFWPSFNSAISAAGADQHRASINTYYSLMASVVMTYAIS
FEB_Lamprey_RHCG-Like2  EFIILHPLH|CNDAGGSMTIHTFGCYFGLAVSRVLYRPGL-KEGHPKNGSVYHSDLFAMI|GTLYLWMYWPSFNSAISVTGADQHRAAINTYYTLAACVVVTVAMS

Block 3 (TM8–TM10; end of exon 5 | exon 6 | exon 7)
Hagfish_RHBG            SLTSREGKLDM---|VHIQNASLAGGVAVGTCADMMINPYGAIMIGFCSGIVSTLGFKYLA|PLLESKLKVHDTCGVHNLHGMPGILGGLSGAIAAAFASEEVYGLS
S_Lamprey_RHCG          SLLDNHGKLDMVPP|VHIQNASLAGGVAVGTCGDMMLSPYGALILGFIAAIISTFGFKYLT|PILASKLKIQDTCGVHNLHGLPGIVAGIAGAVVAACASEEVYGFS
FEB_Lamprey_RHBG-Like   SLLDNHGKLDM---|VHIQNASLAGGVAVGTCGDMMLSPYGALILGFIAAIISTFGFKYLT|PVLASKLKIQDTCGVHNLHGLPGILAGIAGAVVAACASEQVYGFS
Hagfish_RHCG            SLVDKSGRINM---|VHLQSSTLAGAVAVGTAAEMMLTPYGSLIVGLILGTLSTLGYTFIT|PALEKYLHVQDTCGIHNLHALPGFCGGIIGAITAAAASEATYGSR
FEB_Lamprey_RHCG-Like1  SLTDKHGKLDM---|VHIQNATLAGGVAMGTAGEMMVTPYGSLIVGFIAGTISTLGFKYLS|PLLDSKLKIQDTCGIHNLHAMPGFIGGIVGAVTAATAPNSGTYGL
FEB_Lamprey_RHCG-Like2  SLLDKRGRLDM---|VHIQNATLAGGVAMGTAGEMMITAYGALIVGFITGIVSTLGFKYLT|PFMASKLKIQDTCGIHNLHGMPGIIGGITGAITAATAPTSGTYGE

Fig. 5. A multiple alignment of exons 2–7 of RH blood group-related genes of cyclostomes. Vertical bars (|) mark exon boundaries. Transmembrane domains are boxed, and some amino acid sites that participate in the channel function of the proteins are highlighted by gray backgrounds. Domains and functional sites are drawn following Callebaut et al. (2006). Regions that were considered poorly aligned and excluded from phylogenetic inferences are shown in gray (see Supplementary Fig. S2).


Acknowledgments

We are grateful to Yoshiyuki Nagahata for helping with the capture of Far Eastern brook lamprey samples. We thank anonymous reviewers and the editor for their valuable comments. This study was supported by a Grant-in-Aid to TK for Scientific Research from the Japan Society for Promotion of Science, Grant Numbers 18770002, 23770269.

References

Avent, N.D., Ridgwell, K., Tanner, M.J., Anstee, D.J., 1990. cDNA cloning of a 30 kDa erythrocyte membrane protein associated with Rh (Rhesus)-blood-group-antigen expression. Biochemical Journal 271, 821–825.
Avent, N.D., Butcher, S.K., Liu, W., Mawby, W.J., Mallinson, G., Parsons, S.F., Anstee, D.J., Tanner, M.J., 1992. Localization of the C termini of the Rh (rhesus) polypeptides to the cytoplasmic face of the human erythrocyte membrane. Journal of Biological Chemistry 267, 15134–15139.
Blair, J.E., Hedges, S.B., 2005. Molecular phylogeny and divergence times of deuterostome animals. Molecular Biology and Evolution 22, 2275–2284.
Bryant, D., Moulton, V., 2004. Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Molecular Biology and Evolution 21, 255–265.
Callebaut, I., Dulin, F., Bertrand, O., Ripoche, P., Mouro, I., Colin, Y., Mornon, J.P., Cartron, J.P., 2006. Hydrophobic cluster analysis and modeling of the human Rh protein three-dimensional structures. Transfusion Clinique et Biologique 13, 70–84.
Capella-Gutiérrez, S., Silla-Martínez, J.M., Gabaldón, T., 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973.
Chérif-Zahar, B., Bloy, C., Le Van Kim, C., Blanchard, D., Bailly, P., Hermand, P., Salmon, C., Cartron, J.P., Colin, Y., 1990. Molecular cloning and protein structure of a human blood group Rh polypeptide. Proceedings of the National Academy of Sciences of the United States of America 87, 6243–6247.
Chérif-Zahar, B., Mattéi, M.G., Le Van Kim, C., Bailly, P., Cartron, J.P., Colin, Y., 1991. Localization of the human Rh blood group gene structure to chromosome region 1p34.3–1p36.1 by in situ hybridization. Human Genetics 86, 398–400.
Delarbre, C., Gallut, C., Barriel, V., Janvier, P., Gachelin, G., 2002. Complete mitochondrial DNA of the hagfish, Eptatretus burgeri: the comparative analysis of mitochondrial DNA sequences strongly supports the cyclostome monophyly. Molecular Phylogenetics and Evolution 22, 184–192.
Delsuc, F., Brinkmann, H., Chourrout, D., Philippe, H., 2006. Tunicates and not cephalochordates are the closest living relatives of vertebrates. Nature 439, 965–968.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792–1797.
Ewing, B., Hillier, L., Wendl, M.C., Green, P., 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research 8, 175–185.
Gordon, D., Abajian, C., Green, P., 1998. Consed: a graphical tool for sequence finishing. Genome Research 8, 195–202.
Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., Gascuel, O., 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biology 59, 307–321.
Hedges, S.B., Dudley, J., Kumar, S., 2006. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics 22, 2971–2972.
Heimberg, A.M., Cowper-Sal-lari, R., Sémon, M., Donoghue, P.C., Peterson, K.J., 2010. microRNAs reveal the interrelationships of hagfish, lampreys, and gnathostomes and the nature of the ancestral vertebrate. Proceedings of the National Academy of Sciences of the United States of America 107, 19379–19383.
Huang, C.H., Peng, J., 2005. Evolutionary conservation and diversification of Rh family genes and proteins. Proceedings of the National Academy of Sciences of the United States of America 102, 15512–15517.
Huelsenbeck, J.P., Ronquist, F., Nielsen, R., Bollback, J.P., 2001. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–2314.
Huson, D.H., Bryant, D., 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23, 254–267.
Huson, D.H., Rupp, R., Scornavacca, C., 2010. Phylogenetic Networks: Concepts Algorithms and Applications. Cambridge University Press, Cambridge.
Jones, D.T., Taylor, W.R., Thornton, J.M., 1992. The rapid generation of mutation data matrices from protein sequences. Computer Applications in the Biosciences 8, 275–282.

Katoh, K., Standley, D.M., 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30, 772–780.
Keane, T.M., Creevey, C.J., Pentony, M.M., Naughton, T.J., McInerney, J.O., 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evolutionary Biology 6, 29.
Kitano, T., 2012. Application of phylogenetic network. In: Hirai, H., Imai, H., Go, Y. (Eds.), Post-Genome Biology of Primates. Springer, pp. 181–190.
Kitano, T., Satou, M., Saitou, N., 2010. Evolution of two Rh blood group-related genes of the amphioxus species Branchiostoma floridae. Genes & Genetic Systems 85, 121–127.
Kuraku, S., Hoshiyama, D., Katoh, K., Suga, H., Miyata, T., 1999. Monophyly of lampreys and hagfishes supported by nuclear DNA-coded genes. Journal of Molecular Evolution 49, 729–735.
Kuraku, S., Meyer, A., Kuratani, S., 2009. Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Molecular Biology and Evolution 26, 47–59.
Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Le, S.Q., Gascuel, O., 2008. An improved general amino acid replacement matrix. Molecular Biology and Evolution 25, 1307–1320.
Mallatt, J., Sullivan, J., 1998. 28S and 18S rDNA sequences support the monophyly of lampreys and hagfishes. Molecular Biology and Evolution 15, 1706–1718.
Mallatt, J., Winchell, C.J., 2007. Ribosomal RNA genes and deuterostome phylogeny revisited: more cyclostomes, elasmobranchs, reptiles, and a brittle star. Molecular Phylogenetics and Evolution 43, 1005–1022.
Marini, A.M., Matassi, G., Raynal, V., André, B., Cartron, J.P., Chérif-Zahar, B., 2000. The human Rhesus-associated RhAG protein and a kidney homologue promote ammonium transport in yeast. Nature Genetics 26, 341–344.
Meyer, A., Zardoya, R., 2003. Recent advances in the (molecular) phylogeny of vertebrates. Annual Review of Ecology and Systematics 34, 311–338.
Nakada, T., Westhoff, C.M., Kato, A., Hirose, S., 2007. Ammonia secretion from fish gill depends on a set of Rh glycoproteins. FASEB Journal 21, 1067–1074.
Notredame, C., Higgins, D.G., Heringa, J., 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302, 205–217.
Ohno, S., 1970. Evolution by Gene Duplication. Springer Verlag, New York.
Peng, J., Huang, C.H., 2006. Rh proteins vs Amt proteins: an organismal and phylogenetic perspective on CO2 and NH3 gas channels. Transfusion Clinique et Biologique 13, 85–94.
Qiu, H., Hildebrand, F., Kuraku, S., Meyer, A., 2011. Unresolved orthology and peculiar coding sequence properties of lamprey genes: the KCNA gene family as test case. BMC Genomics 12, 325.
Ronquist, F., Teslenko, M., van der Mark, P., Ayres, D.L., Darling, A., Höhna, S., Larget, B., Liu, L., Suchard, M.A., Huelsenbeck, J.P., 2012. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology 61, 539–542.
Ruddle, F., Ricciuti, F., McMorris, F.A., Darlington, G., Chen, T., 1972. Somatic cell genetic assignment of peptidase C and the Rh linkage group to chromosome A-1 in man. Science 176, 1429–1431.
Shimodaira, H., 2002. An approximately unbiased test of phylogenetic tree selection. Systematic Biology 51, 492–508.
Shimodaira, H., Hasegawa, M., 2001. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17, 1246–1247.
Stock, D.W., Whitt, G.S., 1992. Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257, 787–789.
Takezaki, N., Figueroa, F., Zaleska-Rutczynska, Z., Klein, J., 2003. Molecular phylogeny of early vertebrates: monophyly of the agnathans as revealed by sequences of 35 genes. Molecular Biology and Evolution 20, 287–292.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular Biology and Evolution 28, 2731–2739.
Tavaré, S., 1986. Some probabilistic and statistical problems on the analysis of DNA sequences. Lectures on Mathematics in the Life Sciences 17, 57–86.
Whelan, S., Goldman, N., 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution 18, 691–699.
Wright, P.A., Wood, C.M., 2009. A new paradigm for ammonia excretion in aquatic animals: role of Rhesus (Rh) glycoproteins. Journal of Experimental Biology 212, 2303–2312.
