The complete mitochondrial genome of the endangered Apollo butterfly, Parnassius apollo (Lepidoptera: Papilionidae) and its comparison to other Papilionidae species

Yan-hong Chen, Dun-yuan Huang, Yun-liang Wang, Chao-dong Zhu, Jia-sheng Hao

PII: S0014-4886(14)00197-6
DOI: 10.1016/j.expneurol.2014.06.010
Reference: YEXNR 11762

To appear in:

Experimental Neurology

Please cite this article as: Chen, Yan-hong, Huang, Dun-yuan, Wang, Yunliang, Zhu, Chao-dong, Hao, Jia-sheng, The complete mitochondrial genome of the endangered Apollo butterfly, Parnassius apollo (Lepidoptera: Papilionidae) and its comparison to other Papilionidae species, Experimental Neurology (2014), doi: 10.1016/j.expneurol.2014.06.010

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

The complete mitochondrial genome of the endangered Apollo butterfly, Parnassius apollo (Lepidoptera: Papilionidae) and its comparison to other Papilionidae species

Chen Yan-hong1, Huang Dun-yuan1,2,3, Wang Yun-liang1, Zhu Chao-dong2, Hao Jia-sheng1

1. Laboratory of Molecular Evolution and Biodiversity, College of Life Sciences, Anhui Normal University, Wuhu, P. R. China
2. Institute of Zoology, Chinese Academy of Sciences, Beijing, P. R. China
3. College of Forestry, Jiangxi Environmental Engineering Vocational College, Ganzhou, P. R. China

Abstract


The Apollo butterfly, Parnassius apollo, is a representative species of the butterfly subfamily Parnassiinae and one of the most endangered butterfly species in the world. In this study, we sequenced its complete mitochondrial genome (mitogenome), with the aim of accumulating genetic information for further studies of population genetics and mitogenome evolution in the Papilionidae. The 15,404-bp mitogenome harbors a typical set of 37 genes and is the largest butterfly mitogenome determined to date, except for that of Papilio maraho (16,094 bp). As in many other sequenced lepidopteran species, one tRNATrp-like and one tRNALeu(UUR)-like sequence were detected in the AT-rich region. A total of 164 bp of non-coding sequence is dispersed across 14 regions throughout the genome. The longest intergenic spacer (68 bp) is located between tRNASer(AGN) and tRNAGlu, and is the largest spacer at this location among Papilionidae species. This spacer may have resulted from an 8-fold repetition of a TTTCTTCT motif or a 4-fold repetition of a CTTTATTT motif.

Key words: Parnassiinae; Parnassius apollo; Mitochondrial genome; tRNA-like sequence; Intergenic spacer

Introduction

The Apollo butterfly, Parnassius apollo, is distributed mainly in the mountainous areas of Europe and northwestern China. This beautiful species is one of the largest Parnassius butterflies, with a wingspan of about 70 mm (Carter, 2000). Adults are decorated with large black eye-spots on the forewings and red eye-spots on the hind wings. The size and color of these striking eye-spots vary with habitat, and a variety of subspecies have evolved in different areas. Owing mainly to over-collection, habitat loss, destruction of its host plants (Sedum and Sempervivum species) and climate change, the Apollo butterfly is now endangered in some of its habitats (Collins and Morris, 1985). It has consequently been listed in Appendix II of CITES (Collins and Morris, 1985; Still, 1996), assessed as vulnerable (VU) on the IUCN Red List (Gimenez, 1996), and listed as a Grade II protected species by the Chinese government.

The typical insect mitochondrial genome (mitogenome) is a circular molecule about 15-16 kb long that contains 37 genes, namely 13 protein coding genes (PCGs), 22 transfer RNA (tRNA) genes and 2 ribosomal RNA (rRNA) genes, together with a non-coding region (i.e., the control region or AT-rich region) (Wolstenholme, 1992; Boore, 1999). Because of its maternal inheritance, strict orthology, lack of recombination and accelerated evolutionary rate compared to the nuclear genome, the mitogenome has become popular in comparative and evolutionary genomics, molecular evolution, phylogenetics, and population genetics (Nardi et al., 2003, 2005; Simon et al., 2006; Cameron, 2014). To date, 45 complete or nearly complete mitogenome sequences of true butterflies (superfamily Papilionoidea) have been reported, including 8 species from the family Papilionidae. Within the subfamily Parnassiinae, only two complete mitogenome sequences are available (Kim et al., 2009; Ji et al., 2012). More mitogenomic data from representative species, especially endangered ones, are therefore important for studies of Papilionoidea phylogeny and ecology.

In this study, the complete mitogenome sequence of P. apollo was determined using long PCR and conserved primer walking, and the sequence was analyzed to determine gene arrangement, nucleotide composition and secondary structures, with the aim of providing key molecular data for studies of its population genetics, conservation biology, molecular ecology and historical biogeography. Furthermore, the mitogenome sequence was compared with those of the other available Papilionidae species to improve our understanding of the molecular evolution of Papilionidae mitogenomes.

Materials and methods

Sample collection and DNA extraction

An adult individual of P. apollo was collected in the Tianshan Mountains, Xinjiang, China, in July 2012. After collection, the sample was immediately preserved in 100% ethanol and stored at -20°C until DNA extraction. Total genomic DNA was extracted from thorax muscle using a DNA extraction kit (Sangon, China) according to the manufacturer's instructions.

PCR amplification and sequencing

To sequence the full-length mitogenome of P. apollo, 13 primer pairs were used to amplify 6 short fragments and 7 long fragments (Fig. 1). Three short fragments (SF2, SF4, SF5) and one long fragment (LF4) were amplified using universal PCR primers from Caterino and Sperling (1999), Simmons and Weller (2001), and Zhao et al. (2013). The other primers, including those for the AT-rich region, were designed from multiple sequence alignments of known lepidopteran mitochondrial sequences, using Clustal X 1.83 (Thompson et al., 1997) and Primer Premier 5.0 (Singh et al., 1998) (Table 1).

Fig. 1. Circular map of the mitochondrial genome of Parnassius apollo. The abbreviations for the genes are as follows: COI, COII, and COIII refer to the cytochrome oxidase subunits, CytB refers to cytochrome B, ATP6 and ATP8 refer to subunits 6 and 8 of F0 ATPase, and ND1-6 refer to components of NADH dehydrogenase. tRNAs are indicated by the IUPAC-IUB single letter amino acid codes, while L1, L2, S1 and S2 denote tRNALeu(CUN), tRNALeu(UUR), tRNASer(AGN) and tRNASer(UCN), respectively. Gene names that are not underlined indicate transcription on the majority strand, whereas underlined names indicate transcription on the minority strand. The P. apollo mitogenome was sequenced from 6 short fragments (SF1-SF6) and 7 long fragments (LF1-LF7), shown as single lines within the circle.


Table 1 List of primers used to amplify and sequence the mitogenome of Parnassius apollo.

Fragment  Primer name        Direction(e)  Sequence (5'-3')              Nucleotide position(f)  Mismatch(g)  Annealing temperature (°C)
Short fragments
SF1       ND2-F(a)           F             CGTTCATTTCTATTTCAGC           298-316                 2            47.3
          ND2-R(a)           R             ACACCACCTATTGTTCCTA           718-736                 1
SF2       k698(b)            F             TACAATTTATCGCCTAAACTTCAGCC    1699-1721               3            46.9
          k807(b)            R             TGAAAATGAGCTACAACATAATA       2548-2570               0
SF3       COIII-F(a)         F             ATCTCAATGATGACGAGAT           4859-4878               1            46.8
          COIII-R(a)         R             CAAATCCAAAATGGTGAGTA          5386-5405               3
SF4       ND5-F(c)           F             AAAACTTCCAGAAAATAATCTC        6786-6807               5            46.5
          ND5-R(c)           R             TTGCTTTATCTACTTTAAGACA        7261-7282               1
SF5       REVCB2H(d)         F             TGAGGACAAATATCATTTTGAGGW      10895-10918             1            45.0
          REVCBJ(d)          R             ACTGGTCGAGCTCCAATTCATGT       11498-11520             3
SF6       lrRNA-F(a)         F             TACGCTGTCATCCCTAA             12976-12992             1            47.5
          lrRNA-R(a)         R             AAGTCTAATCTGCCCAC             13337-13353             0
Long fragments
LF1       ND2-COI-F(a)       F             CCCTTTCATTTCTGATTCC           564-582                 1            51.1
          ND2-COI-R(a)       R             ACTGTTCGTCCTGTTCCT            1803-1820               2
LF2       COI-COIII-F(a)     F             TCACAAGAAAGTGGAAAA            2212-2229               0            47.2
          COI-COIII-R(a)     R             TCTCTCATCGTAAGCCT             4929-4945               3
LF3       COIII-ND5-F(a)     F             GCTGATAGTATTTATGGTTC          5263-5282               0            47.7
          COIII-ND5-R(a)     R             TTGTATGTGCTGGAGTT             7152-7168               1
LF4       ND5-CytB-F(c)      F             AATTATACCAGCACATAT            7149-7166               3            47.1
          ND5-CytB-R(c)      R             TTATCGACTGCAAATC              10992-11007             2
LF5       CytB-lrRNA-F(a)    F             TCCTGCTAACCCTTTAGTCA          11263-11282             3            49.2
          CytB-lrRNA-R(a)    R             GAGTATTTTGTTGGGGT             13082-13098             0
LF6       lrRNA-srRNA-F(a)   F             CTGGGGTCTTCTCGTCT             13158-13174             2            51.6
          lrRNA-srRNA-R(a)   R             GCAATAAGTTGGCGGTA             14495-14511             3
LF7       srRNA-ND2-F(a)     F             GAAACACTTTCCAGTACCT           14139-14157             0            49.7
          srRNA-ND2-R(a)     R             CTAAACCAATTCAACATCC           330-348                 1

(a) Primers newly designed for this genome. (b) Primers from Caterino and Sperling (1999). (c) Primers from Zhao et al. (2013). (d) Primers from Simmons and Weller (2001). (e) F and R, forward and reverse direction of transcription. (f) Nucleotide positions are given with respect to the Parnassius apollo mitogenome. (g) Mismatches are given with respect to the Parnassius apollo mitogenome.

Short-fragment PCR was performed using Taq DNA polymerase (Takara Bio, Otsu, Shiga, Japan). Each reaction was carried out in a 50 µL mixture (10× buffer 6.0 µL, 2.5 mmol/L MgCl2 8.0 µL, 0.2 µg/µL BSA 5.0 µL, 2.5 mmol/L dNTPs 1.5 µL, 0.1 µmol/L primers (both directions) 1.8 µL, 1.0 unit Taq DNA polymerase and 1.5 µL of template DNA). The cycling parameters were as follows: 1 min at 94°C; 35 cycles of 1 min at 94°C, 1 min at 45-48°C and 2-2.5 min at 72°C; and a final elongation of 10 min at 72°C. Long-fragment PCR was performed using LA Taq DNA polymerase (Takara Bio, Otsu, Shiga, Japan), and each reaction was likewise carried out in a 50 µL mixture (10× LA PCR Buffer I (Mg2+ plus) 5 µL, 2.5 mmol/L MgCl2 3 µL, 2.5 mmol/L dNTP mix 8.0 µL, 0.1 µmol/L primers (both directions) 1.5 µL, 1.0 unit LA Taq DNA polymerase and 1.5 µL of template DNA). The cycling parameters were as follows: 5 min at 95°C; 30 cycles of 55 s at 95°C, 2 min at 47-52°C and 2-2.5 min at 68°C; and a final elongation of 10 min at 68°C. The PCR products were separated by electrophoresis in a 1.2% agarose gel and purified using a DNA gel extraction kit (TaKaRa). All PCR fragments were sequenced directly after purification with the QIAquick PCR Purification Kit (QIAGEN), and all fragments were sequenced on both strands by primer walking.

Data analysis


The sequences obtained were assembled and edited using BioEdit version 4.8.9 (Hall, 1999). PCGs and rRNA genes were identified by comparing their similarity to published insect mitochondrial sequences using ClustalX 1.8 (Thompson et al., 1997) and MEGA 5.0 (Tamura et al., 2011). The predicted secondary structures of both lrRNA and srRNA were drawn according to models proposed for these genes in other insects (Gillespie et al., 2006; Cameron and Whiting, 2008). The secondary structures of the tRNA genes were predicted with the aid of tRNAscan-SE 1.21, using invertebrate codon predictors and a COVE score cut-off of 1 (Lowe and Eddy, 1997). The tRNAs not found by tRNAscan-SE were identified by comparing the P. apollo nucleotide sequence with the regions coding for these tRNAs in other insects. All tRNAs were folded by hand, using the tRNAscan-SE output as a template when possible. Nucleotide composition was calculated using MEGA 5.0 (Tamura et al., 2011). Bias in nucleotide composition was measured as AT-skew = (A% - T%)/(A% + T%) and GC-skew = (G% - C%)/(G% + C%) (Perna and Kocher, 1995). The AT-rich region was delimited by aligning the sequence with homologous regions of known full-length insect mitogenome sequences, and tandem repeats in the AT-rich region were predicted with the online Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) (Benson, 1999). The complete mtDNA sequence of P. apollo was deposited in GenBank under accession no. KF746065.
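The skew statistics defined above can be illustrated with a short sketch. It is not the MEGA-based procedure used in this study; the function name `nucleotide_skews` and the toy sequence are assumptions introduced only for illustration, and the toy sequence is not P. apollo data.

```python
from collections import Counter


def nucleotide_skews(seq):
    """Return (A+T fraction, AT-skew, GC-skew) for a DNA string.

    AT-skew = (A - T) / (A + T) and GC-skew = (G - C) / (G + C),
    following the definitions of Perna and Kocher (1995).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_content = (a + t) / (a + t + g + c)
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_content, at_skew, gc_skew


# Hypothetical toy sequence (2 A, 3 T, 1 G, 3 C), not real mitogenome data.
toy = "AATTTGCCC"
at_content, at_skew, gc_skew = nucleotide_skews(toy)
print(round(at_content, 3), round(at_skew, 3), round(gc_skew, 3))
```

A negative AT-skew means that T outnumbers A on the strand analyzed, and a negative GC-skew means that C outnumbers G.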

Results and Discussion

Gene structure, organization and composition


The P. apollo mitogenome contains the 37 genes typical of insects (13 PCGs, 2 rRNA genes and 22 tRNA genes) and one non-coding AT-rich region (control region) (Fig. 1, Table 2). As in many other insect mitogenomes, the major strand codes for 23 genes (9 PCGs and 14 tRNAs) and the AT-rich region, while the minor strand codes for the remaining 14 genes (4 PCGs, 8 tRNAs and 2 rRNA genes). Although its size is nearly the same as that of the available congeneric mitogenome, P. bremeri (15,389 bp), the 15,404-bp genome is the second largest among the Papilionidae butterflies, after Papilio maraho (16,094 bp) (Table 3). The gene order and orientation are similar to those of the inferred ancestral hexapod (Boore et al., 1998; Crease, 1999), with the exception of the arrangement of tRNAs between the AT-rich region and the ND2 gene. This arrangement (M-I-Q) is found in nearly all lepidopterans, whereas the insect ground plan arrangement is I-Q-M (Taylor et al., 1993; Cao et al., 2012; Cameron, 2014). As in other Papilionidae butterflies, the nucleotide composition of the entire P. apollo mitogenome is strongly biased; its A+T content (81.3%) is the highest among Papilionidae species, equal to that of P. bremeri (Table 3). The overall AT- and GC-skews of the P. apollo mitogenome (measured on the majority strand) are -0.016 and -0.187, respectively, indicating that more Ts and Cs than As and Gs are used (Table 3). This is similar to the skew statistics of other Papilionidae butterfly species, which have negligible AT-skew values (-0.040 to 0.006) and moderate GC-skew values (-0.262 to -0.191).
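The skew ranges quoted for the other Papilionidae species can be re-derived from the values reported in Table 3. The sketch below simply copies those values into a dictionary; the variable names are illustrative assumptions, not part of the original analysis.

```python
# (AT-skew, GC-skew) values copied from Table 3 for the Papilionidae species
# other than P. apollo; the dictionary itself is only an illustration.
other_species = {
    "Papilio bianor": (-0.015, -0.210),
    "Papilio maraho": (0.006, -0.262),
    "Papilio maackii": (-0.014, -0.212),
    "Papilio machaon": (-0.031, -0.198),
    "Troides aeacus": (-0.040, -0.232),
    "Teinopalpus aureus": (-0.005, -0.238),
    "Parnassius bremeri": (-0.011, -0.191),
    "Sericinus montela": (-0.009, -0.221),
}

at_values = [at for at, _ in other_species.values()]
gc_values = [gc for _, gc in other_species.values()]

at_range = (min(at_values), max(at_values))
gc_range = (min(gc_values), max(gc_values))
print("AT-skew range:", at_range)
print("GC-skew range:", gc_range)
```

With these inputs, the P. apollo AT-skew (-0.016) falls inside the AT-skew range of the other species, while its GC-skew (-0.187) lies just above the upper end of their GC-skew range.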

The protein coding genes

The P. apollo mitogenome harbors 13 protein coding genes (PCGs), which collectively comprise 3,720 codons, exclusive of the termination codons. This codon number is identical to that of Papilio machaon (Table 3). Of the 13 PCGs, nine are encoded on the J-strand (ATP6, ATP8, COI, COII, COIII, CytB, ND2, ND3, ND6), while the other four are encoded on the N-strand (ND1, ND4, ND4L, ND5). All PCGs are initiated by typical ATN codons (ATP8, ND2, ND3 and ND5 with ATT; ATP6, COII, COIII, CytB, ND1, ND4 and ND4L with ATG; ND6 with ATC), except for the COI gene, which uses CGA as its start codon. No canonical ATN initiator was found at the start site of the COI gene. The only plausible traditional start codon for COI is an ATC located within the tRNATyr gene, which would cause a 25-bp overlap with tRNATyr. Using this ATC would add nine amino acids and result in a peculiar alignment compared with other lepidopteran species. Moreover, this triplet is followed by an in-frame stop codon (TAG) in the beginning region of the COI gene. Consequently, this ATC is unlikely to be the start codon of the COI gene, and there are no other probable start codons for P. apollo COI. By these criteria, the first non-overlapping codon in the COI gene is CGA, which encodes arginine and lies in a region that is highly conserved in most lepidopteran insects (Cameron and Whiting, 2008; Hao et al., 2013).

Table 2 Organization of the Parnassius apollo mitogenome.

Gene            Direction  Position      Size (bp)  Intergenic length  Anticodon          Start codon  Stop codon
tRNAMet         F          1-69          69         0                  33-35 CAT          -            -
tRNAIle         F          70-133        64         0                  99-101 GAT         -            -
tRNAGln         R          131-199       69         -3                 167-169 TTG        -            -
ND2             F          240-1253      1014       40                 -                  ATT          TAA
tRNATrp         F          1253-1318     66         -1                 1283-1285 TCA      -            -
tRNACys         R          1311-1376     66         -8                 1345-1347 GCA      -            -
tRNATyr         R          1381-1444     64         4                  1412-1414 GTA      -            -
COI             F          1447-2977     1531       2                  -                  CGA          T-tRNA
tRNALeu(UUR)    F          2978-3044     67         0                  3008-3010 TAA      -            -
COII            F          3045-3726     682        0                  -                  ATG          T-tRNA
tRNALys         F          3727-3797     71         0                  3757-3759 CTT      -            -
tRNAAsp         F          3797-3863     67         -1                 3828-3830 GTC      -            -
ATP8            F          3864-4025     162        0                  -                  ATT          TAA
ATP6            F          4019-4696     678        -7                 -                  ATG          TAA
COIII           F          4696-5484     789        -1                 -                  ATG          TAA
tRNAGly         F          5488-5554     67         3                  5518-5520 TCC      -            -
ND3             F          5555-5908     354        0                  -                  ATT          TAA
tRNAAla         F          5908-5973     66         -1                 5938-5940 TGC      -            -
tRNAArg         F          5973-6038     66         -1                 6002-6004 TCG      -            -
tRNAAsn         F          6039-6105     67         0                  6069-6071 GTT      -            -
tRNASer(AGN)    F          6109-6169     61         3                  6030-6032 GCT      -            -
tRNAGlu         F          6238-6303     66         68                 6269-6271 TTC      -            -
tRNAPhe         R          6302-6367     66         -2                 6333-6335 GAA      -            -
ND5             R          6369-8102     1734       1                  -                  ATT          TAA
tRNAHis         R          8103-8167     65         0                  8135-8137 GTG      -            -
ND4             R          8167-9507     1341       -1                 -                  ATG          TAA
ND4L            R          9507-9797     291        5                  -                  ATG          TAA
tRNAThr         F          9800-9866     67         2                  9832-9834 TGT      -            -
tRNAPro         R          9867-9931     65         0                  9989-9991 TGG      -            -
ND6             F          9934-10464    531        2                  -                  ATC          TAA
CytB            F          10481-11629   1149       16                 -                  ATG          TAA
tRNASer(UCN)    F          11631-11697   67         1                  11660-11662 TGA    -            -
ND1             R          11714-12652   939        16                 -                  ATG          TAG
tRNALeu(CUN)    R          12654-12722   69         1                  12690-12692 TAG    -            -
lrRNA           R          12723-14064   1342       0                  -                  -            -
tRNAVal         R          14065-14129   65         0                  14097-14099 TAC    -            -
srRNA           R          14130-14900   771        0                  -                  -            -
AT-rich region             14901-15404   504        0                  -                  -            -

F = forward; R = reverse.

Table 3 Characteristics of the mitogenomes of Papilionidae species.

                     Mitogenome (majority strand)           PCG(b)                  tRNA               rRNA               AT-rich region
Taxon                Size (bp)  AT (%)  AT-skew  GC-skew    No. codons(a)  AT (%)   Size (bp)  AT (%)  Size (bp)  AT (%)  Size (bp)  AT (%)  GenBank accession no.  References
Papilionidae
Papilioninae
Papilio bianor       15,340     80.6    -0.015   -0.210     3,719          79.0     1,453      81.4    2,097      84.2    498        94.0    NC018040               Xu et al., unpublished
Papilio maraho       16,094     80.5    0.006    -0.262     3,717          78.1     1,442      80.7    2,112      84.4    1,270      94.3    FJ810212               Wu et al. (2010)
Papilio maackii      15,357     80.7    -0.014   -0.212     3,721          79.2     1,452      81.4    2,100      84.3    514        92.8    NC021411               Dong et al. (2013)
Papilio machaon      15,185     80.3    -0.031   -0.198     3,720          79.0     1,446      81.4    2,092      83.8    362        92.5    HM243594               Xu et al., unpublished
Troides aeacus       15,263     80.2    -0.040   -0.232     3,724          79.0     1,472      80.6    2,018      83.9    419        89.8    EU625344               Jiang et al., unpublished
Teinopalpus aureus   15,242     79.9    -0.005   -0.238     3,719          78.3     1,455      81.2    2,101      83.7    395        93.2    HM563681               Qin et al. (2012)
Parnassiinae
Parnassius apollo    15,404     81.3    -0.016   -0.187     3,720          80.1     1,460      81.5    2,113      84.5    504        93.8    KF746065               This study
Parnassius bremeri   15,389     81.3    -0.011   -0.191     3,722          80.2     1,462      80.9    2,117      84.4    504        93.7    FJ871125               Kim et al. (2009)
Sericinus montela    15,242     80.9    -0.009   -0.221     3,691          79.8     1,457      81.6    2,097      83.9    408        94.1    HQ259122               Ji et al. (2012)

(a) Termination codons were excluded from the total codon count. (b) Protein coding genes. Bar (-) indicates lack of sequence information on the AT-rich region in the genome.

Eleven genes have complete termination codons, either TAA (ATP6, ATP8, COIII, CytB, ND2, ND3, ND4, ND4L, ND5, ND6) or TAG (ND1), while the remaining two genes (COI and COII) end with the incomplete termination codon T. Such partial termination codons (i.e., T or TA) are observed in all sequenced lepidopteran insects and have been interpreted in terms of post-transcriptional polyadenylation, by which "A" residue(s) are added to create a TAA terminator (Kim et al., 2009). The relative synonymous codon usage (RSCU) analysis showed that codons with A or T at the third position are consistently overused compared to their synonymous codons. For example, the codon TTG (Leu) is used only 1.61 times per 1000 codons, corresponding to an RSCU of 0.07, whereas its synonymous codon TTA (Leu) is heavily overused (134 per 1000), corresponding to an RSCU of 5.42 (Table 4). This trend has also been noted in other insect mitogenomes, indicating a general bias toward A and T nucleotides in the PCGs (Cameron and Whiting, 2007; Nelson et al., 2012). In addition, the NC1000 statistics showed that TTT (Phe), TTA (Leu), ATT (Ile), ATA (Met), and AAT (Asn) are the five most frequently used codons, accounting for 48.36% of all codons (Table 4). Similar cases have been detected in other lepidopterans (data not shown).
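As a worked illustration of the RSCU statistic, the sketch below recomputes a few of the Table 4 values from the NC1000 figures. It uses the standard definition (observed usage of a codon divided by the mean usage of all synonymous codons for that amino acid) and assumes that the six leucine codons are pooled into one family, which reproduces the tabulated values. The function name `rscu` is an illustrative assumption and is not taken from the software used in this study.

```python
def rscu(usage):
    """Compute RSCU for one synonymous codon family.

    `usage` maps each codon of a single amino acid to its usage
    (raw counts or NC1000 values); RSCU = n * x_i / sum(x),
    where n is the number of synonymous codons in the family.
    """
    total = sum(usage.values())
    n = len(usage)
    return {codon: n * value / total for codon, value in usage.items()}


# NC1000 values copied from Table 4.
phe = {"TTT": 90.86, "TTC": 7.80}
# Assumption: all six Leu codons are treated as one synonymous family.
leu = {"TTA": 133.87, "TTG": 1.61, "CTT": 6.72,
       "CTC": 0.54, "CTA": 5.38, "CTG": 0.00}

phe_rscu = rscu(phe)
leu_rscu = rscu(leu)
print({codon: round(value, 2) for codon, value in phe_rscu.items()})
print({codon: round(value, 2) for codon, value in leu_rscu.items()})

# Share of the five most frequent codons (TTT, TTA, ATT, ATA, AAT), in percent.
top_five_share = round(sum([90.86, 133.87, 116.94, 76.61, 65.32]) / 10, 2)
print(top_five_share)
```

By construction, the RSCU values of a family sum to the number of synonymous codons in it, so a value above 1 marks a codon used more often than expected under equal usage.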

Table 4 Relative synonymous codon usage (RSCU) and number of codons per 1000 codons (NC1000) in the protein-coding genes of the Parnassius apollo mitogenome.

Amino acid  Codon  RSCU  NC1000    Amino acid  Codon  RSCU  NC1000    Amino acid  Codon  RSCU  NC1000    Amino acid  Codon  RSCU  NC1000
F           TTT    1.84  90.86     S(UCN)      TCT    2.82  30.38     Y           TAT    1.94  49.46     C           TGT    1.82  8.06
F           TTC    0.16  7.80      S(UCN)      TCC    0.27  2.96      Y           TAC    0.06  1.61      C           TGC    0.18  0.81
L(UUR)      TTA    5.42  133.87    S(UCN)      TCA    2.24  24.19     -           -      -     -         W           TGA    1.98  25.27
L(UUR)      TTG    0.07  1.61      S(UCN)      TCG    0.05  0.54      -           -      -     -         W           TGG    0.02  0.27
L(CUN)      CTT    0.27  6.72      P           CCT    2.40  19.35     H           CAT    1.92  18.28     R           CGT    1.31  4.84
L(CUN)      CTC    0.02  0.54      P           CCC    0.37  2.96      H           CAC    0.08  0.81      R           CGC    0.15  0.54
L(CUN)      CTA    0.22  5.38      P           CCA    1.23  9.95      Q           CAA    1.97  16.13     R           CGA    2.33  8.60
L(CUN)      CTG    0.00  0.00      P           CCG    0.00  0.00      Q           CAG    0.03  0.27      R           CGG    0.22  0.81
I           ATT    1.92  116.94    T           ACT    2.36  24.46     N           AAT    1.87  65.32     S(AGN)      AGT    0.67  7.26
I           ATC    0.08  5.11      T           ACC    0.21  2.15      N           AAC    0.13  4.57      S(AGN)      AGC    0.02  0.27
M           ATA    1.87  76.61     T           ACA    1.40  14.52     K           AAA    1.92  25.81     S(AGN)      AGA    1.92  20.7
M           ATG    0.13  5.38      T           ACG    0.03  0.27      K           AAG    0.08  1.08      S(AGN)      AGG    0.00  0.00
V           GTT    2.05  16.13     A           GCT    2.42  19.35     D           GAT    1.90  15.59     G           GGT    1.06  13.98
V           GTC    0.09  0.81      A           GCC    0.24  1.88      D           GAC    0.10  0.81      G           GGC    0.04  0.54
V           GTA    1.83  15.86     A           GCA    1.34  10.75     E           GAA    1.87  19.35     G           GGA    2.52  33.33
V           GTG    0.03  0.27      A           GCG    0.00  0.00      E           GAG    0.13  1.34      G           GGG    0.39  5.11

Termination codons were excluded from the count due to the uncertainty in many species.

The intergenic spacer sequences

The P. apollo mitogenome includes a total of 164 bp of intergenic spacer sequence, spread over 14 non-coding regions ranging from 1 to 68 bp. The four longest spacers are located between tRNAGln and ND2 (40 bp), tRNASer(AGN) and tRNAGlu (68 bp), ND6 and CytB (16 bp), and tRNASer(UCN) and ND1 (16 bp) (Table 2). The tRNAGln-ND2 spacer has been detected in most other Papilionidae species, with a size range of 40 to 72 bp. This sequence is 70% homologous to the neighboring ND2 gene, suggesting that it may be derived from ND2 (Fig. 2). The tRNASer(AGN)-tRNAGlu spacer is unique to the genus Parnassius; in P. bremeri the corresponding region is 43 bp long (Kim et al., 2009). This spacer appears to be the result of an 8-fold repetition of a TTTCTTCT motif or a 4-fold repetition of a CTTTATTT motif (Fig. 2). The third spacer (16 bp) shows a low level of sequence similarity to its neighboring ND6 and CytB genes and varies considerably in length among the sequenced lepidopterans (data not shown). Finally, the last spacer (16 bp) was detected in all sequenced Papilionidae butterflies and harbors an ATACTAA motif (Fig. 3). Because this spacer is located at the end of the major-strand coding region, the 7-bp motif has been suggested to be a possible binding site for mtTERM, the transcription termination peptide (Cameron and Whiting, 2008; Taanman, 1999).
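The intergenic lengths reported in Table 2 follow directly from the gene coordinates: the gap before a gene is its start position minus the end position of the preceding gene, minus one, with negative values indicating overlaps. A minimal sketch, using only the first eight features of Table 2 (the function name `intergenic_lengths` is an illustrative assumption):

```python
# (name, start, end) for the first eight features of Table 2, in genome order.
features = [
    ("tRNAMet", 1, 69),
    ("tRNAIle", 70, 133),
    ("tRNAGln", 131, 199),
    ("ND2", 240, 1253),
    ("tRNATrp", 1253, 1318),
    ("tRNACys", 1311, 1376),
    ("tRNATyr", 1381, 1444),
    ("COI", 1447, 2977),
]


def intergenic_lengths(ordered_features):
    """Return {gene: gap to the preceding gene}; negative gaps are overlaps."""
    gaps = {}
    for (_, _, prev_end), (name, start, _) in zip(ordered_features,
                                                  ordered_features[1:]):
        gaps[name] = start - prev_end - 1
    return gaps


gaps = intergenic_lengths(features)
spacers = {name: gap for name, gap in gaps.items() if gap > 0}
print(gaps)
print("spacer total in this subset:", sum(spacers.values()))
```

In this subset the 40-bp spacer preceding ND2 is recovered, along with the 8-bp overlap between tRNATrp and tRNACys.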


Fig. 2. Sequences of two relatively large intergenic spacers. (A) Alignment of the spacer sequence located between tRNAGln and ND2 gene, and the neighboring partial ND2 gene of Parnassius apollo. Asterisks indicate consensus sequences in the alignment between the spacer sequence and the ND2 gene. Sequence homology is shown on the right side of the alignment. (B) The intergenic spacer sequence detected between the tRNASer(AGN) and tRNAGlu of Parnassius apollo (68 bp), and the alignment of repeat sequences detected within the intergenic spacer sequence. The nucleotide position is indicated at the beginning and end sites of the sequence.


Fig. 3. Alignment of the intergenic spacer between tRNASer(UCN) and ND1 among all sequenced Papilionidae butterflies, with the ATACTAA motif indicated by the shaded area.

The tRNA genes

There are 22 tRNA genes in the P. apollo mitogenome, ranging in length from 61 bp (tRNASer(AGN)) to 71 bp (tRNALys) (Table 2). The nucleotide composition of these 22 tRNA genes (1,460 bp in total) is AT-biased (81.5%). All of them have the typical clover-leaf secondary structure, with the exception of tRNASer(AGN), which lacks a dihydrouridine (DHU) arm (Fig. 4). Similar cases have been detected in most insects, including all lepidopterans studied to date (Wolstenholme, 1992; Salvato et al., 2008; Hu et al., 2010; Wang et al., 2011; Sun et al., 2012; Zhao et al., 2013). All P. apollo tRNAs have 7-bp aminoacyl stems, 7-nt anticodon loops and 5-bp anticodon stems, but other portions of the tRNAs are variable in length, particularly the DHU and TΨC loops (4-7 nt and 3-9 nt, respectively). A total of 11 mismatched pairs are found in the tRNA stem regions: 6 U-U, 2 A-C, 2 U-C, and one G-A (Fig. 4).

Fig. 4. Predicted clover-leaf secondary structures of the Parnassius apollo tRNA genes.

The rRNA genes

As in all other insect mitogenome sequences, two rRNA genes (a 1,342-bp lrRNA and a 771-bp srRNA) are found in the P. apollo mitogenome. They are located between tRNALeu(CUN) and tRNAVal, and between tRNAVal and the AT-rich region (Table 2), with A+T contents of 84.1% and 85.3%, respectively. Both values are well within the range reported for other lepidopteran insects (Salvato et al., 2008). In the predicted secondary structures, lrRNA contains six domains (labeled I, II, III, IV, V, and VI) with 49 helices, while srRNA contains three domains (labeled I, II, III) with 33 helices (Figs. 5 and 6). The structural characteristics of both lrRNA and srRNA are quite similar to those of their counterparts in Apis mellifera and Manduca sexta (Gillespie et al., 2006; Cameron and Whiting, 2008).

Fig. 5. Predicted secondary structure of the Parnassius apollo lrRNA. Roman numerals denote the conserved domain structure. Helices are numbered according to the annotation systems of Gillespie et al. (2006). Tertiary structures are denoted by boxed bases joined by solid lines. Watson-Crick pairs are joined by dashes, other interactions are joined by plus signs.


Fig. 6. Predicted secondary structure of the Parnassius apollo srRNA. The annotation is the same as in Fig. 5.

The AT-rich region

The 504-bp AT-rich region of the P. apollo mitogenome is located between srRNA and tRNAMet and has a high A+T content (93.8%), well within the range reported for other Papilionidae species (92.5% in Papilio machaon to 94.3% in Papilio maraho) (Table 3). The region is composed mostly of non-repetitive sequences but harbors several structures characteristic of lepidopterans: the putative ON (origin of minority, or light, strand replication), located 22 bp upstream of the 5'-end of the srRNA gene and consisting of the motif ATAGA followed by a 17-bp poly-T stretch; and a microsatellite-like (TA)9 repeat preceded by the ATTTA motif (Fig. 7). Another (TA)9 microsatellite repeat, located 126 bp upstream of the srRNA gene, is found in both Parnassius species (Fig. 7).
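Motifs of the kind described above can be located with simple pattern matching. The sketch below is only an illustration and not the procedure used in this study: the function name `find_at_rich_motifs`, the minimum-length thresholds and the toy sequence are all assumptions, and the toy sequence is synthetic rather than the P. apollo AT-rich region.

```python
import re


def find_at_rich_motifs(seq, min_poly_t=10, min_ta_repeats=7):
    """Locate an ATAGA + poly-T block and an ATTTA + (TA)n block.

    Returns a dict mapping motif name to (0-based start, length of the
    poly-T stretch in bases, or number of TA repeat units).
    The thresholds are illustrative assumptions.
    """
    seq = seq.upper()
    hits = {}
    match = re.search(r"ATAGA(T{%d,})" % min_poly_t, seq)
    if match:
        hits["poly_t"] = (match.start(), len(match.group(1)))
    match = re.search(r"ATTTA((?:TA){%d,})" % min_ta_repeats, seq)
    if match:
        hits["ta_repeat"] = (match.start(), len(match.group(1)) // 2)
    return hits


# Synthetic toy sequence built to contain both motifs (not real data).
toy = "GC" + "ATAGA" + "T" * 17 + "GGC" + "ATTTA" + "TA" * 9 + "CG"
hits = find_at_rich_motifs(toy)
print(hits)
```

On a real AT-rich region the search would need to be run on the appropriate strand, since the motifs are described relative to the srRNA gene.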


Fig. 7. Characteristic sequences of the AT-rich region of Parnassius apollo. (A) The special TA repeat sequences of the AT-rich region in Parnassius apollo and Parnassius bremeri. (B) Sequence of the Parnassius apollo AT-rich region. The shaded areas show the ATAGA motif, poly-T stretch, ATTTA sequence and microsatellite TA repeat sequence. The underlined sequences show the tRNATrp-like sequence and the tRNALeu(UUR)-like sequence. (C) Secondary structures of the tRNATrp-like sequence and the tRNALeu(UUR)-like sequence.


It has previously been demonstrated that tRNA-like sequences within the AT-rich region of mammalian mitogenomes arise from a failure to cleave the tRNA primers from the nascent DNA strand after mitochondrial DNA synthesis, so that the tRNA-like sequences become incorporated into the mitogenome (Cantatore et al., 1987). tRNA-like sequences have since been reported in many insects, including Hymenoptera (Cha et al., 2007), Diptera (Cameron et al., 2007), Lepidoptera (Kim et al., 2009), and Coleoptera (Hong et al., 2009). In P. apollo, one tRNATrp-like and one tRNALeu(UUR)-like sequence are detected in the AT-rich region, as is the case in its congener P. bremeri (Fig. 7).

Acknowledgements:

This work was supported by the National Science Foundation of China (Grant No. 41172004) and the Opening Funds from the State Key Laboratory of Palaeobiology and Stratigraphy, Nanjing Institute of Geology and Palaeontology, Chinese Academy of Sciences (Grant No. 104143).

References Benson, G., 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580.

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

MA NU

SC

RI P

T

Boore, J.L., 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27, 1767–1780.
Boore, J.L., Lavrov, D.V., Brown, W.M., 1998. Gene translocation links insects and crustaceans. Nature 392, 667–668.
Cameron, S.L., 2014. Insect mitochondrial genomics: implications for evolution and phylogeny. Annu. Rev. Entomol. 59, 95–117.
Cameron, S.L., Lambkin, C.L., Barker, S.C., Whiting, M.F., 2007. A mitochondrial genome phylogeny of Diptera: whole genome sequence data accurately resolve relationships over broad timescales with high precision. Syst. Entomol. 32, 40–59.
Cameron, S.L., Whiting, M.F., 2007. Mitochondrial genomic comparisons of the subterranean termites from the genus Reticulitermes (Insecta: Isoptera: Rhinotermitidae). Genome 50, 188–202.
Cameron, S.L., Whiting, M.F., 2008. The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. Gene 408, 112–123.
Cantatore, P., Gadaleta, M.N., Roberti, M., Saccone, C., Wilson, A.C., 1987. Duplication and remodeling of tRNA genes during the evolutionary rearrangement of mitochondrial genomes. Nature 329, 853–855.
Cao, Y.Q., Ma, C., Chen, J.Y., Yang, D.R., 2012. The complete mitochondrial genomes of two ghost moths, Thitarodes renzhiensis and Thitarodes yunnanensis: the ancestral gene arrangement in Lepidoptera. BMC Genomics 13, 276.
Carter, D., 2000. Butterflies and Moths. Dorling Kindersley, London.
Caterino, M.S., Sperling, F.A.H., 1999. Papilio phylogeny based on mitochondrial cytochrome oxidase I and II genes. Mol. Phylogenet. Evol. 11, 122–137.
Cha, S.Y., Yoon, H.J., Lee, E.M., Yoon, M.H., Hwang, J.S., Jin, B.R., Han, Y.S., Kim, I., 2007. The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae). Gene 392, 206–220.
Collins, N.M., Morris, M.G., 1985. Threatened swallowtail butterflies of the world. IUCN, Gland, Switzerland and Cambridge, UK.
Crease, T., 1999. The complete sequence of the mitochondrial genome of Daphnia pulex (Cladocera: Crustacea). Gene 233, 89–99.
Dong, Y., Zhu, L.X., Wu, Y.F., Wu, X.B., 2013. The complete mitochondrial genome of the Alpine black swallowtail, Papilio maackii (Insecta: Lepidoptera: Papilionidae). Mitochondrial DNA 24, 639–641.
Gillespie, J.J., Johnston, J.S., Cannone, J.J., Gutell, R.R., 2006. Characteristics of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) rRNA genes of Apis mellifera (Insecta: Hymenoptera): structure, organization and retrotransposable elements. Insect Mol. Biol. 15, 657–686.
Gimenez, D.M., 1996. Parnassius apollo. In: IUCN 2013. IUCN Red List of Threatened Species. Version 2013.1. Assessed 1 August 1996.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.
Hao, J.S., Sun, M.E., Sun, X.Y., Shao, L.L., Yang, Q., 2013. Complete mitogenomes of Euploea mulciber (Nymphalidae: Danainae) and Libythea celtis (Nymphalidae: Libytheinae) and their phylogenetic implications. ISRN Genomics 491636, 1–14.
Hong, M.Y., Jeong, H.C., Kim, M.J., Jeong, H.U., Lee, S.H., Kim, I., 2009. Complete mitogenome sequence of the jewel beetle, Chrysochroa fulgidissima (Coleoptera:

Buprestidae). Mitochondrial DNA 20, 46–60.
Hu, J., Zhang, D.X., Hao, J.S., Huang, D.Y., Cameron, S., Zhu, C.D., 2010. The complete mitochondrial genome of the yellow coaster, Acraea issoria (Lepidoptera: Nymphalidae: Heliconiinae: Acraeini): sequence, gene organization and a unique tRNA translocation event. Mol. Biol. Rep. 37, 3431–3438.
Ji, L.W., Hao, J.S., Wang, Y., Huang, D.Y., Zhao, J.L., Zhu, C.D., 2012. The complete mitochondrial genome of the dragon swallowtail, Sericinus montela Gray (Lepidoptera: Papilionidae) and its phylogenetic implication. Acta Entomol. Sin. 55, 91–100.
Kim, M.I., Baek, J.Y., Kim, M.J., Jeong, H.C., Kim, K.G., Bae, C.H., Han, Y.S., Jin, B.R., Kim, I., 2009. Complete nucleotide sequence and organization of the mitogenome of the red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) and comparison with other lepidopteran insects. Mol. Cells 28, 347–363.
Lowe, T.M., Eddy, S.R., 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964.
Nardi, F., Carapelli, A., Dallai, R., Roderick, G.K., Frati, F., 2005. Population structure and colonization history of the olive fly Bactrocera oleae (Diptera: Tephritidae). Mol. Ecol. 14, 2729–2738.
Nardi, F., Spinsanti, G., Boore, J.L., Carapelli, A., Dallai, R., Frati, F., et al., 2003. Hexapod origins: monophyletic or paraphyletic? Science 299, 1887–1889.
Nelson, L.A., Lambkin, C.L., Batterham, P., Wallman, J.F., Dowton, M., Whiting, M.F., Yeates, D.K., Cameron, S.L., 2012. Beyond barcoding: a mitochondrial genomics approach to molecular phylogenetics and diagnostics of blowflies (Diptera: Calliphoridae). Gene 511, 131–142.
Perna, N.T., Kocher, T.D., 1995. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J. Mol. Evol. 41, 353–358.
Qin, F., Jiang, G.F., Zhou, S.Y., 2013. Complete mitochondrial genome of the Teinopalpus aureus guangxiensis (Lepidoptera: Papilionidae) and related phylogenetic analyses. Mitochondrial DNA 23, 123–125.
Salvato, P., Simonato, M., Battisti, A., Negrisolo, E., 2008. The complete mitochondrial genome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera, Notodontidae). BMC Genomics 9, 331.
Simmons, R.B., Weller, S.J., 2001. Utility and evolution of cytochrome b in insects. Mol. Phylogenet. Evol. 20, 196–210.
Simon, C., Buckley, T.R., Frati, F., Stewart, J.B., Beckenbach, A.T., 2006. Incorporating molecular evolution into phylogenetic analysis and a new compilation of conserved polymerase chain reaction primers for animal mitochondrial DNA. Annu. Rev. Ecol. Evol. Syst. 37, 545–579.
Singh, V.K., Mangalam, A.K., Dwivedi, S., Naik, S., 1998. Primer premier: program for design of degenerate primers from a protein sequence. BioTechniques 24, 318–319.
Still, J., 1996. Butterflies and Moths of Britain and Europe. Harper Collins, London.
Sun, Q.Q., Sun, X.Y., Wang, X.C., Gai, Y.H., Hu, J., Zhu, C.D., Hao, J.S., 2012. Complete sequence of the mitochondrial genome of the Japanese buff-tip moth, Phalera flavescens (Lepidoptera: Notodontidae). Genet. Mol. Res. 11, 4213–4225.
Taanman, J.W., 1999. The mitochondrial genome: structure, transcription, translation and replication. Biochim. Biophys. Acta 1410, 103–123.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., et al., 2011. MEGA5: molecular evolutionary

genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Taylor, M.F., McKechnie, S.W., Pierce, N., Kreitman, M., 1993. The lepidopteran mitochondrial control region: structure and evolution. Mol. Biol. Evol. 10, 1259–1272.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Wang, X.C., Sun, X.Y., Sun, Q.Q., Zhang, D.X., Hu, J., Yang, Q., Hao, J.S., 2011. The complete mitochondrial genome of the laced fritillary Argyreus hyperbius (Lepidoptera: Nymphalidae). Zool. Res. 32, 465–475.
Wolstenholme, D.R., 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141, 173–216.
Wu, L.W., Lees, D.C., Yen, S.H., Hsu, Y.F., 2010. The complete mitochondrial genome of the near-threatened swallowtail, Agehana maraho (Lepidoptera: Papilionidae): evaluating sequence variability and suitable markers for conservation genetic studies. Entomol. News 121, 267–280.
Zhao, F., Huang, D.Y., Shi, Q.H., Hao, J.S., Sun, X.Y., Zhang, L.L., Yang, Q., 2013. The first mitochondrial genome for the butterfly family Riodinidae and its systematic implications. Zool. Res. 34, 109–119.
