JIPB

Journal of Integrative Plant Biology

Prospects for discriminating Zingiberaceae species in India using DNA barcodes

Meenakshi Ramaswamy Vinitha1, Unnikrishnan Suresh Kumar2, Kizhakkethil Aishwarya3, Mamiyil Sabu3 and George Thomas1*

Research Article

1Plant Molecular Biology, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, Kerala 695014, India
2Regional Facility for DNA Fingerprinting, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, Kerala 695014, India
3Department of Botany, University of Calicut, Calicut, Kerala 673635, India
*Correspondence: [email protected]

Abstract

We evaluated nine plastid (matK, rbcL, rpoC1, rpoB, rpl36‐rps8, ndhJ, trnL‐F, trnH‐psbA, accD) and two nuclear (ITS and ITS2) barcode loci in family Zingiberaceae by analyzing 60 accessions of 20 species belonging to seven genera from India. Bidirectional sequences were recovered for every plastid locus by direct sequencing of polymerase chain reaction (PCR) amplicons in all the accessions tested. However, only 35 (58%) and 40 accessions (66%) yielded ITS and ITS2 sequences, respectively, by direct sequencing. In different bioinformatics analyses, matK and rbcL consistently resolved 15 species (75%) into monophyletic groups and five species into two paraphyletic groups. The 173 ITS sequences, including 138 cloned sequences from 23 accessions, discriminated only 12 species (60%), and the remaining species fell into three paraphyletic groups. Phylogenetic and genealogic analyses of plastid and ITS sequences imply that natural hybridizations in the evolutionary past may have given rise to the species paraphyly and intragenomic ITS heterogeneity observed in the species tested. The results support using the matK and rbcL loci for barcoding Zingiberaceae members, highlight the poor utility of ITS and the highly regarded ITS2 in barcoding this family, and caution against proposing ITS loci for barcoding taxa based on limited sampling.

Keywords: Concerted evolution; DNA barcoding; ITS heterogeneity; natural hybridizations; Zingiberaceae
Citation: Vinitha MR, Suresh Kumar U, Aishwarya K, Sabu M, Thomas G (2014) Prospects for discriminating Zingiberaceae species in India using DNA barcodes. J Integr Plant Biol 56: 760–773. doi: 10.1111/jipb.12189
Edited by: Catherine Kidner, University of Edinburgh, UK
Received Nov. 21, 2013; Accepted Feb. 20, 2014
Available online on Feb. 26, 2014 at www.wileyonlinelibrary.com/journal/jipb
© 2014 Institute of Botany, Chinese Academy of Sciences
August 2014 | Volume 56 | Issue 8 | 760–773
www.jipb.net

INTRODUCTION

Zingiberaceae (gingers), a pantropic family with its center of distribution in the Indo‐Malayan region, consists of approximately 53 genera and 1,200 species (Kress et al. 2002; Sabu 2006). Most gingers are rhizomatous, and the rhizomes (underground stems) of several gingers are used as spices, vegetables, nutraceuticals, drugs, or indispensable ingredients in traditional medicines in Southeast Asian countries (Narayana et al. 2000; Kala 2005; Tushar et al. 2010; Shi et al. 2011; Yob et al. 2011). The capsules (fruits) of both the small cardamom (Elettaria cardamomum Maton) and large cardamom (Amomum subulatum Roxb.) are important spices in the global market. Many other Zingiberaceae such as Amomum spp., Alpinia spp., and Kaempferia spp. are ornamentals or landscape plants. India is rich in Zingiberaceae, and an account of Indian Zingiberaceae can be found online at http://www.gingersofindia.com/, a site maintained by one of the authors (M.S.).

Species identification is exceedingly difficult in Zingiberaceae due to narrow differences in morphological characteristics between species, the plants’ phenotypic plasticity, and their short and seasonal flowering cycles. Assigning a rhizome sample to a species is even more difficult, especially after processing by drying, chipping, or powdering.

The medicinal gingers are attractive commodities to the pharmaceutical industry in India, China, and other Southeast Asian countries (Li and Qin 1998; Soontornchainaksaeng and Jenjittikul 2010; Yob et al. 2011). However, ensuring quality, efficacy, and compositional consistency of herbal drugs depends on the genuineness of the raw materials used (Patwardhan et al. 2005; Li et al. 2011). According to Li et al. (2011), three factors negatively affect the quality of herbal materials used in the pharmaceutical industry: scanty morphological descriptors to identify authentic materials; use of the same common name for different species; and intentional mixing with inexpensive herbs. DNA barcoding has great potential in Zingiberaceae, both for the confirmation of raw materials in the pharmaceutical sector (Srirama et al. 2010; Li et al. 2011; Xue and Li 2011) and as a tool in conservation biology and ecology.

Success in DNA barcoding depends on the choice of a DNA region (barcode locus) with an optimum level of “universality” in terms of polymerase chain reaction (PCR) amplification, sequencing, and sequence alignment, and “resolvability” with respect to the ability of its sequence to differentiate conspecifics from congenerics (Hollingsworth et al. 2011). Despite considerable debate over the last 8 years, there is little consensus regarding the choice of loci for barcoding plants. The focus in the plant barcoding initiative so far has been to screen the nuclear ribosomal DNA (nrDNA) internal transcribed spacer (ITS) and coding and noncoding plastid loci to identify suitable loci for barcoding a taxon (Hollingsworth et al. 2011). Each plant barcode locus has different strengths and weaknesses, and their resolvability and universality vary considerably between taxa (CBOL Plant Working Group 2009; Hollingsworth et al. 2011).

Success in species identification using DNA barcoding is considered to be low in taxa that experienced natural hybridization or reticulate evolution (Fazekas et al. 2009; Spooner 2009; Roy et al. 2010). However, the reasons for poor barcode success in a taxon are seldom analyzed in depth. In‐depth analysis of the causes of variation in barcode success between plant lineages may help to derive new hypotheses regarding the choice of barcode loci for a taxon.

Although the prospect of developing a universal barcode database in plants is relatively bleak, recent reports suggest the possibility of developing sub‐databases in a taxon‐based manner. Candidate loci have been recently proposed for barcoding the species of Asteraceae (Gao et al. 2010a), Lemnaceae (Wang et al. 2010), Pinaceae (Ran et al. 2010), Meliaceae (Muellner et al. 2011), Fabaceae (Gao et al. 2010b), Lamiaceae (Theodoridis et al. 2012), and Araliaceae (Liu et al. 2012). Shi et al. (2011) evaluated the universality and resolvability of five plastid loci and nrDNA ITS2 in Zingiberaceae members in China and found ITS2 to be the most promising locus.

Despite their vast diversity and utility, species of family Zingiberaceae are seldom subjected to genetic and molecular biology research (Gao et al. 2006; Xia et al. 2009). Little information is available on the performance of known barcode loci on Indian members of this family. Because the genetic architecture of different populations of a species may vary by geography (Dobes et al. 2004; Kikuchi et al. 2010), there is a need to evaluate the putative loci proposed for barcoding a species by analyzing samples from different geographic regions before adopting such loci for practical applications.
In this study, we performed a rigorous evaluation of the specimen identification potential of nine plastid loci and two nuclear loci (ITS and ITS2) in a subset of Indian Zingiberaceae. We also assessed the relationship between barcode success and the evolutionary events hypothesized to have occurred in Zingiberaceae members.

RESULTS

Amplification and sequence characteristics

Every barcode locus tested in the study yielded the expected amplicon in all the accessions analyzed. However, only the plastid loci yielded good‐quality bidirectional sequences by direct sequencing. Of the 60 accessions, the nuclear ITS and ITS2 sequences were recovered by direct sequencing in only 35 and 40 accessions, respectively (Table S1). In eight species, each of the three accessions sampled yielded ITS sequence by direct sequencing, whereas only two accessions in Alpinia malaccensis, one accession each in Amomum pterocarpum, A. subulatum, Hedychium flavescens, Globba orixensis, and Zingiber odoriferum, and no accessions in Kaempferia galanga, K. rotunda, Zingiber parishii, Curcuma amada, C. caesia, and C. longa yielded ITS sequence by direct sequencing. Multiple overlapping peaks were found in the electropherograms of the accessions that did not yield ITS data by direct sequencing.

To understand the pattern of ITS variation and the role of this variation in determining sequence recovery (universality) and resolvability, we cloned the ITS amplicons from 23 of the 25 accessions that did not yield ITS sequence by direct sequencing. At least five clones were sequenced from each of the 23 amplicons, generating 138 ITS sequences altogether. Because the accessions belonging to the genus Curcuma yielded highly divergent ITS sequences in our cloning experiments, and Indian Curcuma is also reported to be highly divergent at ITS loci (Zaveska et al. 2012), we cloned ITS amplicons from only seven of the nine Curcuma accessions included in the study. Sequences obtained from a single species were aligned, and 21 duplicate sequences were removed. Thus, the study comprised a total of 152 ITS sequences, including the 35 sequences obtained by direct sequencing (Table S1). No attempt was made to clone ITS2 amplicons from accessions that did not yield sequence information by direct sequencing; instead, ITS2 segments were excised from ITS sequences and used in phylogenetic tree construction.

Sequence characteristics of the nine plastid and two nuclear loci are given in Table 1. The amplicon size varied from 300 bp in accD to 925 bp in matK, whereas in ITS, ITS2, and trnH‐psbA it ranged between 650–750, 275–325, and 370–400 bp, respectively. The aligned length varied from 236 bp for accD to 922 bp for rbcL. Variable sites ranged from 3.8% for trnH‐psbA and rpoC1 to 54.7% for ITS2. The sequences of rpoC1, trnH‐psbA, and accD were the most conserved (96.2%) among the loci analyzed when the sequence length and number of conserved sites were taken into consideration. ITS2 had the highest nucleotide variation (54.7%), followed by ITS (44.8%), trnL‐F (16.7%), and matK (11.4%). Parsimony informative sites ranged from 3.8% (trnH‐psbA, rpoC1) to 49% (ITS2). The highest number of singleton sites was recorded for ITS (6.6%), whereas five loci had no singleton sites (Table 1). The mean inter‐ and intraspecific variability for each locus is given in Table 2.
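
The percentages quoted in this paragraph can be checked against the site counts in Table 1. The following is a minimal sketch, assuming that each percentage is the site count divided by the aligned length (an assumption inferred from the reported numbers, not a method stated by the authors). The names `TABLE1_SITES` and `percent_of_aligned` are hypothetical and introduced only for illustration.

```python
# Site counts for the single loci, transcribed from Table 1:
# locus -> (aligned length in bp, variable sites, singleton sites)
TABLE1_SITES = {
    "matK": (718, 82, 6),
    "rbcL": (922, 40, 1),
    "ITS": (823, 369, 54),
    "ITS2": (331, 181, 17),
    "rpoC1": (502, 19, 0),
    "rpoB": (517, 42, 2),
    "ndhJ": (343, 16, 0),
    "trnH-psbA": (289, 11, 0),
    "rpl36-rps8": (407, 28, 0),
    "accD": (236, 12, 0),
    "trnL-F": (293, 49, 7),
}


def percent_of_aligned(count, aligned_length):
    """Express a site count as a percentage of the aligned length (assumed denominator)."""
    return round(100.0 * count / aligned_length, 1)


# Percentage of variable and singleton sites per locus.
variable_pct = {
    locus: percent_of_aligned(variable, aligned)
    for locus, (aligned, variable, _singleton) in TABLE1_SITES.items()
}
singleton_pct = {
    locus: percent_of_aligned(singleton, aligned)
    for locus, (aligned, _variable, singleton) in TABLE1_SITES.items()
}

# Print loci from most to least variable.
for locus, pct in sorted(variable_pct.items(), key=lambda item: item[1], reverse=True):
    print(f"{locus:12s} {pct:5.1f}% variable sites")
```

Under this assumption the sketch reproduces the values quoted in the text for ITS2, ITS, trnL‐F, matK, trnH‐psbA, and rpoC1, as well as the singleton percentage for ITS.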

Table 1. Amplicon size of plastid and nuclear barcode loci in Zingiberaceae species and the sequence characteristics, singly and in different multi‐locus combinations

Barcode target                Amplicon size (bp)  Aligned length (bp)  Conserved sites  Variable sites  Parsimony informative sites  Singleton sites
matK                          925                 718                  636              82              76                           6
rbcL                          950                 922                  882              40              39                           1
ITS(a)                        650–750             823                  433              369             315                          54
ITS2(b)                       275–325             331                  160              181             164                          17
rpoC1                         620                 502                  483              19              19                           0
rpoB                          570                 517                  475              42              40                           2
ndhJ                          420                 343                  327              16              16                           0
trnH‐psbA                     370–400             289                  278              11              11                           0
rpl36‐rps8                    550                 407                  369              28              28                           0
accD                          300                 236                  224              12              12                           0
trnL‐F                        400                 293                  243              49              42                           7
matK + rbcL                   -                   1,640                1,518            122             115                          7
matK + rbcL + rpoC1           -                   2,142                2,001            141             134                          7
matK + rbcL + rpoB            -                   2,157                1,993            164             155                          9
matK + rbcL + ndhJ            -                   1,983                1,845            138             131                          7
matK + rbcL + trnH‐psbA       -                   1,929                1,796            133             126                          7
matK + rbcL + rpl36‐rps8      -                   2,047                1,887            150             143                          7
matK + rbcL + accD            -                   1,876                1,742            134             127                          7
matK + rbcL + trnL‐F          -                   1,933                1,761            171             157                          14
Nine plastid targets          -                   4,227                3,917            299             283                          16

(a) Analysis included the sequences obtained by both direct sequencing and cloning.
(b) Analysis included the sequences obtained by direct sequencing and by excising from ITS sequences generated by cloning.

The ITS2 locus yielded the highest mean percentage for both inter‐ (18.8% ± 3.0%) and intraspecific (1.37% ± 0.37%) variability. Among the plastid loci, matK produced the highest mean interspecific variability (2.4% ± 0.5%), with a mean intraspecific variability of 0.06% ± 0.03%. rpoC1 and accD yielded the lowest interspecific variability (0.8% ± 0.4%), with no variability at the intraspecific level. rpl36‐rps8 also showed no intraspecific variability, but the interspecific variability of this locus was relatively high (1.3% ± 0.5%). The interspecific variability of rbcL was relatively low (0.9% ± 0.3%), as was its intraspecific variability (0.01% ± 0.01%).

Barcode gap

In the bar graph method, the level of intraspecific distance overlapped with the level of interspecific distance for all of the loci examined, suggesting no barcode gap (data not shown). In the dot plot method, the number of dots above (barcode gap) and below (no barcode gap) the 1:1 slope differed between loci (data not shown). The percentage of species exhibiting a barcode gap with respect to each locus is given in Table 3. Seventy percent of species showed a barcode gap in matK, rbcL, ITS, and rpl36‐rps8, followed by 55% of species in rpoB, 45% in rpoC1, trnL‐F, and ndhJ, 20% in accD, and 15% in trnH‐psbA.

Resolvability of barcode loci in tree‐based and non‐tree‐based methods

The ability of plastid loci to resolve the examined Zingiberaceae species in accordance with the current taxonomic classification varied markedly between loci. A representative neighbor‐joining (NJ) tree yielded by matK sequences is given in Figure 1. matK and rbcL consistently yielded the highest species resolution (75%), followed by rpl36‐rps8 (70%), rpoB (60%), rpoC1 (45%), trnL‐F and ndhJ (40% each), and accD and trnH‐psbA (20% each) (Table 3). The percentage of species resolved by ITS (60%), including the sequences generated by cloning (Figure S1), was lower than that of matK and rbcL (75%) (Table 3). Of the eight species in which all the accessions

yielded ITS sequences by direct sequencing, only five species (25%) were resolved. The level of species resolution yielded by ITS2 was similar to that of ITS, although the tree topologies were different (data not shown). In GenBank searches for Zingiberaceae sequences of the barcode loci used in the study, sequences for three Indian accessions of a species were retrieved only for the ITS locus. Most of the sequences retrieved were small (50% (500 replicates). Paraphyletic groups identified are colored.

Table 2. Mean intra‐ and interspecific K2P distance in plastid and nuclear loci in 60 samples belonging to 20 species of the family Zingiberaceae

Loci          Interspecific mean (%)    Intraspecific mean (%)
matK          2.4 ± 0.5                 0.06 ± 0.03
rbcL          0.9 ± 0.3                 0.01 ± 0.01
ITS(a)        12.0 ± 1.4                0.92 ± 0.19
ITS2(b)       18.8 ± 3.0                1.37 ± 0.37
rpoB          1.8 ± 0.6                 0.07 ± 0.04
rpoC1         0.8 ± 0.4                 0
accD          0.8 ± 0.4                 0
trnH‐psbA     0.9 ± 0.5                 0.04 ± 0.04
rpl36‐rps8    1.3 ± 0.5                 0
trnL‐F        2.4 ± 0.8                 0.09 ± 0.07
ndhJ          1.2 ± 0.5                 0.03 ± 0.02

(a) Analysis included the sequences obtained by both direct sequencing and cloning.
(b) Analysis included the sequences obtained by direct sequencing and by excising from ITS sequences generated by cloning.

Table S1. Plant materials used in the study together with their geographical origin and the GenBank accession numbers of the plastid and ITS sequences generated, and the haplotypes identified
Table S2A. Number, name, and distribution of ITS haplotypes identified in 58 accessions belonging to 20 species of the family Zingiberaceae
Table S2B. Name and distribution of plastid haplotypes identified in 60 accessions belonging to 20 species of family Zingiberaceae
Table S3. Primer sequence, annealing temperature, and expected amplicon size for the barcode loci used in the study
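
Table 2 reports mean Kimura two‐parameter (K2P) distances, and the barcode gap analysis compares intraspecific with interspecific distances. As a rough sketch of the underlying arithmetic, and not the authors' actual pipeline (the software used is not named in this section), the K2P distance between two aligned sequences can be computed from the proportions of transitions (P) and transversions (Q) as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function names `k2p_distance` and `has_barcode_gap`, the pairwise‐deletion handling of gaps, and the rule "minimum interspecific distance exceeds maximum intraspecific distance" are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if q < 0.5 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


def has_barcode_gap(max_intraspecific, min_interspecific):
    """Hypothetical per-species check: the closest other species is farther
    away than the most divergent conspecific pair."""
    return min_interspecific > max_intraspecific


# Toy example with made-up 10 bp sequences (not data from the study).
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")    # one A<->G change
d_transversion = k2p_distance("AAAAAAAAAA", "CAAAAAAAAA")  # one A<->C change
```

In a dot plot of the kind described under "Barcode gap", a species for which `has_barcode_gap` is true would fall above the 1:1 line when interspecific distance is plotted against intraspecific distance.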
