Briefings in Functional Genomics Advance Access published July 6, 2015 Briefings in Functional Genomics, 2015, 1–7 doi: 10.1093/bfgp/elv026 Review paper

MicroRNA nomenclature and the need for a revised naming prescription Hikmet Budak, Reyyan Bulut, Melda Kantar, and Burcu Alptekin

Abstract

A central environment and interface for microRNA (miRNA) registry and repository and a general standardized framework for systematic miRNA annotation were established over a decade ago. However, the numbers of experimentally and computationally identified miRNAs are growing swiftly, and new aspects of miRNA-mediated gene regulation are being revealed. It is now essential that the annotation framework be redefined to include newly discovered miRNA species such as the variants of mature miRNAs (isomiRNAs) and organellar miRNAs (cipomiRNAs and mitomiRNAs). It is equally important that key terminology referring to the novelty, evolutionary history or biogenesis of miRNAs, as well as to the confidence of their identification, is standardized in the literature and disseminated through a central miRNA registry. Here, we review the status of miRNA nomenclature and curation and the critical points at which miRNA nomenclature and terminology need revision.

Key words: microRNAs; isomiRNAs; mitomiRs; cipomiRs; bona fide miRNAs; canonical miRNAs

Hikmet Budak is a PI in Plant Genetics and Genomics at Sabanci University, Istanbul, Turkey, and is a coordinating committee member of the International Wheat Genome Sequencing Consortium and European Triticeae Genomics Initiative. Reyyan Bulut is an MS candidate in the Molecular Biology, Genetics and Bioengineering program at Sabanci University. Melda Kantar is a postdoctoral researcher in the Molecular Biology, Genetics and Bioengineering program at Sabanci University. Burcu Alptekin is a PhD candidate in the Molecular Biology, Genetics and Bioengineering program at Sabanci University.

Corresponding author. Hikmet Budak, Biological Sciences & Bioengineering Program, Faculty of Engineering and Natural Sciences, Sabanci University, Tuzla 34956, Turkey. Tel.: +902164839575; Fax: +902164839550; E-mail: [email protected]

© The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: [email protected]

Downloaded from http://bfg.oxfordjournals.org/ at North Dakota State University on July 6, 2015

Introduction

The first microRNA (miRNA) was identified and shown to have a regulatory role in Caenorhabditis elegans in 2001 [1]. Soon after this discovery, several other miRNAs from different animal and plant species were reported [2–4]. The discovery of these tiny RNA molecules revolutionized our understanding of post-transcriptional regulation of gene expression, and miRNAs became a major focus of many research groups. Methods for miRNA identification have been progressively improved and combine experimental techniques with bioinformatics. Recently, with the rapid development of next-generation sequencing techniques and in silico tools, miRNA identification has gained pace. Currently used wet-lab methods for miRNA identification include direct cloning, as well as high-throughput expression profiling by small RNA (sRNA) library sequencing and hybridization-based techniques. New miRNAs can also be discovered through in silico approaches, which include (1) homology-based methods for conserved miRNA identification, which take into consideration the sequence conservation of mature miRNAs across species, as well as the consistency of precursor miRNA (pre-miRNA) secondary structures with previously established rules, and (2) algorithms developed for novel miRNA prediction, which use support vector machines to define a set of characteristics of pre-miRNA stem-loops from empirical data and subsequently use it for in silico miRNA identification [5].

Following their discovery, the rapid accumulation of a vast number of new miRNAs was accompanied by the need for a central registry and repository database and an interface for their systematic annotation. Such an environment, the ‘microRNA Registry’ (miRBase), was first established in 2002 [6, 7]. miRBase aimed to give standardized names to miRNAs by providing a web interface for the submission of miRNA sequences during the pre-publication process. To prevent misannotations and overlapping miRNA names, miRBase assigned new names to new miRNA families and to miRNAs belonging to known miRNA families in a sequential manner under the annotation criteria described by Ambros and his colleagues [8]. The first release of miRBase (v. 1, February 2002) contained only 218 miRNAs from five different species; currently, its extensive collection contains more than 35 000 mature miRNA sequences from 223 different species [7].

miRNA databases

Besides miRBase, the most comprehensive repository of reported miRNAs, other databases that provide access to miRNA data sets and/or other miRNA-related information have also been established (Table 1). While most of these are repositories of miRNAs and/or their targets in animal species (e.g. miRDB), web-based resources that hold miRNA-related documentation for viruses (e.g. VIRmiRNA) and plants [Plant microRNA Database (PMRD), TarBase] are also available. In fact, PMRD is an important resource of miRNAs from 127 plant species, and TarBase (v. 4.5, November 2013) contains experimentally validated miRNA targets for Arabidopsis thaliana and four viruses, as well as 13 animal species. Although the majority of the abovementioned databases serve as repositories of miRNAs and their targets (e.g. miRDB), some others additionally or specifically provide other miRNA-related information such as mature miRNA expression patterns (e.g. miRGator), miRNA interactions with other molecules (e.g. Starbase), genomic maps of miRNA coding regions (e.g. miRNAmap) or regulatory cis elements of miRNA genes (e.g. miRT). There are also web-based resources that cover particular aspects of miRNA-target interactions, such as adenine to inosine RNA editing (e.g. miR-Editar) [9] or single nucleotide polymorphisms (e.g. miRdSNP) [10], both of which result in slightly different messenger RNA (mRNA) molecules. Despite the considerable number of web-based resources that contain valuable information on miRNAs and their targets, this documentation is partial and restricted to a limited number of species (e.g. miRSEL). In fact, there are repositories that exclusively contain Homo sapiens miRNA repertoires (e.g. miRGator), some of which focus on a specific disease condition (e.g. miRCancer) or on multiple disease conditions (e.g. HMDD). It is of great significance that all currently partial and disparate miRNA-related data are combined into a single repository to provide easy and complete access for comparative and collective analysis.

miRNA annotation criteria

A detailed guideline for miRNA identification and annotation was reported in 2003 [8]. This guideline emphasized the importance of providing evidence for both miRNA biogenesis and expression as the ultimate proof for the existence of a miRNA in a given species. Accordingly, to verify expression, sRNA transcripts should be either (1) detected through hybridization to size-fractionated RNA or (2) identified in a library of complementary DNA constructed from size-fractionated RNA. Owing to the functional and biochemical similarities of miRNAs to other classes of sRNAs such as small interfering RNAs (siRNAs), these expression criteria alone are insufficient for reliable miRNA identification. Thus, the application of additional criteria regarding miRNA biogenesis was also emphasized in these original guidelines [8]. Confirmation of miRNA biogenesis may be established through one of the following methods: (1) prediction of the lowest free energy stem-loop structure of the miRNA precursor (pre-miRNA), which contains the mature miRNA on one arm of the hairpin; (2) demonstration of phylogenetic conservation of the mature miRNA and its pre-miRNA secondary structure, which is not necessarily the lowest free energy stem-loop structure; or (3) evidence for precursor miRNA accumulation in backgrounds with reduced Dicer activity.

miRNA mis-identification

Because they share several features with miRNAs in terms of structure, biogenesis and function, siRNAs have been the major contaminants of the miRBase database [11, 12]. Indeed, miRNA identification with high-throughput sRNA library sequencing, which recently emerged as one of the most powerful techniques for miRNA discovery, is complicated by the existence of other sRNAs, which can be erroneously designated as miRNAs. The biogenesis of both miRNAs and siRNAs depends on DICER (DCL) activity; in fact, secondary or natural antisense siRNAs are generated through DCL1 cleavage, as are miRNAs. Similar to miRNAs, siRNAs (particularly trans-acting siRNAs) also cleave their mRNA targets after binding to the RNA-induced silencing complex (RISC) [13]. Despite these shared features, miRNAs and siRNAs also have several distinctive characteristics. For instance, the average lengths of these two subclasses of sRNAs are different: 21 nucleotides for miRNAs and 24 nucleotides for siRNAs. Endogenous siRNAs are formed through an RNA POLYMERASE IV/RNA POLYMERASE V (PolIV/PolV)-dependent pathway, while miRNAs are generated by the RNA POLYMERASE II-dependent pathway. Furthermore, miRNAs are processed from imperfectly base-paired hairpin structures, while siRNAs are generated from double-stranded and perfectly base-paired exogenous or endogenous RNA precursors. They also differ in how they bind their mRNA targets: in contrast to miRNAs, siRNAs require perfect base pairing [14]. Considering these characteristics is crucial for distinguishing these two subclasses of sRNAs and for annotating them accurately. Additionally, siRNAs are known to originate generally from repeat-related and transposon-derived sequences; hence, putative precursors from repetitive regions must be handled particularly carefully during miRNA identification and annotation [13].

Other contaminants mis-annotated as miRNAs include small fragments of mRNAs, long noncoding RNAs, transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) [11, 15]. The presence of these contaminants may seriously complicate miRNA identification, affecting the specificity and reliability of current miRNA repertoires, especially in species for which available genomic and/or transcriptomic sequences are scarce. This situation is exacerbated in plants with large and complex genomes, for which sequence information is particularly limited and in which miRNAs constitute only a small portion of the noncoding RNA repertoire [13]. In organisms with extensive genomic and transcriptomic sequence data, eliminating these contaminants from sequence data sets before miRNA identification and annotation can greatly reduce mis-annotations. Yet such sequences must be eliminated cautiously, because recent research continues to uncover unexpected origins for miRNAs. Intriguingly, miRNAs originating from tRNA or ribozyme sequences with a single mutation were recently reported [16]. In line with this, Mus musculus miR-712 was shown to originate from pre-ribosomal RNA [17].
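The expression and biogenesis criteria of the original guideline [8], together with the siRNA-like features discussed under ‘miRNA mis-identification’, can be read as a small checklist. The Python sketch below is only an illustration of that reading: the `Candidate` record, its field names and the use of a 24-nucleotide length as a warning flag are assumptions made for this example, and they are not part of the guideline or of any miRBase tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Evidence gathered for one candidate small RNA (hypothetical record)."""
    length: int                        # mature sequence length in nucleotides
    detected_by_hybridization: bool    # expression criterion (1)
    found_in_cdna_library: bool        # expression criterion (2)
    hairpin_predicted: bool            # biogenesis criterion (1)
    conserved_with_precursor: bool     # biogenesis criterion (2)
    accumulates_with_low_dicer: bool   # biogenesis criterion (3)
    perfectly_paired_precursor: bool   # siRNA-like feature
    from_repeat_region: bool           # siRNA-like feature


def meets_annotation_criteria(c: Candidate) -> bool:
    """True if at least one expression and one biogenesis criterion is met."""
    expression = c.detected_by_hybridization or c.found_in_cdna_library
    biogenesis = (c.hairpin_predicted
                  or c.conserved_with_precursor
                  or c.accumulates_with_low_dicer)
    return expression and biogenesis


def sirna_warnings(c: Candidate) -> List[str]:
    """List the siRNA-like features that call for closer inspection."""
    warnings = []
    if c.length == 24:  # assumed flag: 24 nt is the average siRNA length
        warnings.append("24 nt long")
    if c.perfectly_paired_precursor:
        warnings.append("perfectly base-paired precursor")
    if c.from_repeat_region:
        warnings.append("repeat- or transposon-derived locus")
    return warnings


# A 21-nt candidate with hybridization and hairpin evidence, located in a repeat.
candidate = Candidate(21, True, False, True, False, False, False, True)
print(meets_annotation_criteria(candidate))  # True
print(sirna_warnings(candidate))             # ['repeat- or transposon-derived locus']
```

In this sketch a warning does not reject a candidate; it only marks it for the careful handling that the text recommends for precursors from repetitive regions.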

Table 1. List of current publicly available miRNA databases

Database name | Web address | Brief information | Organism
miRBase | http://www.mirbase.org | Database of published miRNA sequences and their annotations. | Several organisms including plants and animals
Microrna.org | http://www.microrna.org/microrna/ | Database for experimentally identified miRNA expression patterns and predicted miRNA targets. | H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans
Starbase | http://starbase.sysu.edu.cn/ | A database that shows interactions of miRNA-lncRNA, miRNA-mRNA, miRNA-circRNA, miRNA-pseudogene, miRNA-sncRNA, protein-lncRNA, protein-sncRNA, protein-mRNA and protein-pseudogene. | H. sapiens
miRDB | http://mirdb.org/miRDB/ | Database of miRNAs and their targets with a wiki interface. | H. sapiens, M. musculus, R. norvegicus, C. lupus, G. gallus
TarBase | http://mirtarbase.mbc.nctu.edu.tw/ | Database of experimentally validated miRNA targets. | Several organisms including animals and plants
mir2disease | http://www.mir2disease.org/ | Source of miRNA deregulation in different human diseases with a description of the miRNA–disease relationship, miRNA expression pattern in the disease state, experimentally identified miRNA target gene and literature reference. | H. sapiens
miRWalk | http://www.umm.uni-heidelberg.de/apps/zmf/mirwalk/ | A database that provides information about miRNAs from human, mouse and rat including validated binding sites on their target genes. | H. sapiens, M. musculus, R. norvegicus
The human microRNA disease database (HMDD) | http://202.38.126.151/hmdd/mirna/md/ | A database that shows the associations of miRNA and disease from the literature. | H. sapiens
miRecords | http://mirecords.biolead.org/ | A web resource for animal miRNA–target interactions. | H. sapiens, M. musculus, R. norvegicus, C. elegans, D. melanogaster, G. gallus, D. rerio
miRdSNP | http://mirdsnp.ccr.buffalo.edu/ | Source for SNPs in miRNA target sites and their interactions with diseases. | H. sapiens
miRCancer | http://mircancer.ecu.edu/ | Database for cancer-related miRNAs. | H. sapiens
MicroCosm | http://www.ebi.ac.uk/enright-srv/microcosm/htdocs/targets/v5/ | A web resource developed by the Enright Lab at the EMBL-EBI containing computationally predicted targets for miRNAs. | Several organisms
miRGator | http://mirgator.kobic.re.kr/ | Web source about miRNA diversity and expression from deep sequencing data. | H. sapiens
miR-Editar | http://microrna.osumc.edu/mireditar | A database of predicted A-to-I edited miRNA binding sites. | H. sapiens
Plant microRNA database | http://bioinformatics.cau.edu.cn/PMRD/ | Source of plant miRNA data including miRNA sequences and their target genes, secondary structure, expression profile and genome browser. | Plants
miRNAmap | http://mirnamap.mbc.nctu.edu.tw/ | Database for genomic maps of miRNA genes and their target genes in mammalian genomes. | Mammals
miRGen | http://diana.cslab.ece.ntua.gr/mirgen/ | A database providing information regarding the genomic location of miRNA precursors and their associated transcription factors. | Several organisms
PITA | http://genie.weizmann.ac.il/pubs/mir07/mir07_data.html | A database of predicted miRNA targets including upstream and downstream regions. | C. elegans, D. melanogaster, M. musculus, H. sapiens
miROrtho | http://cegg.unige.ch/mirortho | A database that contains predictions of precursor miRNA genes covering various animal genomes with orthology of genes and a Support Vector Machine. | Animals
PhenomiR | mips.helmholtz-muenchen.de/phenomir | A database that provides information about differentially regulated miRNA expression in diseases and other biological processes. | H. sapiens
miRT | http://www.isical.ac.in/bioinfo_miu/miRT/miRT.php | A database for pri-miRNAs and associated information about transcription start sites (TSSs) including promoters. | H. sapiens
miRNAbodymap | http://www.mirnabodymap.org/ | Web resource of RT-qPCR miRNA expression data and functional miRNA annotation in normal and diseased human tissues. | H. sapiens
VIRmiRNA | http://crdd.osdd.net/servers/virmirna/ | Database on experimental viral miRNAs and their targets. | Viruses
miRSEL | http://services.bio.ifi.lmu.de/mirsel/ | A documental database with associations between miRNAs and their targets using bibliographical searches. | H. sapiens, M. musculus, R. norvegicus


Emerging concerns in the consistency of miRNA nomenclature

Although more than a decade has passed since the basic rules of miRNA annotation were set [8] and miRBase was established as a repository for miRNAs [6], only a portion of the vast number of miRNAs discovered during this time were properly annotated and registered in miRBase. In fact, in some publications, miRNAs were assigned names that do not abide by the rules initially determined by Ambros and his colleagues [8]. For instance, two bread wheat miRNAs named 1136-P3 and PN-2013 were recently reported [18]. This issue, which has resulted in an overall inconsistency in the sequential annotation of the discovered miRNAs, must be urgently addressed.

Beyond distinguishing miRNAs from other sRNAs, consistent naming of miRNAs is equally important to avoid redundancies or mis-annotations. Currently, mature miRNAs are named with the ‘miR’ prefix followed by a numerical identifier. Although not indicated in the first miRNA naming guideline, animal miRNAs are distinguished from plant miRNAs by the use of a dash symbol (e.g. miR-14 vs. miR172). Additionally, animal

Figure 1. A simplified diagram of canonical and noncanonical miRNA biogenesis pathways and the current nomenclature of associated microRNAs. (A) The miRNAs originating from the MIR1a/mir-1a genomic loci through the canonical miRNA biogenesis pathway were presumed to be the first identified miRNAs and named miR1a-1-3p.1 and miR1a-1-5p.1. According to the annotation criteria, a distinct mature miRNA sequence excised from the same hairpin structure (miRNA-like RNA) through sequential processing was named miR1a-3p.2. (B) IsomiRNAs of miR1a were named miR1b and miR1c, as miRBase currently acknowledges these isoforms as distinct sequences. However, the naming scheme used for the products of sequential processing could be applied to isomiRs as well (miR1a-3p.3 and miR1a-3p.4). This approach would also simplify the annotation of genomic loci. It must be noted that these isoforms may have exactly the same sequences as different miRNA family members, worsening the confusion further. Therefore, we additionally propose the use of an ‘isomiR’ prefix with additional numerical suffixes for the mature miRNA isoforms diced from the same precursor (isoforms of miR1a-3p.1: isomiR1a-3p.1.1 and isomiR1a-3p.1.2). (C) Identical mature miRNA sequences encoded by distinct genomic loci (mir-1a-2 and mir-1a-3). MIR1a-3/mir-1a-3: paralogous miRNA; the same primary and mature miRNA sequences result from intragenomic duplication events. (D) A miR1 family member encoded by a different genomic location (MIR1d/mir-1d). (E) Nat-miRNAs: nat-miRNAs are transcribed from the antisense strand of the miRNA target gene and post-transcriptionally regulate the products of the same gene. (F) miRtrons: miRNAs that are diced from intronic regions and processed by DCL1 after the splicing of an mRNA transcript. Although organellar miRNAs have not been fully established yet, future miRNA naming schemes should be designed to embrace them as well.


miRNA nomenclature and related concerns

pre-miRNAs are named with lowercase and italic characters (e.g. mir-14). The names of plant pre-miRNAs differ by the capitalization of the ‘mir’ prefix and the absence of the dash symbol (e.g. MIR172). Identical miRNAs from different organisms share the same identifier number and are distinguished by a three-letter prefix denoting the species, followed by a dash (e.g. miR172 in Triticum aestivum: tae-miR172 and in Hordeum vulgare: hvu-miR172). Lettered suffixes denote closely related mature miRNA sequences expressed from different precursors or genomic loci and signify miRNA family members (e.g. tae-miR172a, tae-miR172b), and identical mature miRNA sequences within a species are named with numbered suffixes after a dash symbol [8]. Still, some of the miRNAs registered in miRBase with this notation do not have exactly the same sequence (hsa-miR-19b-2-5p: AGUUUUGCAGGUUUGCAUUUCA and hsa-miR-19b-1-5p: AGUUUUGCAGGUUUGCAUCCAGC). Special care must be taken when considering miRNA genes (MIR), as identical mature miRNA sequences can be encoded either by paralogous MIR loci that originated from an intraspecific duplication event or by completely different primary transcripts (Figure 1). As miRNA genes are named after their mature miRNA product, similarity between names does not always imply sequence correspondence. In some cases, various miRNA species can be diced from a common precursor. Distinct mature miRNA sequences


The necessity for restructuring of previous guidelines based on new findings

As miRNA-mediated gene regulation mechanisms are thoroughly scrutinized, knowledge of miRNA varieties, features and targets is continuously expanding, necessitating the restructuring of existing guidelines for miRNA nomenclature. Ongoing research continues to reveal previously unknown miRNA characteristics that complicate their nomenclature, leading to ambiguity and confusion. For instance, the nomenclature of the two strands generated by unwinding of the duplex that emerges from Dicer processing of the pre-miRNA was recently updated. It was previously thought that only one strand of this duplex, referred to as the ‘guide strand’, is incorporated into RISC and functions in post-transcriptional regulation as a mature miRNA. The other strand, referred to as the ‘passenger strand’, was believed to be degraded, as it was recurrently observed to accumulate far less than the guide strand. This passenger strand has been conventionally named with an asterisk suffix (e.g. miR172*). However, subsequent findings revealed that both strands of the duplex are viable and can be functional in targeting mRNA populations [20]. Thus, the suffixes -3p and -5p were adopted (miR172-3p; miR172-5p) to indicate the two different mature miRNAs generated from the 3′ and 5′ arms of the pre-miRNA hairpin, respectively.

Recently, nonoverlapping mature miRNA sequences have been shown to originate from the same hairpin [21], an occurrence that was unknown when the initial guidelines of miRNA annotation were set. Back then, it was advised that this type of sRNA be handled as siRNAs or, if proven to be miRNAs, be named as if they were alternatively processed products of the same gene. In a recent publication, such Arabidopsis miRNAs generated from the same precursor were reported as miRNA-like RNAs [21]. Hence, a new set of standardized rules is urgently needed to designate nonoverlapping mature miRNA sequences that originate from the same hairpin and to avoid confusion.

Another recently discovered miRNA species are isomiRNAs (isomiRs), mature miRNA sequence variants from the same genomic locus. These variants are either templated, with shifted start and end positions produced by alternative DCL1 cleavage, or non-templated, emerging through nucleotide additions, removals or modifications by other enzymes [22]. Recent research highlights that isomiRs constitute a new level of miRNA-exerted gene regulation, pointing out their functional relevance and roles in plant cell metabolism and stress responses [21, 23–25]. However, the annotation of isomiRs is still ambiguous and remains to be determined.

Organellar miRNAs are another newly emerging miRNA class that was unknown when the original guidelines of miRNA annotation were set. Among these are mitomiRs, miRNAs found within the mitochondrial matrix [26]. It is unknown whether these miRNAs are encoded by the nuclear genome and then transported to mitochondria, or transcribed from the mitochondrial genome. However, mitomiRs were shown to have putative targets encoded by mitochondrial genes, which implies that these miRNAs may also be of mitochondrial origin based on the hypothesis on miRNA evolution [16, 27]. New rules should also be established to distinguish organellar miRNAs from their nuclear homologs. Even though they have not been experimentally established yet, the chloroplast genome may also encode putative miRNAs, which can be denoted as cipomiRs [28] in reference to mitomiRs.

So far, two different miRNA annotation guidelines have been proposed by leading scientists in the field [8, 13]. Additionally, one can refer to the miRBase Web site (http://www.mirbase.org/help/nomenclature.shtml) for an online clarification. For consistency, the same authorities and curators should take the initiative in setting the new criteria. A new guideline that embraces all currently known miRNA types will prevent uncertainties, and its correct adoption and use by researchers, through the central registry miRBase, is of great importance for the accuracy of future research.
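The distinction between templated and non-templated isomiRs can be made concrete with a short Python sketch. It assumes that the precursor sequence and the canonical mature sequence are known, and it only recognizes non-templated variants that carry extra nucleotides at the 3′ end; other modifications are left unassigned. The function name, the toy sequences, the three-nucleotide tail limit and the 15-nucleotide minimum core are all hypothetical choices for illustration, not an established annotation procedure.

```python
def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """True if the half-open intervals [start_a, end_a) and [start_b, end_b) overlap."""
    return start_a < end_b and start_b < end_a


def classify_read(precursor: str, canonical: str, read: str,
                  max_tail: int = 3, min_core: int = 15) -> str:
    """Classify a small RNA read relative to a canonical mature miRNA."""
    start = precursor.find(canonical)
    if start < 0:
        raise ValueError("canonical sequence not found in precursor")
    end = start + len(canonical)

    if read == canonical:
        return "canonical"

    # Templated: the read matches the precursor exactly.
    pos = precursor.find(read)
    if pos >= 0:
        if overlaps(pos, pos + len(read), start, end):
            return "templated isomiR"
        return "other product of the same precursor"

    # Non-templated (3' addition only): trimming a short tail restores a match.
    for tail in range(1, max_tail + 1):
        core = read[:-tail]
        if len(core) < min_core:
            break
        pos = precursor.find(core)
        if pos >= 0 and overlaps(pos, pos + len(core), start, end):
            return "non-templated isomiR candidate"
    return "unassigned"


# Toy precursor; the canonical miRNA is taken as positions 10-30 (21 nt).
PRECURSOR = "AAACCCGGGUUUACGUACGUAGCUAGCUAGGAUCCAUGCAUGCCC"
CANONICAL = PRECURSOR[10:31]

print(classify_read(PRECURSOR, CANONICAL, PRECURSOR[11:32]))  # templated isomiR
print(classify_read(PRECURSOR, CANONICAL, CANONICAL + "U"))   # non-templated isomiR candidate
```

A read that matches the precursor outside the canonical region is reported separately, which mirrors the distinction drawn in the text between isomiRs and nonoverlapping products of the same hairpin.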

miRNA-related terms and their correct use in literature

As well as the annotation of miRNAs, the correct use of related terminology is equally important to highlight genuine and new miRNAs and to obtain direct and precise information from


generated by sequential processing of the same precursor are distinguished by a number suffix after a dot (e.g. aly-miR774a-3p.2 and aly-miR774a-3p.1, or osa-miR159a.1 and osa-miR159a.2) [11]. However, the definition for the use of numerical suffixes after a dot is highly inclusive, as sequential processing of the same precursor may result both in isomiRs, which have almost the same sequence as the known canonical miRNA, and in other types of miRNAs excised from another region of the precursor (miRNA-like RNAs), which therefore have divergent sequences (Figure 1) (see section ‘The necessity for restructuring of previous guidelines based on new findings’). In this case, while an isomiR of the canonical miRNA will have almost the same sequence and be named miR1-3p.2, another miRNA diced from a distinct part of the precursor will be named miR1-3p.3 and will differ from the canonical miRNA (miR1-3p.1 in this case) and its isomiR sequences. Currently, isomiRs are considered distinct sequences in miRBase, which inflates miRNA repertoires unrealistically. To reduce this complication, we suggest using an ‘isomiR’ prefix with an additional number suffix for the different isoforms of the canonical mature sequence (canonical miRNA: miR1-3p.1, isomiR1: isomiR1-3p.1.1, isomiR2: isomiR1-3p.1.2) and depositing these sequence variants with their canonical counterpart in a single record. The assignment of similar names to miRNAs that have unrelated sequences but similar targets further complicates the annotation [13]. Even though the target of a specific miRNA is indicative of its identity, given that miRNAs orchestrate gene regulation by target cleavage or translational inhibition, basing the annotation simply on sequence similarity will simplify downstream analysis. Our increasing knowledge of miRNAs has naturally altered the originally defined miRNA names and sequences. Nevertheless, annotation efforts have not kept pace with the refinement and enhancement of miRNA-related data, leading to the continued use of outdated nomenclature [19]. In addition, the plethora of independent databases serving many different purposes (see section ‘miRNA databases’) has further impeded the concurrent use of the updated naming criteria. Thus, adherence to the current nomenclature should be strictly checked during the pre-publication process with the help of new online tools such as miRBase Tracker [19].
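The naming conventions described in this section (species prefix, ‘miR’/‘mir’/‘MIR’ prefix, optional dash, numerical identifier, lettered suffix, numbered locus suffix, arm suffix and dot suffix) can be summarized as a pattern. The following Python sketch is one possible reading of those conventions; the regular expression and the `parse_mirna_name` helper are assumptions written for this example and are not the rules enforced by miRBase. Inferring ‘animal’ or ‘plant’ style from the dash alone is a simplification.

```python
import re
from typing import Dict, Union

NAME_PATTERN = re.compile(
    r"^(?:(?P<species>[a-z]{3})-)?"  # optional three-letter species prefix
    r"(?P<prefix>miR|mir|MIR)"       # mature (miR) or precursor/gene (mir, MIR)
    r"(?P<dash>-?)"                  # dash after the prefix (animal style)
    r"(?P<number>\d+)"               # numerical identifier
    r"(?P<letter>[a-z])?"            # lettered suffix: family member
    r"(?:-(?P<locus>\d+))?"          # numbered suffix: same sequence, other locus
    r"(?:-(?P<arm>[35]p))?"          # hairpin arm of origin
    r"(?:\.(?P<variant>\d+))?$"      # dot suffix: sequential processing product
)


def parse_mirna_name(name: str) -> Dict[str, Union[str, int, None]]:
    """Split a miRNA name into its components; raise ValueError if it does not fit."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised miRNA name: {name!r}")
    g = match.groupdict()
    return {
        "species": g["species"],
        "kind": "mature" if g["prefix"] == "miR" else "precursor",
        "style": "animal" if g["dash"] else "plant",
        "number": int(g["number"]),
        "letter": g["letter"],
        "locus": int(g["locus"]) if g["locus"] else None,
        "arm": g["arm"],
        "variant": int(g["variant"]) if g["variant"] else None,
    }


parsed = parse_mirna_name("aly-miR774a-3p.2")
print(parsed["species"], parsed["number"], parsed["arm"], parsed["variant"])
```

Names that do not follow the conventions, such as the bread wheat name 1136-P3 mentioned in this review, raise a `ValueError` in this sketch, which is the kind of check that could be run before submission.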


species specific with low expression levels, are termed ‘young miRNAs’. Emerging knowledge about miRNAs and the associated vocabulary call for a standardization of miRNA-related terms. Consistent miRNA-related terminology in the literature, and the availability of these terms along with all currently reported miRNAs in a central repository, are necessary to provide easy access from a single environment to all miRNAs and to information on their identification, evolution and biogenesis, and to distinguish novel and newly identified miRNAs without disrupting sequential miRNA annotation.
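The terms discussed in this section for confidence (‘putative’, ‘bona fide’) and novelty (‘newly identified’, ‘novel’) can be written down as two simple decision rules. The Python sketch below encodes our reading of those definitions; the function and parameter names are hypothetical, and real curation decisions involve more evidence than two boolean flags.

```python
from typing import Optional


def confidence_term(experimentally_validated: bool, valid_stem_loop: bool) -> str:
    """'bona fide' needs experimental proof and a suitable pre-miRNA stem-loop."""
    if experimentally_validated and valid_stem_loop:
        return "bona fide"
    return "putative"


def novelty_term(first_report_in_species: bool,
                 known_homologue_elsewhere: bool) -> Optional[str]:
    """Return the novelty term, or None if the miRNA was already reported."""
    if not first_report_in_species:
        return None
    # 'novel' is the stricter term: no known homologue in any other species.
    return "newly identified" if known_homologue_elsewhere else "novel"


print(confidence_term(experimentally_validated=False, valid_stem_loop=True))  # putative
print(novelty_term(first_report_in_species=True, known_homologue_elsewhere=True))  # newly identified
```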

Key points

• Since their discovery, miRNAs have become a major research focus, and with the rapid development of high-throughput miRNA identification techniques, our knowledge of miRNAs has been expanding quickly, giving rise to the need for a central registry.
• The central role of miRNAs in post-transcriptional regulation has made these molecules the subject of many research areas, and they have been deposited in independent databases serving specific needs. However, this separation has made synchronized editing and updating of the miRNA data sets impractical.
• Ongoing research has revealed new miRNA species and biogenesis pathways, leading to confusion and mis-annotations within the existing nomenclature guidelines.
• As a consequence of the accumulated data and the diversity of miRNA identification techniques, researchers have used various terms to signify the novelty, confidence level, biogenesis pathway and evolutionary history of the identified miRNAs. Here, we summarized the related terms and their correct usage.

Funding

This work is funded by Sabanci University.

References

1. Lee RC, Ambros V. An extensive class of small RNAs in Caenorhabditis elegans. Science 2001;294:862–4.
2. Reinhart BJ, Weinstein EG, Rhoades MW, et al. MicroRNAs in plants. Genes Dev 2002;16:1616–26.
3. Rhoades MW, Reinhart BJ, Lim LP, et al. Prediction of plant microRNA targets. Cell 2002;110:513–20.
4. Llave C, Kasschau KD, Rector MA, et al. Endogenous and silencing-associated small RNAs in plants. Plant Cell 2002;14:1605–19.
5. Budak H, Khan Z, Kantar M. History and current status of wheat miRNAs using next-generation sequencing and their roles in development and stress. Brief Funct Genomics 2015;14:189–98.
6. Griffiths-Jones S. The microRNA Registry. Nucleic Acids Res 2004;32:D109–11.
7. Kozomara A, Griffiths-Jones S. miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res 2014;42:D68–73.
8. Ambros V, Bartel B, Bartel DP, et al. A uniform system for microRNA annotation. RNA 2003;9:277–9.


terminology in relation to miRNA biogenesis and evolution. Each strategy of miRNA identification has its own advantages and disadvantages and differs in terms of sensitivity, specificity and novelty of the miRNA data set it generates [29]. As huge amount of partial miRNA repertoires accumulate from disparate studies, multiple and independent terms are adopted in literature to define the ‘confidence of evidence for presence’ and the ‘novelty’of reported miRNAs. Referring to the strength of evidence for the actual presence of the identified miRNA, the terms ‘putative’ and ‘bona fide’ (real, genuine) are currently used. ‘Putative’ indicates that the reported miRNA is a prediction and a potential miRNA. miRNAs identified through in silico strategies have been described as ‘putative’ since computationally derived data may contain false positives and further experimental evidence is required to validate the presence of a predicted miRNA [30]. When such experimental proof is provided, along with an appropriate pre-miRNA stem-loop structure for an miRNA, than it can be reported as ‘bona fide’, meaning ‘real’ [31]. In the latest versions of miRBase, some entries were classified and tagged as ‘high confidence’ based on a criteria set in relation to the number of sRNA library sequences reported aligning to the regions of the stem-loop that contains potential mature miRNAs [7]. Some other terms have also been adopted by researchers to notify the discovery of novel or newly identified miRNAs. When an miRNA is reported for the first time in a given species, it is most commonly referred as ‘newly identified’ [30]. If it also does not have any known homologoues in other species, it is frequently denoted as ‘novel’ [32]. ‘Novel’ miRNAs can be identified only through cloning, sRNA library sequencing, or computational methods developed particularly for novel miRNA prediction. 
Unlike ‘novel’ miRNAs, ‘newly identified’ miRNAs can also be identified through homology-based experimental or computational methods [5].

As a major research focus of the last decade, miRNAs have not only increased tremendously in number, but their further investigation has also yielded extensive knowledge regarding miRNA biogenesis and evolution. Noncanonical pathways of miRNA biogenesis were discovered [33], introducing the term ‘canonical miRNA’ to the literature, which indicates that the miRNA in question emerges through the canonical biogenesis pathway (Figure 1). One subclass of noncanonical miRNAs, which arise from spliced-out introns of genes, was named ‘mirtrons’. These newly defined molecules were found to skip DCL1 cleavage from primary miRNA (pri-miRNA) to pre-miRNA and to feed into the miRNA maturation pathway as pre-miRNAs [34]. Another noncanonical miRNA subclass is natural antisense miRNAs (nat-miRNAs), derived from the processing of the antisense transcript of an miRNA target gene [35].

In addition to their biogenesis, miRNAs have recently been investigated in relation to their evolution. Although several mechanisms have been proposed for miRNA origin, such as duplication of preexisting miRNA genes or protein-coding genes, generation from transposable elements and the formation of hairpin structures during genome evolution, how miRNA genes actually originated is still unclear [29]. However, to date several miRNAs have been found to be evolutionarily conserved across major lineages of plants, while many others are believed to be species specific, pointing to a dynamic process of loss and origination of miRNA genes during evolution. Hence, miRNAs that are deeply conserved, or lost in the transition from ancestor, are referred to as ‘old miRNAs’, while newly evolved miRNAs, which are mostly


9. Laganà A, Paone A, Veneziano D, et al. miR-EdiTar: a database of predicted A-to-I edited miRNA target sites. Bioinformatics 2012;28:3166–8.
10. Bruno AE, Li L, Kalabus JL, et al. miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3’UTRs of human genes. BMC Genomics 2012;13:44.
11. Meng Y, Shao C, Wang H, et al. Are all the miRBase-registered microRNAs true? A structure- and expression-based reexamination in plants. RNA Biol 2012;9:249–53.
12. Taylor RS, Tarver JE, Hiscock SJ, et al. Evolutionary history of plant microRNAs. Trends Plant Sci 2014;19:175–82.
13. Meyers BC, Axtell MJ, Bartel B, et al. Criteria for annotation of plant MicroRNAs. Plant Cell 2008;20:3186–90.
14. Carthew RW, Sontheimer EJ. Origins and mechanisms of miRNAs and siRNAs. Cell 2009;136:642–55.
15. Li A, Mao L. Evolution of plant microRNA gene families. Cell Res 2007;17:212–8.
16. Roberts JT, Cooper EA, Favreau CJ, et al. Continuing analysis of microRNA origins: Formation from transposable element insertions and noncoding RNA mutations. Mob Genet Elements 2013;3:e27755.
17. Son DJ, Kumar S, Takabe W, et al. The atypical mechanosensitive microRNA-712 derived from pre-ribosomal RNA induces endothelial inflammation and atherosclerosis. Nat Commun 2013;4:3000.
18. Feng H, Wang X, Zhang Q, et al. Monodehydroascorbate reductase gene, regulated by the wheat PN-2013 miRNA, contributes to adult wheat plant resistance to stripe rust through ROS metabolism. Biochim Biophys Acta 2014;1839:1–12.
19. Van Peer G, Lefever S, Anckaert J, et al. miRBase Tracker: keeping track of microRNA annotation changes. Database (Oxford) 2014;2014:1–8.
20. Baev V, Milev I, Naydenov M, et al. Insight into small RNA abundance and expression in high- and low-temperature stress response using deep sequencing in Arabidopsis. Plant Physiol Biochem 2014;84C:105–14.
21. Zhang W, Gao S, Zhou X, et al. Multiple distinct small RNAs originate from the same microRNA precursors. Genome Biol 2010;11:R81.
22. Guo L, Chen F. A challenge for miRNA: Multiple isomiRs in miRNAomics. Gene 2014;544:1–7.
23. Ebhardt HA, Tsang HH, Dai DC, et al. Meta-analysis of small RNA-sequencing errors reveals ubiquitous posttranscriptional RNA modifications. Nucleic Acids Res 2009;37:2461–70.
24. Jeong DH, Thatcher SR, Brown RSH, et al. Comprehensive investigation of microRNAs enhanced by analysis of sequence variants, expression patterns, ARGONAUTE loading, and target cleavage. Plant Physiol 2013;162:1225–45.
25. Lu S, Sun YH, Chiang VL. Adenylation of plant miRNAs. Nucleic Acids Res 2009;37:1878–85.
26. Sripada L, Tomar D, Singh R. Mitochondria: one of the destinations of miRNAs. Mitochondrion 2012;12:593–9.
27. Kamarajan BP, Sridhar J, Subramanian S. In silico prediction of microRNAs in plant mitochondria. Int J Bioautomation 2013;16:251–62.
28. Budak H, Kantar M, Bulut R, et al. Stress responsive miRNAs and isomiRs in cereals. Plant Sci 2015;235:1–13.
29. Zhang B, Wang Q. MicroRNA-based biotechnology for plant improvement. J Cell Physiol 2014;230:1–15.
30. Kurtoglu KY, Kantar M, Budak H. New wheat microRNA using whole-genome sequence. Funct Integr Genomics 2014;14:363–79.
31. Thiebaut F, Grativol C, Carnavale-Bottino M, et al. Computational identification and analysis of novel sugarcane microRNAs. BMC Genomics 2012;13:290.
32. Xuan P, Guo M, Liu X, et al. PlantMiRNAPred: Efficient classification of real and pseudo plant pre-miRNAs. Bioinformatics 2011;27:1368–76.
33. Ha M, Kim VN. Regulation of microRNA biogenesis. Nat Rev Mol Cell Biol 2014;15:509–24.
34. Ladewig E, Okamura K, Flynt AS, et al. Discovery of hundreds of mirtrons in mouse and human small RNA data. Genome Res 2012;22:1634–45.
35. Lu C, Jeong D, Kulkarni K, et al. Genome-wide analysis for discovery of rice microRNAs reveals natural antisense microRNAs (nat-miRNAs). PNAS 2008;105:4951–6.

