Monika W. Murcha1,*, Reena Narsai2, James Devenish1, Szymon Kubiszewski-Jakubiak1 and James Whelan2 1

Australian Research Council Centre of Excellence in Plant Energy Biology, Bayliss Building M316, University of Western Australia, 35 Stirling Highway, Crawley 6009, Western Australia, Australia 2 Department of Botany, Australian Research Council Centre of Excellence in Plant Energy Biology, School of Life Science, La Trobe University, Bundoora 3083, Victoria, Australia

*Corresponding author: E-mail, [email protected]; Fax, +61-8-6488-4401. (Received July 21, 2014; Accepted November 21, 2014)

Keywords: Arabidopsis  Database  Import  Mitochondria  SAM,  TIM  TOM  Yeast. Abbreviations: AGI, Arabidopsis gene identifier; BLAST, basic local alignment search tool; DB, database; ERMES, endoplasmic reticulum–mitochondria encounter structure; GFP, green fluorescent protein; GI, gene identifier; MINOS, mitochondrial inner membrane organizing system; MPIC, mitochondrial protein import component; PAM, pre-sequence translocase-associated protein import motor; PRAT, PReprotein and Amino acid Transporters; SAM, sorting and assembly machinery; TAIR, the Arabidopsis Information

Resource; TIM, translocase of the inner membrane; TOM, translocase of the outer membrane.

Introduction As an essential powerhouse in the eukaryotic cell, it is not surprising that alterations to mitochondrial function can often result in significantly detrimental phenotypes. One of the most essential functions in plant mitochondria is the import and assembly of proteins, without which mitochondria are unable to function as many components of the respiratory chain are synthesized in the nucleus and rely on import and assembly in the mitochondria (Welchen et al. 2014). The rapid increase of publicly available genomic data for many plant species ranging from green algae to trees initiated an effort to update and comprehensively analyse the mitochondrial import components from plants. Whilst these components have been well characterized in Saccharomyces cerevisiae (yeast) and are often used as a template to identify components in plants, recent studies are showing that several significant differences have emerged in plants. These include the presence of plant-specific mitochondrial protein import components, the apparent lack of direct orthologs to some yeast import components and the expansion of gene family members encoding these components, resulting in subfunctionalization and/or neo-functionalization (Carrie et al. 2010b, Duncan et al. 2013, Murcha et al. 2014a, Murcha et al. 2014b). Furthermore, with the added complexity of an additional energy organelle, i.e. the plastid, plants were obliged to develop more specific protein import apparatus. Due to the emerging examples of neo- and subfunctionalization in Arabidopsis, a multispecies approach is very useful to characterize plant orthologs and define the conserved and/or divergent themes throughout evolution for the 44 mitochondrial protein import components identified in yeast (Cherry et al. 1998). The functional annotation of plant import components often depends largely on sequence orthology to yeast, followed by experimental approaches such as genetic, transcriptomic and proteomic analyses to provide further insight into their specific function.

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186, Advance Access publication on 29 November 2014, available online at www.pcp.oxfordjournals.org ! The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: [email protected]

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

In the 2 billion years since the endosymbiotic event that gave rise to mitochondria, variations in mitochondrial protein import have evolved across different species. With the genomes of an increasing number of plant species sequenced, it is possible to gain novel insights into mitochondrial protein import pathways. We have generated the Mitochondrial Protein Import Components (MPIC) Database (DB; http://www.plantenergy.uwa.edu.au/applications/mpic) providing searchable information on the protein import apparatus of plant and non-plant mitochondria. An in silico analysis was carried out, comparing the mitochondrial protein import apparatus from 24 species representing various lineages from Saccharomyces cerevisiae (yeast) and algae to Homo sapiens (human) and higher plants, including Arabidopsis thaliana (Arabidopsis), Oryza sativa (rice) and other more recently sequenced plant species. Each of these species was extensively searched and manually assembled for analysis in the MPIC DB. The database presents an interactive diagram in a user-friendly manner, allowing users to select their import component of interest. The MPIC DB presents an extensive resource facilitating detailed investigation of the mitochondrial protein import machinery and allowing patterns of conservation and divergence to be recognized that would otherwise have been missed. To demonstrate the usefulness of the MPIC DB, we present a comparative analysis of the mitochondrial protein import machinery in plants and non-plant species, revealing plant-specific features that have evolved.

Special Online Collection – Database Paper

MPIC: A Mitochondrial Protein Import Components Database for Plant and Non-Plant Species

M. V. Murcha et al. | Plant mitochondrial import component database

2

and analysis tools to be utilized from different sites for each species. To address these issues and provide a completely new resource, a database integrating information for the mitochondrial import components of 24 species has been developed, including 17 plant species representative of each taxonomic clade, plus several non-plant model species including, yeast, human, roundworm, mold, slime mold, and red and brown algae. The Mitochondrial Protein Import Components Database (MPIC DB) is a functional genomics database that performs integrated searches for relevant orthologous genes within each protein family from each species. It presents the known or predicted protein subcellular location, transcript and protein expression levels (for rice and Arabidopsis only), conserved protein domains and regulatory elements, as well as presenting the evolutionary relationship of each gene family member by phylogenetic analyses. Furthermore, the direct comparison with yeast and Arabidopsis import components, in relation to the number of genes within each family, facilitates a greater understanding of the evolutionary path undertaken by these family members and may provide insight into the conserved and divergent functions that may have developed over the course of evolution. Thus, the MPIC DB has been designed as a multileveled tool useful for researchers examining various components of mitochondrial biogenesis and function in multiple species. To present the ease of use and functions available in the MPIC DB, specific examples are shown and novel insights are revealed.

Results and Discussion Integrating information at a single website specific for mitochondrial import components A range of data, specific for the mitochondrial protein import components, has been collated and networked, providing for a more rigorous annotation of orthologs and assembling a range of useful data such as sequence information, expression data (Arabidopsis and rice), localization data (predicted and experimental), protein characteristics and evolutionary features generated specifically for the MPIC DB (Supplementary Table S1). The MPIC DB has been designed to allow information to be viewed in parallel, for all import components from each subunit within each protein complex. This provides for a user-friendly interface catering to different needs, either in the identification of a specific orthologs from a species of interest or for the investigation of genetic and protein characteristics from all species for a single import component.

Identification of orthologs from each gene family To gain a comprehensive view of all mitochondrial import components in a selection of species, all known components in yeast (S. cerevisiae) were compiled (Supplementary Table S2). Considering that there are a number of examples in which plant mitochondrial import components vary significantly or are plant specific, a separate list of Arabidopsis import components was generated based on orthology to

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

However, it should be noted that the initial characterization of the translocase of the outer membrane (TOM) complex in plants was based on biochemical approaches (Jansch et al. 1998). Even with the various specialized databases currently available, accessing and integrating information specifically on the mitochondrial import apparatus is still challenging, particularly due to the lack of comprehensive annotation in newly sequenced species. While recent databases have made significant advances in combining information for rice (Oryza sativa), with improved classifications and gathering of transcript and protein characteristics (Narsai et al. 2013, Sakai et al. 2013), further investigation into the members of larger and highly conserved gene families, such as the translocases of the inner membrane (TIMs), is still warranted, with many members being broadly annotated as ‘translocase of the inner mitochondrial membrane-like’, despite known experimental evidence for some of these components as being plastid localized (Murcha et al. 2007, Pudelski et al. 2012, Rossig et al. 2013). In Arabidopsis, identification of the mitochondrial proteome is well advanced, with the number of experimentally confirmed proteins reaching >800 (Lee et al. 2013). However, this still falls short of the 2,300 estimated mitochondrial proteins and, as such, comprehensive biochemical characterization is required to facilitate the discovery of novel proteins (Welchen et al. 2014). One such study identified 27 novel mitochondrial outer membrane proteins, several of which are candidates for protein import receptors (Duncan et al. 2011). While advances have been made towards the rice (Huang et al. 2009), Medicago truncatula (Dubinin et al. 2011, Kiirika et al. 2013) and potato mitochondrial proteome maps (Salvato et al. 2014), no comparably complete proteome map exists for other plant species. Thus, in silico approaches are required to elucidate the function of predicted mitochondrial import components from direct comparison with known orthologous genes in Arabidopsis which is the best developed model for the plant mitochondrial import apparatus. The existing database pertaining to the mitochondrial import apparatus in plants is far outdated and limited in its content and functions, identifying only putative Arabidopsis orthologs to yeast import components with few functional data and limited interactive capabilities (Lister et al. 2003). Since its publication in 2003, major advances have occurred in the identification of additional import components in both yeast and Arabidopsis. Furthermore, due to improved accessibility and affordability of next-generation sequencing, there has been an escalation in plant genome sequencing, facilitating the generation of an invaluable resource presenting a comprehensive overview of the plant mitochondrial import components. Many databases provide useful sequence information, function, expression annotations and other useful tools for analysis such as TAIR (the Arabidopsis Information Resource) for Arabidopsis (Rhee et al. 2003) and the more recent Rice DB (Narsai et al. 2013) and RAP-DB for rice (Narsai et al. 2013, Sakai et al. 2013), However, difficulty arises when specific groups or gene families are to be analyzed, requiring sequence downloads

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

Fig. 1 A schematic representation of the mitochondrial protein import components. Proteins that are plant specific are indicated with the leaf symbol . Yeast-specific import components are indicated in grsy with the yeast symbol . The four major import pathways undertaken by all mitochondrial proteins are indicated. All precursors initially pass through the translocase of the outer membrane (TOM) complex. Insertion of bbarrel proteins into the outer membrane occurs via the sorting and assembly machinery (SAM) (indicated in purple). Carrier pathway proteins translocate into the inner membrane via the small TIMs through to the TIM22 channel (indicated in orange). Cysteine-rich proteins of the intermembrane space are imported via the MIA pathway (mitochondrial intermembrane space and assembly) (indicated in red). For the majority of proteins (located in the matrix and containing an N-terminal cleavable pre-sequence), translocation occurs via the TIM17:23 channel (indicated in blue). The pre-sequence is removed via mitochondrial processing peptidase (MPP) and the subsequent transit peptide is degraded by matrix-located peptidases.

yeast and after manual literature searching (Supplementary Table S3). In this way, it was possible to obtain an illustrative view of the mitochondrial import machinery of yeast and Arabidopsis in parallel (Fig. 1; an interactive version of this can be seen on the MPIC homepage). This interactive illustration allows users to click on specific components of interest in order to gain further information for those components within the MPIC DB. The parallel display of yeast and plant import components in MPIC easily shows the numerous differences between plant and yeast, such as the yeast-specific subunits Tim12, Tim18 and Tim54, and the plant-specific outer membrane 64 (OM64) protein (Fig. 1; interactive diagram on the MPIC homepage). Using this latest, manually compiled list of Arabidopsis mitochondrial import components (from Supplementary Table S3), all protein sequences for these were used for BLASTP analysis to determine orthologs from Ectocarpus siliculosus (brown algae) and Cyanidioschyzon merolae (red algae) within the

respective databases (Matsuzaki et al. 2004, Sterck et al. 2012) and in 17 plant species ranging from green algae Chlamydomonas reinhardtii, the ancient lycophyte, Selaginella moellendorffii to Eucalyptus grandis, with a representative species chosen from within each taxonomic clade using Phytozome 9.1 (Goodstein et al. 2012) (Fig. 2; Supplementary Table S4; see also ‘Data sources’ on the side panel at the MPIC site). Using this approach, we identified a putative novel plant ortholog to the recently identified yeast mitochondrial import component, Mgr2 (Gebert et al. 2012). This subunit has been shown to be a subunit of the translocase of the inner membrane TIM17:23 in yeast, and acts to promote coupling of the TIM17:23 complex to the TOM and the respiratory chain complex (Gebert et al. 2012). Additionally, this subunit is unique in that its import is mediated by a C-terminal extension that is processed by IMP (inner membrane protease) (Ieva et al. 2013). Whilst the identified Arabidopsis ortholog contains low 3

M. V. Murcha et al. | Plant mitochondrial import component database

similarity and identity percentages (ranging from 30% to 36% similarity) compared with yeast, the presence of the ROMO1 (reactive oxygen species modulator 1) conserved domain and the similar predicted protein mass suggest this to be a putative ortholog that can be targeted for further experimental analysis. Interestingly, the plant orthologs lack the C-terminal extension (as do Caenorhabditis elegans and Homo sapiens) and, while the C-terminal extension increases the import efficiency of Mgr2 from yeast, the mature protein is still capable of import and assembly (Ieva et al. 2013), warranting further experimental characterization of this putative plant ortholog. Database searches were also carried out for the subunits of the ERMES (endoplasmic reticulum–mitochondria encounter structure) and MINOS (mitochondrial inner membrane organizing system) complexes. While the subunit of the ERMES and MINOS complexes (ERMES, MINOS; Fig. 1) have yet to be shown to have a direct role in protein import, these have been implicated in a diverse range of functions ranging from maintenance of mitochondrial morphology, lipid metabolism, metabolite and ion transport, to tethering the mitochondria and the endoplasmic reticulum (van der Laan et al. 2012). The subunits of the ERMES and MINOS complexes were included within MPIC 4

largely due to their known interactions with other components of the protein import apparatus such as Sam50, Mia40 and Tom7 (Meisinger et al. 2004, von der Malsburg et al. 2011, Alkhaja et al. 2012). Notably, there are also some evolutionary relationships to import components, such as that between the ERMES complex subunit, Mdm10, which is part of the larger eukaryotic b-barrel membrane proteins superfamily containing porin, and the translocase of the outer membrane Tom40, the main protein channel of the outer membrane (Flinner et al. 2013). These database searches in MPIC has identified putative plant orthologs to several MINOS subunits such as Fzo1, Mdm1, Mio10 and Mgm1, and provide exciting targets for further experimental analysis, in order to determine the functional role of these proteins in plants, and also possibly to clarify whether these are novel plant subunits of these complexes.

Evolutionary insights into the expansion of the mitochondrial import components from algae to trees Protein sequences were grouped according to their larger gene families, resulting in a total of 1,956 genes and 2,211 locus

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

Fig. 2 An overview of the species phylogeny of all the organisms used in this study. The phylogenetic tree was compiled using the taxonomy database at NCBI and drawn using Phylodendron (http://iubio.bio.indiana.edu/treeapp/treeprint-form.html).

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

Fig. 3 The user-friendly web interface of the MPIC DB. (A) A screenshot showing the MPIC DB interface with links to an interactive diagram of the mitochondrial protein import components. Links are also provided to information regarding data types and sources and the phylogeny tree of species investigated. The right-hand columns show how to search specific import components, either by clicking on the interactive diagram, which will highlight the import component chosen, i.e. Tim17, or by query in the search box. Output is displayed as shown in (B) when ‘Tim17’ is searched. Details of each data type are shown. (C) Selected examples of data output: (i) ExPASy predictions from all Tim17 orthologs from 24 species identified; (ii) phylogenetic tree of Tim17 from 24 species; (iii) conserved protein domains from Tim17; and (iv) Tim17 expression profiles throughout development in Arabidopsis.

identifiers encoding putative mitochondrial import components from 75 different protein families across 24 species (Fig. 2). To demonstrate the data analyzed and presented in MPIC, the translocase of the inner mitochondrial membrane (Tim17) is shown as an example in Fig. 3. First, a screenshot of the interactive mitochondrial import component diagram on the front page of MPIC is shown, with the search box and click functions indicated (Fig. 3A). When Tim17 was entered into the search box or clicked within the interactive diagram (Fig. 3A), the resulting output contains many levels of data, from basic protein attributes such as molecular mass, amino

acid length and pI (in blue), to RNA expression (yellow), as well as detailed protein domain (orange) and localization information (violet) (Fig. 3B). A representation of the resulting protein attributes [Fig. 3C (i)] and protein alignment is shown in Fig. 3C (ii) for the Tim17 orthologs. The percentage similarity and identity scores were analyzed using MatGAT (Campanella et al. 2003), and the resulting matrix is included in MPIC. Phylogenetic analysis for each gene family involved using MEGA5 with 1,000 iterations (see the Materials and Methods; Tamura et al. 2011) and the resulting tree was presented as a branched model with the scores included [e.g. Fig. 3C (ii) for Tim17]. Phylogenetic trees can be viewed by clicking the tree 5

M. V. Murcha et al. | Plant mitochondrial import component database

6

and Amino acid Transporters (PRAT) family that originates from a single eubacterial ancestor (Rassow et al. 1999, Murcha et al. 2007). Notably, a single gene each encodes Tim17, Tim23 and Tim22 in yeast and other lower eukaryotes. In contrast, multiple orthologs exist for each PRAT protein in plants, with three genes encoding Tim17 and Tim23 in Arabidopsis thaliana and four in O. sativa, Zea mays and Brassica rapa (Fig. 4). This expansion is further exemplified by the presence of multiple genes encoding additional PRAT proteins—plant-specific proteins with an unknown function, e.g. PRAT1&2 (encoded by At5g55510 and At4g26670, respectively), PRAT3&4 (encoded by At3g49560 and At5g24650) and PRAT5 (encoded by At3g25120), which were named for the purposes of presentation in MPIC. PRAT3&4 genes are present in green algae C. reinhardtii and Volvox carteri, and all other uncharacterized PRATs appear in the early land plants Physcomitrella patens and S. moellendorffii (Fig. 4). Interestingly, revealing these PRATs as plant specific suggests not only an expansion of known PRAT protein families, such as Tim17 and Tim23 genes, but also a possible neofunctionalization of PRAT proteins in plants. Given the vital role of mitochondrial protein import in plants (and other eukaryotes), it is very useful to have comparative information like this for these genes, as presented in MPIC. Current plant genome databases generalize all PRATs as Tim17/23/22 orthologs, while MPIC provides for a more detailed classification of orthologs assigning all PRAT proteins to their respective sister clades, grouped as Tim17, Tim23, Tim22, B14.7, PRAT1&2, PRAT3&4 and PRAT5. In this way, through the data presented in MPIC, it was also revealed that Tim17, an essential protein in both yeast and Arabidopsis (Dekker et al. 1993, Maarse et al. 1994), is not present in E. grandis. This suggests that other PRATs such as Tim23 may have taken on the role of Tim17, as both of these typically form part of the TIM17:23 translocation channel. Furthermore, M. truncatula appears to lack the pre-sequence translocaseassociated protein import motor 18 (Pam18) and the b-subunit of mitochondrial processing peptidase (b-MPP) and thus it should carefully be examined whether the depth or annotation of the sequence data may have missed these components. It is important to note that while there appears to be an expansion of mitochondrial import component encoding genes in plants, the only exception seen for this is for the Icp55-encoding gene family, where this has three family members in yeast, but is encoded mostly by two genes in several other plant species, raising more questions about how this protein may have changed in evolution from yeast to plants (Fig. 4). Overall, MPIC facilitates the analysis of mitochondrial import components in a manner that reveals gene family expansions, possible neofunctionalizations and gene loss throughout the course of evolution from yeast to higher plants and humans (Figs. 1–4).

Prediction of suborganellar localization and targeting signals Determining the function of a protein first requires insight into its subcellular location but, without experimental evidence,

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

icon, and the specific sequence of each protein can be obtained with each ortholog hyperlinked to their respective source, e.g. TAIR, Rice DB, SGD, Maize GDB or Phytozome. A representation of this function is shown in Fig. 3C (ii). For the identification of conserved, predicted and transmembrane domains within all protein import components in the MPIC database, CDD, InterPro5, DAS and TMHMM were utilized as listed in Supplementary Table S1. The resulting data were compiled from all species and presented within the database in the orange columns [Fig, 3B, C (iii)]. A representative diagram containing the CCD conserved domain prediction was drawn for each gene family from yeast and Arabidopsis [Fig. 3C (iii)]. Details of the domains contained in the yeast (Sc) and Arabidopsis (At) Tim17 are also shown in Fig. 3C (iii), while an example of the expression pattern across Arabidopsis development for all Tim17-encoding genes is shown in 3C (iii). In this way, MPIC can be used quickly and efficiently to display an import component (Fig. 3A, B), present its protein properties [Fig. 3C (i)], reveal the level of conservation across different species [Fig. 3C (ii)], identify other family members/ isoforms [Fig. 3C (iii)] and illustrate which of these shows the highest/lowest level of transcript expression across development [3C (iv)]. Based on this, it can be seen that TIM17-2 appears to be the largest member of the TIM17 family in Arabidopsis, shows the greatest level of conservation with other species and has the highest level of transcript expression (Fig. 3). Thus, the overall knowledge gained from MPIC can be used as a foundation to guide hypotheses about specific import components. One of the most insightful ways to reveal the significance of protein function is to analyze its conservation across different species. To provide an overview of the gain (and loss) of mitochondrial import components during eukaryotic divergence, the number of genes within each family for each species was represented as a heat map diagram (Fig. 4, right panel on the homepage). The heat map provides a global view of the increase in gene number from early eukaryotic evolution through to land plant evolution, highlighting the increase in complexity and expansion of many components of the protein import apparatus from moss to monocots and dicots (Fig. 4). It can immediately be seen that most mitochondrial import components are conserved across the 24 different species in MPIC, indicating the significance of their function (gray boxes; Fig. 4). Interactive results for any or all of these species can be included in MPIC search results by ticking the relevant species of interest in the left-hand panel on the homepage. Several import components were also revealed to be specific only to non-plant species, particularly yeast-specific components that are not present in any of the plant or algal species (pink boxes; Fig. 4). By carrying out this analysis for MPIC, there were several interesting findings, including revealing that several import components are plant specific, such as TOM9, OM64, Tim21-like protein, PRAT1&2, PRAT3&4, PRAT5 and OOP (blue boxes; Fig. 4). This analysis also revealed a large plant-specific expansion of the inner membrane translocase families, Tim17 and Tim23, which belong to a larger gene family termed the PReprotein

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

Fig. 4 Gene family members for import components from yeast to plants. Phylogenetic distribution of the mitochondrial protein import component-encoding genes from yeast across 24 organisms. The numbers of orthologs in each gene family are highlighted by color intensity (light to dark green) based on the number of genes. The letter code represents each species, and the color code in the first column indicates conservation. For Arabidopsis, published phenotype information is also indicated, whereby knocking out one or more of these genes results in an embryo-lethal phenotype (indicated by an asterisk*) or delayed growth (indicated by ^). ERMES, endoplasmic reticulum–mitochondria encounter structure, MINOS, mitochondrial inner membrane organizing system.

prediction programs are a useful starting point, with various algorithms available specifically for mitochondrial proteins (Claros 1995, Emanuelsson et al. 2000, Boden and Hawkins 2005). These programs must be used with caution as they are

based on the assumption of proteins containing cleavable Nterminal targeting peptides, and only approximately 70% of mitochondrial proteins contain these (Vogtle et al. 2009). Thus, predictions do not account for proteins with internal 7

M. V. Murcha et al. | Plant mitochondrial import component database

expression across development. When expression of all the mitochondrial import components is viewed in parallel for Arabidopsis, it rapidly becomes apparent that the genes encoding import components are highly expressed during germination (Supplementary Table S5). Having the expression data in the MPIC DB like this allows users easily to determine which genes are most highly expressed at the transcript level. This can be particularly useful, as distinct relationships between these have been observed previously. For example, of the three genes encoding TIM17, Tim17-2 is the most abundant at the transcript level compared with Tim17-1 and Tim17-3 (Supplementary Table S5). Tim17-2 is also the most highly abundant protein isoform of the three, and the only one that has been shown to be essential for seed viability in Arabidopsis (Murcha et al. 2014b, Wang et al. 2014). Similarly, Tom40-1 is the most abundant isoform of Tom40 at both the transcript and protein levels, and it has also been shown to be essential for seed viability in Arabidopsis (Murcha et al. 2014b). Even when this is not the case, it is still useful for researchers to know which of the isoforms show the highest level of transcript expression to determine if neo-functionalization or neo-specialization may exist within gene families. The MPIC DB shows expression data for each import component for both Arabidopsis and rice [Fig. 3C (iv); Supplementary Tables S5, S6].

Putative and known regulatory elements of genes encoding mitochondrial import components in Arabidopsis As can be seen for both rice and Arabidopsis, there are patterns in the expression of genes encoding the mitochondrial import components (Supplementary Tables S5, S6). Considering this, the MPIC DB also provides data regarding putative and known regulatory elements within upstream regions of each gene encoding import components. This allows analysis of coregulated genes sourced from Athena A. thaliana expression network analysis (O’Connor et al. 2005). The MPIC DB also provides direct links to the geneMANIA web server (WardeFarley et al. 2010), providing a global view of all co-expressed genes or experimental interactome evidence. Having information on transcript expression and putative regulatory elements can help with more targeted experimental designs to determine the function and regulation of mitochondrial import components. Thus, in this way, MPIC maximizes the possible information that can be gained about import components by providing links to other useful databases.

Conclusions Expression analysis for mitochondrial import component genes in Arabidopsis and rice Given the expansion of gene families in plants compared with yeast for mitochondrial import components, it is not surprising that the pattern of tran script expression varies between the different members of each family. Fig. 3C (iv) shows the expression of the three genes encoding Tim17 in Arabidopsis, whereby it can be seen that Tim17-1 (At1g20350) showed the lowest 8

We have established a multitude of functional data relating to mitochondrial protein import components from yeast, human, algae and plant lineages ranging from green algae to trees. Researchers can easily access and search this information, to reveal orthologs and transcriptomic, proteomic and evolutionary data relevant to each species in one database. Thus, our MPIC DB presents a simple and centralized data resource for all components of the mitochondrial import apparatus that can

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

non-cleavable targeting signals. Furthermore, some proteins can have multiple locations, such as Mia40, which is dual located both in mitochondria and in peroxisomes of higher plants such as Arabidopsis and rice (Carrie et al. 2010a), but not in the moss P. patens (Carrie et al. 2010a, Xu et al. 2013). This dual-targeting ability also appears to have been acquired alongside the dual targeting of the Mia40 substrate proteins superoxide dismutase (CSD1) and copper chaperone for SOD1 (Ccs1) in Arabidopsis (Xu et al. 2013), which suggests that in plants Mia40 has extended its function to peroxisomes though its exact role has yet to be determined (Carrie et al. 2010a). Several protein transporters have also been found to be located in both mitochondria and chloroplasts such as PRAT3 and the translocase of the inner chloroplast membrane 20 (Tic20), which have been shown to be dual localized in both organelles by green fluorescent protein (GFP) tagging (Murcha et al. 2007, Kasmati et al. 2011). Furthermore, PRAT3 was also shown to be located within the mitochondrial outer membrane by proteomic approaches (Duncan et al. 2011). Note that for PRAT3, one study has reported a chloroplast-only location and a role in the import of proteins that do not contain cleavable transit peptides (Rossig et al. 2013); thus, with the observed expansion of this gene family, specific studies are required to define accurately the roles of these proteins. Furthermore, the high false-positive rate of subcellular location predictors was recently highlighted, with low overlap observed between predicted and experimentally confirmed data sets (Narsai et al. 2013). Thus, experimental evidence pertaining to all published locations including confirmed intramitochondrial localization, such as by tandem mass spectrometry (MS/MS), GFP targeting and in vitro import assays, is listed for each import component, after extensive manual searching and collating information from the latest literature. Furthermore, links to localization information and the respective publication via NCBI (National Center for Biotechnology Information) is provided for all Arabidopsis components via SUBA (Tanz et al. 2013) and all rice components via Rice DB (Narsai et al. 2013). The MPIC DB includes results from six targeting prediction algorithms displayed side by side in a matrix with all included species for comparative evaluation, as not all species have respective databases providing collated experimentally confirmed localization like SUBA and Rice DB. The main data table of the MPIC DB summarizes these predictions alongside the availability of experimentally shown location data. Note that all protein localization data in MPIC will be periodically updated as further experimental evidence becomes available.

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186

be used to gain insight into each gene/protein function(s) throughout evolution.

Materials and Methods Database design

Identification and annotation of mitochondrial import components Orthologs to yeast mitochondrial import components (Supplementary Table S2) were downloaded from the Saccharomyces Genome Database (Cherry et al. 1998) and used to identify direct orthologs by sequence homology using BLASTP (Altschul and Koonin 1998) in Homo sapiens, Caenorhabditis elegans, Dictyostelium discoideum, Cyanidioschyzon merolae, Ectocarpus siliculosus, Chlamydomonas reinhardtii, Volvox carteri and Arabidopsis thaliana from each respective database source as listed in Supplementary Table S4. For the identification of plant orthologs, Phytozome 9.1 TBLASTN was utilized using the compiled lists of yeast and Arabidopsis mitochondrial import components (Supplementary Tables S2, S3) using default settings. In addition to BLAST searches against each respective genomic database, a profile hidden Markov model-based HMMER method (Wang et al. 2014) was applied to search the UniProtKB protein database. An e-value threshold of 1e–10 was used for the majority of import components. For gene families that exhibited high conservation to other known proteins, the threshold was raised to 1e-50– 100 to identify true orthologs correctly and remove related proteins. Output data were further manually curated with regards to predicted protein size, identification of protein motifs, transmembrane domains and shared structural characteristics. For small proteins such as Tom5, 6, 7 and Tim8, 9, 10 and 13, a set e-value threshold was not applied and instead putative orthologs were chosen based on sequence identity and protein length similarity.

Molecular weight and pI Putative molecular weights and theoretical pI (isoelectric point) were calculated using ExPASy (Bjellqvist et al. 1993) for the protein sequences curated for the MPIC DB data table.

Subcellular localization

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

The MPIC DB consists of an indexed collection of import components and data types running as an in-browser JavaScript web application. The database files are formatted as HTML (website), JSON (data) and JavaScript (interactivity) supplemented with interactive images in PDF and SVG (graphics), which were generated from source data with algorithms written in the Scala programming language. See Supplementary Methods S1 for details. The MPIC DB interactive website presents users with a scrollable schematic illustration of mitochondrial import components and pathways, plus a data table that offers in-built sorting and filtering. The schematic illustration is overlaid with an HTML image map outlining each import component. In compatible web browsers (those with ‘HTML5’ features) this image map is augmented with an SVG overlay allowing the user to highlight individual components and families for database extraction. Although the MPIC DB website is primarily designed for desktop computers, interactive features have been designed also to work on popular mobile tablets. The data table displays basic information about each component, such as its nominal location and model database protein (Arabidopsis or yeast), with colorcoded icons linking to the collated and curated MPIC DB content and hyperlinks to other online resources such as TAIR (for Arabidopsis model details). Color coding, data icons and species icons were applied throughout the MPIC DB. For comprehensive cross-species comparisons available only in the MPIC DB, the collated and curated MPIC DB content was formatted as data downloads, web pages and PDFs with novel interactive features such as interactive MatGAT.

microarray data sets that were downloaded from the Gene Expression Omnibus or MIAME Array Express Databases (for each species) and analyzed using the same method as in previous studies (Narsai et al. 2011). The expression data shown in the MPIC DB represent exactly a subset of the expression data analyzed for the Rice DB (Narsai et al. 2013). To summarize, the data for each species were imported and normalized together for consistency, and the expression levels shown represent the GC-RMA-normalized expression intensities (log2) across development, for both rice and Arabidopsis. The data sets included for Arabidopsis were for all wild-type plants analyzed in the AtGenExpress developmental series (E-AFMX-9) (Schmid et al. 2005) and a germination time course (GSE30223; Narsai et al. 2011), resulting in a total of 73 tissues analyzed. For rice, the data sets included were aerobic germination (EMEXP-1766; Howell et al. 2009), anaerobic germination (E-MEXP-2267; Narsai et al. 2009), coleoptiles (GSE6908; Lasanthi-Kudahettige et al. 2007), embryo, endosperm, leaf and root (GSE11966; Xue et al. 2009), flower, stigma and ovary (GSE7951; Li et al. 2007) and leaf and inflorescence (GSE6893; Jain et al. 2007), resulting in 41 tissues analyzed across development. See the Rice DB expression datapage for details: (http://ricedb.plantenergy.uwa.edu.au/about/data/ PebRnaExpressionProfiles_)

To determine subcellular location, several methods were utilized; publications presenting experimental evidence (including GFP localization, in vitro import assays, immunodetection and proteomic analysis where available), computational prediction programs (Supplementary Table S1) and the location of direct orthologs in Arabidopsis and rice according to SUBA (Tanz et al. 2013), Rice DB (Narsai et al. 2013) and GelMap (Klodmann et al. 2011). Information indicating a proteins intramitochondrial experimentally-shown locations were collated into stand-alone tables and summarized by icon in the main data table. Computational targeting prediction programs were run en bloc for all included species using each algorithm’s default parameters applicable per species kingdom (Claros 1995, Emanuelsson et al. 1999, Emanuelsson et al. 2000, Neuberger et al. 2003, Small et al. 2004, Boden and Hawkins 2005). The highest scoring matches from each predictor were collated across all included species and are presented as a stand-alone matrix with iconic summarization in the main data table. Intramitochondrial locations were curated from experimental publications or from putative orthology to the experimentally shown intramitochondrial location of their direct yeast orthologs (Supplementary Tables S2, S3). These locations are depicted in the MPIC DB’s interactive diagram and are annotated in the main data table.

Phylogenetic analysis Protein alignments were carried out using Clustal Omega (Sievers et al. 2011) with default settings, and the corresponding alignment was shaded using boxshade v3.2.1 (http://www.ch.embnet.org/software/BOX_form.html). Percentage similarity and identity scores were generated using MatGAT (Campanella et al. 2003). Phylogenetic analysis was carried out using MEGA 5.2.2 (Tamura et al. 2011). The full-length protein alignments were first generated using the Clustal option and the phylogenetic tree was created using the maximum likelihood tree method and the Jones–Thornton–Taylor model after 1,000 replications.

Domain structure Protein domains and motifs were analyzed using programs (e.g. InterPro5 and CDD) as listed in Supplementary Table S1 with default settings unless otherwise stated (Schultz et al. 1998, Zdobnov and Apweiler 2001, Gough et al. 2001, Mi et al. 2005, de Castro et al. 2006, Finn et al. 2010, Marchler-Bauer et al. 2011, Attwood et al. 2012, Lees et al. 2014). Transmembrane regions were determined using DAS (Cserzo et al. 1997) and TMHMM (Krogh et al. 2001) as detailed in Supplementary Table S1.

Transcript data: microarray analysis for expression annotations

Data sources

The expression data shown for Arabidopsis and rice in the MPIC DB represent gene expression across plant development based on a range of publicly available

To view all the data sources used in the MPIC DB, refer to Supplementary Tables S1 and S4.

9

M. V. Murcha et al. | Plant mitochondrial import component database

Availability and requirements Operating system: platform independent (tested on Windows, Linux, Mac and iOS) Other requirements: web browser (tested on Chrome, Safari, Firefox and Internet Explorer) The database can be accessed at: http://www.plantenergy.uwa.edu.au/applications/mpic

Supplementary data Supplementary data are available at PCP online.

Funding

Acknowledgments M.W.M. and J.W. designed and implemented the database. M.W.M., R.N. and J.D. were responsible for data collection. J.D. constructed the web interface and online tools. S.K.J. and J.D. were responsible for graphics and all authors were involved in the writing of the manuscript.

Disclosures The authors have no conflicts of interest to declare.

References Alkhaja, A.K., Jans, D.C., Nikolov, M., Vukotic, M., Lytovchenko, O., Ludewig, F. et al. (2012) MINOS1 is a conserved component of mitofilin complexes and required for mitochondrial function and cristae organization. Mol. Biol. Cell 23: 247–257. Altschul, S.F. and Koonin, E.V. (1998) Iterated profile searches with PSI-BLAST—a tool for discovery in protein databases. Trends Biochem. Sci. 23: 444–447. Attwood, T.K., Coletta, A., Muirhead, G., Pavlopoulou, A., Philippou, P.B., Popov, I. et al. (2012) The PRINTS database: a fine-grained protein sequence annotation and analysis resource—its status in 2012. Database (Oxford) 2012: bas019. Bjellqvist, B., Hughes, G.J., Pasquali, C., Paquet, N., Ravier, F., Sanchez, J.C. et al. (1993) The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis 14: 1023–1031. Boden, M. and Hawkins, J. (2005) Prediction of subcellular localization using sequence-biased recurrent networks. Bioinformatics 21: 2279–2286. Campanella, J.J., Bitincka, L. and Smalley, J. (2003) MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences. BMC Bioinformatics 4: 29. Carrie, C., Giraud, E., Duncan, O., Xu, L., Wang, Y., Huang, S. et al. (2010a) Conserved and novel functions for Arabidopsis thaliana MIA40 in assembly of proteins in mitochondria and peroxisomes. J. Biol. Chem. 285: 36138–36148. Carrie, C., Murcha, M.W. and Whelan, J. (2010b) An in silico analysis of the mitochondrial protein import apparatus of plants. BMC Plant Biol. 10: 249.

10

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

This work was supported by the Australian Research Council [Future Fellowship FT130100112 to M.W.M.; Discovery Project grant DP130102384 to J.W.; Centre of Excellence in Plant Energy Biology grant CE140100008 to J.W.].

Cherry, J.M., Adler, C., Ball, C., Chervitz, S.A., Dwight, S.S., Hester, E.T. et al. (1998) SGD: Saccharomyces Genome Database. Nucleic Acids Res. 26: 73–79. Claros, M.G. (1995) MitoProt, a Macintosh application for studying mitochondrial proteins. Comput. Appl. Biosci. 11: 441–447. Cserzo, M., Wallin, E., Simon, I., von Heijne, G. and Elofsson, A. (1997) Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng. 10: 673–676. de Castro, E., Sigrist, C.J., Gattiker, A., Bulliard, V., LangendijkGenevaux, P.S., Gasteiger, E. et al. (2006) ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 34: W362–W365. Dekker, P.J., Keil, P., Rassow, J., Maarse, A.C., Pfanner, N. and Meijer, M. (1993) Identification of MIM23, a putative component of the protein import machinery of the mitochondrial inner membrane. FEBS Lett. 330: 66–70. Dubinin, J., Braun, H.P., Schmitz, U. and Colditz, F. (2011) The mitochondrial proteome of the model legume Medicago truncatula. Biochim. Biophys. Acta 1814: 1658–1668. Duncan, O., Murcha, M.W. and Whelan, J. (2013) Unique components of the plant mitochondrial protein import apparatus. Biochim. Biophys. Acta 1833: 304–313. Duncan, O., Taylor, N.L., Carrie, C., Eubel, H., Kubiszewski-Jakubiak, S., Zhang, B. et al. (2011) Multiple lines of evidence localize signaling, morphology, and lipid biosynthesis machinery to the mitochondrial outer membrane of Arabidopsis. Plant Physiol. 157: 1093–1113. Emanuelsson, O., Nielsen, H., Brunak, S. and von Heijne, G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300: 1005–1016. Emanuelsson, O., Nielsen, H. and von Heijne, G. (1999) ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci. 8: 978–984. Finn, R.D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J.E. et al. (2010) The Pfam protein families database. Nucleic Acids Res. 38: D211–D222. Flinner, N., Ellenrieder, L., Stiller, S.B., Becker, T., Schleiff, E. and Mirus, O. (2013) Mdm10 is an ancient eukaryotic porin co-occurring with the ERMES complex. Biochim. Biophys. Acta 1833: 3314–3325. Gebert, M., Schrempp, S.G., Mehnert, C.S., Heisswolf, A.K., Oeljeklaus, S., Ieva, R. et al. (2012) Mgr2 promotes coupling of the mitochondrial presequence translocase to partner complexes. J. Cell Biol. 197: 595–604. Goodstein, D.M., Shu, S., Howson, R., Neupane, R., Hayes, R.D., Fazo, J. et al. (2012) Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 40: D1178–D1186. Gough, J., Karplus, K., Hughey, R. and Chothia, C. (2001) Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J. Mol. Biol. 313: 903–919. Howell, K.A., Narsai, R., Carroll, A., Ivanova, A., Lohse, M., Usadel, B. et al. (2009) Mapping metabolic and transcript temporal switches during germination in rice highlights specific transcription factors and the role of RNA instability in the germination process. Plant Physiol. 149: 961–980. Huang, S., Taylor, N.L., Narsai, R., Eubel, H., Whelan, J. and Millar, A.H. (2009) Experimental analysis of the rice mitochondrial proteome, its biogenesis, and heterogeneity. Plant Physiol. 149: 719–734. Ieva, R., Heisswolf, A.K., Gebert, M., Vogtle, F.N., Wollweber, F., Mehnert, C.S. et al. (2013) Mitochondrial inner membrane protease promotes assembly of presequence translocase by removing a carboxy-terminal targeting sequence. Nat. Commun. 4: 2853. Jain, M., Nijhawan, A., Arora, R., Agarwal, P., Ray, S., Sharma, P. et al. (2007) F-box proteins in rice. Genome-wide analysis, classification, temporal and spatial gene expression during panicle and seed development, and regulation by light and abiotic stress. Plant Physiol. 143: 1467–1483.

Plant Cell Physiol. 56(1): e10(1–12) (2015) doi:10.1093/pcp/pcu186

Narsai, R., Law, S.R., Carrie, C., Xu, L. and Whelan, J. (2011) In-depth temporal transcriptome profiling reveals a crucial developmental switch with roles for RNA processing and organelle metabolism that are essential for germination in Arabidopsis. Plant Physiol. 157: 1342–1362. Neuberger, G., Maurer-Stroh, S., Eisenhaber, B., Hartig, A. and Eisenhaber, F. (2003) Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences. J. Mol. Biol. 328: 567–579. O’Connor, T.R., Dyreson, C. and Wyrick, J.J. (2005) Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences. Bioinformatics 21: 4411–4413. Pudelski, B., Schock, A., Hoth, S., Radchuk, R., Weber, H., Hofmann, J. et al. (2012) The plastid outer envelope protein OEP16 affects metabolic fluxes during ABA-controlled seed development and germination. J. Exp. Bot. 63: 1919–1936. Rassow, J., Dekker, P.J., van Wilpe, S., Meijer, M. and Soll, J. (1999) The preprotein translocase of the mitochondrial inner membrane: function and evolution. J. Mol. Biol. 286: 105–120. Rhee, S.Y., Beavis, W., Berardini, T.Z., Chen, G., Dixon, D., Doyle, A. et al. (2003) The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. 31: 224–228. Rossig, C., Reinbothe, C., Gray, J., Valdes, O., von Wettstein, D. and Reinbothe, S. (2013) Three proteins mediate import of transit sequence-less precursors into the inner envelope of chloroplasts in Arabidopsis thaliana. Proc. Natl Acad. Sci. USA 110: 19962–19967. Sakai, H., Lee, S.S., Tanaka, T., Numa, H., Kim, J., Kawahara, Y. et al. (2013) Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics. Plant Cell Physiol. 54: e6. Salvato, F., Havelund, J.F., Chen, M., Rao, R.S., Rogowska-Wrzesinska, A., Jensen, O.N. et al. (2014) The potato tuber mitochondrial proteome. Plant Physiol. 164: 637–653. Schmid, M., Davison, T.S., Henz, S.R., Pape, U.J., Demar, M., Vingron, M. et al. (2005) A gene expression map of Arabidopsis thaliana development. Nat. Genet. 37: 501–506. Schultz, J., Milpetz, F., Bork, P. and Ponting, C.P. (1998) SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl Acad. Sci. USA 95: 5857–5864. Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W. et al. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7: 539. Small, I., Peeters, N., Legeai, F. and Lurin, C. (2004) Predotar: a tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 4: 1581–1590. Sterck, L., Billiau, K., Abeel, T., Rouze, P. and Van de Peer, Y. (2012) ORCAE: online resource for community annotation of eukaryotes. Nat. Methods 9: 1041. Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28: 2731–2739. Tanz, S.K., Castleden, I., Hooper, C.M., Vacher, M., Small, I. and Millar, H.A. (2013) SUBA3: a database for integrating experimentation and prediction to define the SUBcellular location of proteins in Arabidopsis. Nucleic Acids Res. 41: D1185–D1191. van der Laan, M., Bohnert, M., Wiedemann, N. and Pfanner, N. (2012) Role of MINOS in mitochondrial membrane architecture and biogenesis. Trends Cell Biol. 22: 185–192. Vogtle, F.N., Wortelkamp, S., Zahedi, R.P., Becker, D., Leidhold, C., Gevaert, K. et al. (2009) Global analysis of the mitochondrial N-proteome identifies a processing peptidase critical for protein stability. Cell 139: 428–439. von der Malsburg, K., Muller, J.M., Bohnert, M., Oeljeklaus, S., Kwiatkowska, P., Becker, T. et al. (2011) Dual role of mitofilin in

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

Jansch, L., Kruft, V., Schmitz, U.K. and Braun, H.P. (1998) Unique composition of the preprotein translocase of the outer mitochondrial membrane from plants. J. Biol. Chem. 273: 17251–17257. Kasmati, A.R., Topel, M., Patel, R., Murtaza, G. and Jarvis, P. (2011) Molecular and genetic analyses of Tic20 homologues in Arabidopsis thaliana chloroplasts. Plant J. 66: 877–889. Kiirika, L.M., Behrens, C., Braun, H.P. and Colditz, F. (2013) The mitochondrial complexome of Medicago truncatula. Front. Plant Sci. 4: 84. Klodmann, J., Senkler, M., Rode, C. and Braun, H.P. (2011) Defining the protein complex proteome of plant mitochondria. Plant Phsyiol. 157: 587–598. Krogh, A., Larsson, B., von Heijne, G. and Sonnhammer, E.L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305: 567–580. Lasanthi-Kudahettige, R., Magneschi, L., Loreti, E., Gonzali, S., Licausi, F., Novi, G. et al. (2007) Transcript profiling of the anoxic rice coleoptile. Plant Physiol. 144: 218–231. Lee, C.P., Taylor, N.L. and Millar, A.H. (2013) Recent advances in the composition and heterogeneity of the Arabidopsis mitochondrial proteome. Front. Plant Sci. 4: 4. Lees, J.G., Lee, D., Studer, R.A., Dawson, N.L., Sillitoe, I., Das, S. et al. (2014) Gene3D: multi-domain annotations for protein sequence and comparative genome analysis. Nucleic Acids Res. 42: D240–D245. Li, M., Xu, W., Yang, W., Kong, Z. and Xue, Y. (2007) Genome-wide gene expression profiling reveals conserved and novel molecular functions of the stigma in rice. Plant Physiol. 144: 1797–1812. Lister, R., Murcha, M.W. and Whelan, J. (2003) The Mitochondrial Protein Import Machinery of Plants (MPIMP) database. Nucleic Acids Res. 31: 325–327. Maarse, A.C., Blom, J., Keil, P., Pfanner, N. and Meijer, M. (1994) Identification of the essential yeast protein MIM17, an integral mitochondrial inner membrane protein involved in protein import. FEBS Lett. 349: 215–221. Marchler-Bauer, A., Lu, S., Anderson, J.B., Chitsaz, F., Derbyshire, M.K., DeWeese-Scott, C. et al. (2011) CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 39: D225–D229. Matsuzaki, M., Misumi, O., Shin, I.T., Maruyama, S., Takahara, M., Miyagishima, S.Y. et al. (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428: 653–657. Meisinger, C., Rissler, M., Chacinska, A., Szklarz, L.K., Milenkovic, D., Kozjak, V. et al. (2004) The mitochondrial morphology protein Mdm10 functions in assembly of the preprotein translocase of the outer membrane. Dev. Cell 7: 61–71. Mi, H., Lazareva-Ulitsky, B., Loo, R., Kejariwal, A., Vandergriff, J., Rabkin, S. et al. (2005) The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 33: D284–D288. Murcha, M.W., Elhafez, D., Lister, R., Tonti-Filippini, J., Baumgartner, M., Philippar, K. et al. (2007) Characterization of the preprotein and amino acid transporter gene family in Arabidopsis. Plant Physiol. 143: 199–212. Murcha, M.W., Kmiec, B., Kubiszewski-Jakubiak, S., Teixeira, P.F., Glaser, E. and Whelan, J. (2014a) Protein import into plant mitochondria: signals, machinery, processing, and regulation. J. Exp. Bot. 65: 6301–6335. Murcha, M.W., Wang, Y., Narsai, R. and Whelan, J. (2014b) The plant mitochondrial protein import apparatus—the differences make it interesting. Biochim. Biophys. Acta 1840: 1233–1245. Narsai, R., Devenish, J., Castleden, I., Narsai, K., Xu, L., Shou, H. et al. (2013) Rice DB: an Oryza information portal linking annotation, subcellular location, function, expression, regulation, and evolutionary information for rice and Arabidopsis. Plant J. 76: 1057–1073. Narsai, R., Howell, K.A., Carroll, A., Ivanova, A., Millar, A.H. and Whelan, J. (2009) Defining core metabolic and transcriptomic responses to oxygen availability in rice embryos and young seedlings. Plant Physiol. 151: 262–274.

11

M. V. Murcha et al. | Plant mitochondrial import component database

mitochondrial membrane organization and protein biogenesis. Dev. Cell 21: 694–707. Wang, Y., Law, S.R., Ivanova, A., van Aken, O., Kubiszewski-Jakubiak, S., Narsai, R. et al. (2014) The mitochondrial protein import component, translocase of the inner membrane 17-1 (Tim17-1) plays a role in defining the timing of germination in Arabidopsis thaliana. Plant Physiol. 166: 1429–1435. Warde-Farley, D., Donaldson, S.L., Comes, O., Zuberi, K., Badrawi, R., Chao, P. et al. (2010) The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38: W214–W220.

Welchen, E., Garcia, L., Mansilla, N. and Gonzalez, D.H. (2014) Coordination of plant mitochondrial biogenesis: keeping pace with cellular requirements. Front. Plant Sci. 4: 551. Xu, L., Carrie, C., Law, S.R., Murcha, M.W. and Whelan, J. (2013) Acquisition, conservation, and loss of dual-targeted proteins in land plants. Plant Physiol. 161: 644–662. Xue, L.J., Zhang, J.J. and Xue, H.W. (2009) Characterization and expression profiles of miRNAs in rice seeds. Nucleic Acids Res. 37: 916–930. Zdobnov, E.M. and Apweiler, R. (2001) InterProScan—an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17: 847–848.

Downloaded from http://pcp.oxfordjournals.org/ by guest on March 22, 2015

12

MPIC: a mitochondrial protein import components database for plant and non-plant species.

In the 2 billion years since the endosymbiotic event that gave rise to mitochondria, variations in mitochondrial protein import have evolved across di...
2MB Sizes 0 Downloads 5 Views