Article

Protein Science DOI 10.1002/pro.3193

Structural conservation of the PIN domain active site across all domains of life M. Senissar, M. C. Manav, and D. E. Brodersen* Centre for Bacterial Stress Response and Persistence, Department of Molecular Biology and Genetics, Aarhus University, Gustav Wieds Vej 10c, DK-8000, Denmark *

To whom correspondence should be addressed. E-mail: [email protected], phone: +45

21669001, ORCID ID: 0000-0002-5413-4667

Short title: Structural conservation of the PIN domain

Abbreviations: PIN domain, Pil N-terminus domain; TA, toxin-antitoxin; VapBC, Virulence associated proteins B and C; NMD, nonsense-mediated decay; RNAi, RNA interference; PTC, premature termination codon; RITS, RNA-induced initiation of transcriptional gene silencing; MCPIP1, MCP 1-induced protein 1; miRNA, micro RNA; ASU, asymmetric unit; SRL, sarcin-ricin loop

Keywords: PIN domain, ribonuclease, RNA-binding protein, toxin-antitoxin, enzyme structure

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1002/pro.3193 © 2017 The Protein Society Received: Mar 17, 2017; Revised: May 08, 2017; Accepted: May 08, 2017 This article is protected by copyright. All rights reserved.

Protein Science

ABSTRACT The PIN domain (Pil N-terminus) is a compact RNA-binding protein domain present in all domains of life. This 120-residue domain consists of a central and parallel β sheet surrounded by α helices, which together organise 4-5 acidic residues in an active site that binds one or more divalent metal ions and in many cases has endoribonuclease activity. In bacteria and archaea, the PIN domain is primarily associated with toxin-antitoxin loci, consisting of a toxin (the PIN domain nuclease) and an antitoxin that inhibits the function of the toxin under normal growth conditions. During nutritional or antibiotic stress, the antitoxin is proteolytically degraded causing activation of the PIN domain toxin leading to a dramatic reprogramming of cellular metabolism to cope with the new conditions. In eukaryotes, PIN domains are commonly found as parts of larger proteins and are involved in a range of processes involving RNA cleavage, including ribosomal RNA biogenesis and nonsensemediated mRNA decay. In this review, we provide a comprehensive overview of the structural characteristics of the PIN domain and compare PIN domains from all domains of life in terms of structure, active site architecture, and activity.

2 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 2 of 41

Page 3 of 41

Protein Science

INTRODUCTION The PIN (Pil N-terminus) domain is a compact, approximately 120-140 residue protein domain originally described as being required for type IV pilus fibre polymerisation in motile, Gram-negative bacteria.1 However, comparative protein sequence analysis has revealed that PIN domains are conserved from archaea2 to eukaryotes3 and in most cases not associated with cell motility. Currently, the PIN domain family as defined by the PFAM01850 class (NCBI COG 4113) contains nearly 5,000 sequences.4,5 Conservation of 45 acidic residues distributed throughout the sequence suggested early on an ability to bind divalent metal ions and sequence homology to a number of proteins involved in eukaryotic nonsense-mediated decay (NMD) and the 5'-3' exonuclease domains of DNA polymerases further suggested that PIN domains would bind RNA and be active as nucleases.3 Based on this as well as secondary structure prediction, the PIN domain was proposed to share its three-dimensional structure with T4 RNase H, the enzyme that harbours the 5'-3' exonuclease activity of the DNA polymerase machinery required for lagging strand RNA primer removal in organisms with multi-enzyme DNA replication systems.6 Crucially, the T4 RNase H reaction mechanism is not dependent on a free RNA 5'-phosphate group and the enzyme is therefore in practice functional as an endonuclease.7 This family also includes the Flap endonuclease-1 (Fen-1 nuclease) enzymes, which harbour the corresponding eukaryotic 5'-3' exonuclease activity required for DNA replication and repair. Fen-1 nucleases are structurespecific endonucleases capable of cleaving single-stranded DNA or RNA at the bifurcated end of a base-paired duplex8 and together with the polymerase 5'-3' exonuclease domains and the PIN domains, they constitute the PIN-domain like superfamily [CATH 3.40.50.1010, Fig. 1(A)].9 Members of this group share a common core fold consisting of a three-layered α/β/α core motif with a central, parallel five-stranded β-sheet [Fig. 1(B), 1(C)]. A characteristic feature of the PIN domain subgroup (SC:2) is that they show minimal sequence conservation but relatively high structural similarity. The core elements of the structure are thus conserved, including the central β-sheet and the N-terminal parts of α1, α3, and α6 as well as the loop following β4.10 PIN domains are ubiquitous in all branches of life and as a general rule, archaeal and bacterial PIN domains are found as stand-alone proteins, typically acting as the toxin element of toxin-antitoxin systems while eukaryotic PIN domains tend to exist as parts of larger proteins and tightly regulated through protein-protein interactions as well as subcellular

3 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

localisation.2 Overall, PIN domains are emerging as central players in a range of cellular processes including eukaryotic RNA degradation and surveillance and bacterial stress response and pathogenesis.11 Despite the availability of a range of structures of PIN domains several key questions remain, including how are they able to fold into such similar structures despite very low sequence identity, what is the basis for recognition of very specific, structured RNA targets and finally, what are the details of the catalytic mechanism? In this paper, we provide a comprehensive review of known structures of PIN domain proteins from all branches of life and present the current state of knowledge regarding their function. PIN DOMAINS IN ARCHAEA Before any PIN domain structures were determined, comparative genomics clearly identified the domain as relatively common in archaea and also found that many species encode multiple copies of PIN domains, both as stand-alone proteins and as parts of larger proteins.2 Thus, the hyperthermophilic archaeon, Sulfolobus solfataricus, encodes at least 26 PIN domain proteins most of which we now know constitute the toxin part of a set of bona fide toxin-antitoxin systems that have been shown to be involved in adaptation of cell growth to thermal stress conditions.12 Moreover, in several archaea, such as Acidianus hospitalis, toxinantitoxin gene pairs of the vapBC type have been shown to surround the CRISPR/Cas/Cmr immunity systems and have been proposed to maintain CRISPR components in the genome and help the cell to adapt to environmental stress.13 In 2004, the first structure of a PIN domain protein, the 133-residue PAE2754 from the hyperthermophilic crenarchaeon, Pyrobaculum aerophilum, confirmed the predicted structural homology with T4 RNase H consisting of a compact five-stranded, parallel and highly twisted β-sheet surrounded by seven α-helices [PDB 1V8O; Fig. 2(A), 2(B)].14 Clustering of a set of conserved, acidic residues (Asp8, Glu38, Asp92, and Asp110) in a similar position to the active site of T4 RNase H and the demonstration of Mg2+/Mn2+-dependent nuclease activity on a doublestranded DNA substrate containing a 5' single-stranded overhang in vitro further confirmed the predicted functional homology to the 5'-3' exonuclease domains. Apart from the acidic residues, this structure also identified a highly conserved threonine residue close to the active site, likely critical for substrate recognition [Fig. 2(C), 8(C)].3,14 PAE2754 forms a dimer of dimers in the crystal with the four active sites located on the inside. This observation was used to conclude that the enzyme would be specific towards 5' overhanging DNA segments and likely involved in DNA/RNA editing.14 One of the dimer interfaces is significantly

4 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 4 of 41

Page 5 of 41

Protein Science

weaker than the other and given what we know about PIN domain topology today, it appears most likely that the protein is a dimer in vivo similar to the bacterial PIN toxin domains. No divalent metal ions were found in the active site, but based on homology with T4 RNase H it was concluded that the active site configuration would be compatible with either one or two bound ions.14 Sulfolobus solfactarius SS01118 is an archaeal PIN domain protein with unknown function whose structure revealed a β-sheet very similar to PAE2754 (PDB 2MDT). This protein is a monomer and is missing several of the conserved active site residues which led the authors to conclude that it likely is not active as a nuclease.15 Another structure of an archaeal PIN domain, Archaeoglobus fulgidus AF0591 (PDB 1O4W), shows a similar core fold and active site architecture as PAE2754 but has an unusually long, C-terminal β-strand extension. This extension interacts with the corresponding part of another molecule to form an elongated dimer, which was suggested to be a crystallisation artefact.16 More canonical is the structure of the protein PH0500 form Pyrococcus horikoshii (PDB 1YE5), which has a sequence well conserved in the Pyrococcus genus but with low overall PIN domain homology.17 Both in solution and in the crystal structure, the protein forms a dimer, which is very similar to the tightest dimer in the PAE2754 structure and thus the higher order structures later found among bacterial type II toxin-antitoxin PIN domains. The structure of the PAE0151 from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum (PDB 2FE1) was the first archaeal PIN domain structure determined with a divalent metal ion in the active site, however, the Mn2+ is located more peripherally than expected from T4 RNase H and is not fully occupied [Fig. 2(D)].10 The structure has a single copy of the PIN domain in the crystallographic asymmetric unit (ASU), but forms a homodimer by crystallographic symmetry [Fig. 2(E)]. Intriguingly, the authors demonstrate that this PIN domain can form a tight protein-protein complex with the product of the adjacent open reading frame (PAE0152), suggesting the locus constitutes a bona fide toxin-antitoxin system similar to those found in bacteria. In vitro, this "toxin-antitoxin" complex behaves as a heterotetramer with two copies of each protein, which the authors proposed constitutes a VapBC-like toxin-antitoxin system.10 Indeed, this architecture has later been found to be common among bacterial toxin-antitoxin systems.18 BACTERIAL TOXIN-ANTITOXIN SYSTEMS

5 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Page 6 of 41

The vast majority of bacterial PIN domain proteins represent the “toxin” component of type II toxin-antitoxin (TA) systems. These are particularly common among pathogenic bacteria such as Mycobacterium tuberculosis that contains a large number of PIN domains, which we now know are all associated with TA systems.11,19 TA systems play important roles for many prokaryotic organisms in survival during stress, such as starvation and antibiotic pressure.20 According to one model, stress leads to increased levels of the cellular alarmone, (p)ppGpp, which then in turn activates specific proteases that degrade the antitoxin molecules.24 In an alternative model, it the signal is rather the lowered cellular ATP concentration that results from amino acid depletion or stress.25 In either case, activation of the toxin PIN domain endoribonuclease causes a general halt in growth leading to a dormant or persistent state, which appears to be highly relevant for understanding bacterial pathogenesis and behaviour during chronic and recurring infections.21,22 In the most well-characterised and abundant class, the type II TA systems, both toxin and antitoxin are proteins that form a tight complex in which the activity of the toxin is inhibited during normal growth. At the genomic level, type II TA genes are arranged in bicistronic loci with the antitoxin preceding the toxin, ensuring efficient inhibition of the toxin through translational coupling [Fig. 3(A)]. Expression from TA loci is further regulated at the transcriptional level through direct binding of the TA complex to pseudo-palindromic sequences in the TA promoter region .26 So far, more than 600 TA type II loci have been identified which have been divided into 10 families with Virulence associated proteins B and C (VapBC) and the "delayed relaxed

phenotype"-associated

RelBE

representing

the

two

largest

(constituting

approximately 40% and 25% of the loci, respectively).19,27 Of these, only the VapC toxins adopt a PIN domain fold3,14 while RelE, although still an endonuclease, is structurally related to bacterial RNase T1.28 VapC toxins are widely distributed in the bacterial domain, but especially common in pathogenic bacteria. For instance, the Mycobacterium tuberculosis genome encodes no less than 48 VapC toxins, which represents the largest collection of PIN domain toxins found so far in any one organism. The reason for this prevalence of similar and orthologous protein domains is still not fully understood, but it is believed to be part of a multi-faceted and fine-grained defence mechanism for pathogenic bacteria that ensures a relatively high and consistent rate of persister cell formation.29 Intriguingly, individual VapC toxins appear to harbour RNase activity against specific stable RNA species, including unique tRNA and rRNA species.30-33 For example, the VapC expressed by Shigella flexneri virulence plasmid pMYSH6000 was shown to specifically target initiator tRNAfMet in enteric 6 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 7 of 41

Protein Science

bacteria and result in a single cleavage event in the anticodon stem loop.33 Likewise, Mycobacterium tuberculosis VapC20 and VapC26 have specificity for the sarcin-ricin loop (SRL) of 23S ribosomal RNA, a classical hot-spot for ribosome inactivation.32,34 Overall, specific RNA targets have been identified for 13 of the 48 VapC toxins in Mycobacterium tuberculosis, comprising 11 tRNA-cleaving and two SRL-cleaving toxin.31 Intriguingly, a recent study found that transcription from some VapC genes is significantly up-regulated when M. tuberculosis forms persister cells, which is compatible with a direct causal relationship between induction of TA systems and persistence.35 Bacterial PIN domain toxins are a little longer than the minimal PIN domain, approximately 140 amino acid residues, and are generally found as stable dimers in vitro. The active site is characterised by four canonical acidic residues (D, E, D, D/N) that (as seen in archaea) are clustered in three dimensions to form an active site able to cleave RNA in a sequence and/or structure-specific and divalent cation-dependent manner.36 When bound to their cognate VapB antitoxins, the VapC PIN domain active sites are often inactivated by insertion of a positively charged arginine residue from the antitoxin that displaces the divalent metal ions and thus renders the PIN domain inactive.36-38 Moreover, the antitoxins contain a DNA-binding motif required for interaction of the TA complex with the promoter and transcriptional auto-regulation of cellular TA levels. Genomic co-localisation of PIN domains with short open reading frames containing DNA-binding motifs can thus be used to locate TA systems in genome-wide searches.19 The first bacterial PIN domain structure was that of N. gonorrhoeae FitAB (Fast intracellular trafficking) bound to its DNA operator sequence (PDB 2H1C and 2H1O).38 FitAB is a VapBC type TA pair that has been implicated in the known persistent state of gonorrhoeae in humans that causes the disease to spread between symptom-free individuals. FitB is highly homologous to VapC toxins and contains a classical PIN domain fold consisting of a fivestranded parallel β-sheet surrounded by four α-helices on one side and three on the other. Structural comparison with free PIN domains showed for the first time that the dimer is relatively unperturbed by binding of the antitoxin, which merely wraps the PIN domain, covering its active site [Fig. 3(B)].38 In this state, the PIN domain shows no nuclease activity. Within the DNA-bound complex, FitA and FitB form a heterodimer that organises in a tetramer with a circular architecture through dimerisation of both the FitB PIN domains on

7 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

one hand and the FitA DNA-binding domains on the other [Fig. 3(B)]. Such higher order structures are a hallmark of bacterial TA systems in their inhibited state.18 Importantly, the dimer interface between the PIN domains observed in the FitAB heterooctamer is identical to the one observed for the isolated, archaeal PIN domains and involve interactions between α3 from one PIN domain and α5 of the other [Fig. 3(B)].38 As mentioned above, sequence conservation between PIN domains is extremely weak and in fact only the four acidic active site residues are conserved in FitB while the remaining sequence is highly variant. PIN domain toxins have an exposed hydrophobic groove near helices α1, α3, and α4 that is used for anchoring the antitoxin in the inhibited state [Fig. 3(C)]. The antitoxin also contributes a helix (α3AT), which interacts using a "leucine zipper"-like motif with hydrophobic residues on the PIN domain surface.39 In addition to compensating exposed hydrophobic residues, the resulting protein complex is more globular than the isolated PIN domain toxin, increasing the stability of the interaction.38 In the FitAB TA complex, the antitoxin also physically blocks the negatively charged PIN domain active site by placing a positive arginine residue (Arg 68) inside in [Fig. 3(D)]. This likely also displaces any bound divalent metal ions required for catalysis.37 In accordance with this, two solvent molecules were observed in the active site of a trypsinised FitAB complex in which both the antitoxin DNA-binding domain and C-terminus are missing.38 Despite having a complete active site, no RNase activity has been demonstrated for N. gonorrhoeae FitB. Structures of VapC PIN domain toxins from Mycobacteria and other pathogens. A recent, systematic search for RNA targets of VapC toxins in M. tuberculosis revealed that many bacterial PIN domain toxins, at least in pathogenic bacteria, have unique RNA targets that include specific tRNA species as well as ribosomal RNA.31 The first structure of a mycobacterial VapC toxin was that of M. tuberculosis VapC-5, a 150-residue PIN domain in complex with a long 33-residue α-helical fragment of the VapB-5 antitoxin (PDB 3DBO) [Fig. 4(A)].40 The core VapC-5 PIN domain consists of four parallel β strands surrounded by five α helices and forms a canonical homodimer as observed before. The protein also features a helical extension referred to as the "clip" or "lid", which is involved in binding the antitoxin. VapC-5 is structurally similar to both FitB and the archaeal PAE2754 with the exception of α2, which is shifted about 10 Å and broken in two.40 Likewise, the interactions with the antitoxin differs among these homologous PIN domain toxins. Of particular interest is the observation that the structural equivalent of the positively charged arginine used by the

8 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 8 of 41

Page 9 of 41

Protein Science

antitoxin FitA to inhibit FitB originates from VapC-5 itself in this case [Fig. 4(A)]. Based on this, the authors propose a possible mechanism of inhibition that involves the antitoxin acting in an indirect manner by reorganising the toxin to inhibit itself using the arginine residue. No metal ions were found in the structure of VapC-5, nevertheless a weak RNase activity was demonstrated in vitro for the intact TA complex, which is somewhat surprising. Four additional structures of PIN domain toxins from M. tuberculosis have been determined since then, VapC-3 (PDB 3H87),41 VapC-15 (PDB 4CHG),42 VapC30 (PDB 4XGQ),43 and VapC21 (PDB 5SV2),44 all in complex with their corresponding antitoxins, except VapC21, which was determined in isolation. Of these, the VapC-3 (Rv0301) structure is particularly interesting because of the differences observed among the VapC active sites in the crystal. One active site (PIN A) features an electrostatic interaction with Arg73 of the antitoxin (similar to the FitAB interaction) while the other (PIN B) is bound to a single Mg2+ ion [Fig. 4(B)].41 These variations appear to be the result of two different antitoxin conformations ("open" and "closed") that may hint at important and biologically relevant structural changes taking place during toxin activation. Despite the differences in their active sites, both toxins are found as canonical two-fold symmetrical homodimers as observed previously. Inside the active site of the Mg2+-bound toxin, the metal ion is coordinated by three conserved residues (D99, D117, and D119) of the canonical DEDD motif.41 Although the structure thus may suggest that the VapC endonuclease only requires a single metal ion for catalysis it is also compatible with a two-metal ion model. The position of the metal ion is not far from that observed in the archaeal PAE0151 PIN domain, although the latter is more peripheral and only coordinated by two residues, D100 and D118, which correspond to D99 and D117 in VapC-3. As for other VapBC complexes (and FitAB), the likely biological unit is a heterooctamer consisting of two T2A2 heterotetramers, which was also confirmed by biochemical analysis of the complex in this case.41 A specific target for M. tuberculosis VapC-3 has not been identified yet but the nuclease has been reported to cleave singlestranded MS2 RNA in vitro.45 The structure of M. tuberculosis VapBC-15 (genes Rv-2010 and Rv-2009) was the first to reveal stoichiometric heterogeneity in the toxin-antitoxin complex formation as this structure contains both a toxin-antitoxin heterotetramer (T2A2) as well as two heterotrimers (T2A) in the crystallographic ASU.42 Furthermore, this structure showed for the first time two metal ions in the toxin active site, which were identified using x-ray anomalous dispersion as one Mg2+ and one Mn2+ ion [Fig. 4(C)]. The preference for binding two different metal ions may 9 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Page 10 of 41

thus explain why only one ion is observed in some cases as both ions are not always present in the crystallisation buffer. Originally, VapC-15 was shown to degrade E. coli total RNA42 but a more recent study concluded that the specific target in Mycobacteria is in fact tRNALeu3(CAG), one of several leucine-specific tRNAs.31 Curiously, in this case, a glutamate residue from the antitoxin participates in binding the metal ions in the TA complex suggesting that the antitoxin may not have to completely dissociate from the toxin to allow metal ion binding. Alternatively, this glutamate residue could mimics an interaction made with RNA during catalysis [Fig. 4(C)].42 Curiously, the glutamate residue occupies the exact equivalent position of the arginine that otherwise would cause removal of the metal ions (see above). Another interesting conclusion drawn from the VapBC-15 structure is that one VapB antitoxin was observed to inhibit two toxins simultaneously in a 2:1 molar interaction where the C-terminus of one antitoxin molecule spans both toxins of a homodimer [2:1 inhibition, Fig. 4(C)]. This type of interaction has been observed several times later in the structures of R. felis VapBC2 bound to DNA46 and in the C. crescentus VapBC1 complex39 and thus appears to be a common theme in TA interaction. For VapBC-15, this interaction places a glutamate in the first toxin active site (compatible with Mg2+/Mn2+ binding) and an arginine in the second active site (with no bound ions).42 The full implications of this charge reversal for toxin activation in vivo are not yet understood. M. tuberculosis VapC30 deviates from the canonical PIN domain fold in that its C-terminal segment (residues 125-131) does not form a helix (α7) like in most other structures.43 This part of the structure contains one of the key conserved active site residues (Asp119), which is, however, still in place as the segment covers one side of the active site [Fig. 4(D)]. A single Mg2+ ion was found in three of the four active site sites in a position equivalent to what had previously been observed and, moreover, it was found that both Mg2+ and Mn2+ are required for activity, supporting the idea that one of each ion might bind in vivo. M. tuberculosis VapC30 was originally shown to cleave tRNAfMet in a Mg2+-dependent manner43 but a more recent analysis finds that two serine isoacceptor tRNAs (tRNASer,

TGA

and

tRNASer, CGA) most likely constitute the real biological targets.31 VapC30 interacts in a unique way with its cognate antitoxin, VapB30, in that the antitoxin does not inhibit the active site of the toxin molecule it is associated with (its cognate toxin), rather the more distant PIN domain of the toxin dimer [distal inhibition, Fig. 4(D)]. For the Shigella flexneri virulence plasmid pMYSH6000 VapBC complex, the PIN domain has been shown to specifically target tRNAfMet in enteric bacteria by cleaving its anticodon loop.33 This toxin also has a variant 10 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 11 of 41

Protein Science

active site consisting of the residues D-E-D-N as opposed to the canonical D-E-D-D.36 The active site of VapC from S. flexneri was shown to adopt two conformational states; the "closed" and "open" conformations. In the open conformation, no metal ion is found in the active site and Asp98 (DEDN) was found to be shifted suggesting that this conformation represents the post-cleavage state and that Asp98 could play a role in the release of the product.36 The recent structure of M. tuberculosis VapC21 was the first structure of a Mycobacterial VapC toxin determined in absence of its cognate antitoxin.44 This structure confirmed what had been observed for the S. flexneri VapC and other isolated VapC structures, namely that the toxins exist as dimers in isolation with a structure that is nearly identical to the one found in complex with the antitoxin.36 In both these cases, the toxin was expressed in an inactive form by mutation of one of the key active residues to alanine to avoid toxicity in the expression host. However, this modification also appears to disturb the metal ion affinity of the PIN domain and thus none of these isolated structures have been found with ions in the active site, despite the crystals being soaked in high concentrations of divalent metal ions in both cases.36,44 Finally, it is also possible that the antitoxin or RNA is required to stabilise the ion(s). Two bacterial VapBC structures have been determined in complex with operator DNA, namely R. felis VapBC2 (PDB 3ZVK)46 and Caulobacter crescentus VapBC1 (PDB 5K8J), of which the latter is the first structure determined both on and off DNA.39 These structures are all heterooctameric and highlight the structural variability and flexibility observed among various TA pairs as well as between their functional states. The R. felis VapBC2 structure is the most compact of the three, yet able to bind DNA in two adjacent major grooves.46 In C. crescentus VapBC1, the antitoxins inhibit the toxins in the 2:1 molar ratio like observed for VapBC-15 but surprisingly, it was found that the involved antitoxin termini swap positions upon binding DNA as a result of an overall stretching of the complex.39 Moreover, it was demonstrated that the sequence motif interacting with the toxin(s) in the 2:1 mode in many cases is (pseudo-) palindromic, which allows similar interactions across the two-fold symmetric toxin homodimer. Curiously, the most common palindromic motif is R-E...E-R, thus including both a residue that would promote metal ion binding in the PIN domain (E) as well as one that would prevent it (R). This intriguing discovery might suggest that the antitoxins in fact can function as molecular switches that, depending on the position of the Cterminal extension in the toxin active site, would either activate or deactivate the PIN domain

11 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

nuclease.39 As mentioned above, this also opens up the controversial possibility that the antitoxin doesn't have to be fully removed for the toxin to become activated. PIN DOMAINS IN EUKARYOTES Eukaryotic PIN domains were first found to be associated with RNA surveillance mechanisms such as nonsense-mediated decay (NMD) and RNA interference (RNAi), which share a need for both ATP-driven helicases as well as various nucleases to degrade RNA species targeted for destruction.3 However, PIN domains have been associated with a much wider range of processes in eukaryotic cells that relate to nucleic acids, including telomere maintenance,47,48 mRNA turnover,49-55 rRNA maturation,56-59 and tRNA processing.60 The SMG genes, which are common to both the NMD and RNAi pathways can be found across higher eukaryotes from C. elegans to Drosophila and therefore likely represent an ancient RNA processing machinery. Early analysis of the sequences of the proteins SMG5 and SMG6 in Drosophila initially suggested that these two proteins may contain PIN domains.3 Another key protein in eukaryotic RNA decay, DIS3 (yeast Rrp44p), forms is a central active subunit of the eukaryotic RNA exosome and harbours both 3'-5' and endonucleolytic activities, of which a PIN domain is responsible for endonucleolysis.52 This highlights a common feature of eukaryotic PIN domains namely that they often form part of a larger proteins with multiple functions and interactions. Several structures of eukaryotic PIN domain-containing proteins are available to date, including the S. cerevisiae exosome component Dis3p (Rrp44p, PDB 2WP8),61 human SMG5 (hEST1B, PDB 2HWY) and SMG6 (hEST1A, PDB 2HWW and 2DOK),49,62 S. pombe Chp1-Tas3 (PDB 3TIX),63 human MCPIP1 (PDB 3V32),64 A. thaliana RNase P1 (PDB 4G23),65 S. cerevisiae Utp23p (PDB 4MJ7),66 and mouse Regnase-1 (PDB 5H9V).67 Despite displaying much more structural variability than bacterial PIN domains, the core domain folds can be superimposed revealing that the active sites contain the same D-E-D-D tetrad of acidic residues as in bacteria and archaea. Functionally, however, the eukaryotic PIN domains vary quite widely and while some proteins display a high level of specificity towards a single RNA target both in vivo and in vitro, others are sequence-independent such as Drosophila SMG6, which should potentially be able to cleave all mRNAs harbouring a premature stop codon. This observation highlights the fundamental outstanding question namely how target specificity, or lack thereof, arises in the PIN domain.

12 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 12 of 41

Page 13 of 41

Protein Science

Eukaryotic PIN domains that target a wide range of RNA PIN domains associated with mRNA decay. The first structures of eukaryotic PIN domains were those from human SMG5 (hEST1B) and SMG6 (hEST1A), which are required to remove mRNA transcripts containing premature termination codons (PTCs) during NMD [Fig. 5(A), 5(B)].49,62 The NMD machinery is conserved from yeast to humans, but diverges and is more complex in metazoans.68 SMG2 (yeast Upf1p) is a key factor responsible for recognition of incorrect translation termination events and for assembling a core complex together with SMG3 and SMG4 (yeast Upf2p and Upf3p).69 Following recognition, SMG2 becomes phosphorylated causing recruitment of degradation enzymes, including SMG5 and SMG6, both of which contain PIN domains in their C termini. Both proteins are around 1000 residues and are thus much larger than the typical bacterial and archaeal PIN domain proteins, nevertheless, they both have classical PIN domain folds consisting of a central fivestranded parallel β sheet flanked by seven α-helices. Due to their modular nature, most eukaryotic PIN domains are found as monomers, both in solution and in the crystal.49,62 This difference appears to be not only a result of the overall context of the PIN domains but also due to lack of one of the α-helices centrally involved in dimerisation (α4) in the archaeal/bacterial proteins.49 Furthermore, two of the β strands in SMG6 (β1 and β5) form an extended structure, which gives this human PIN domain an overall elliptical shape and has been suggested to help orient the domain relative to the rest of the protein [Fig. 5(B)].49 Not surprisingly, SMG5 and SMG6 are more similar to each other than to the archaeal/bacterial/phage proteins and SMG5 lacks all but a single aspartic acid residue in the active site and consequently is inactive as an nuclease in vitro [Fig. 5(C)]. SMG6, on the other hand, contains a classical PIN domain active site with three conserved aspartic acid residues that structurally superimposes well with those of T4 RNase H and the archaeal PIN domain proteins AF0591 and PAE2754 [Fig. 5(D)].14,16 Consistent with its more complete active site, SMG6 was found to display robust ribonuclease activity towards single-stranded poly-U RNA in the presence of Mn2+ in vitro, while no activity was observed in the presence of Mg2+ alone.49,62 Most importantly, however, no differences in degradation pattern were observed based on whether the radioactive label was placed in the 5' or 3' terminus of the RNA strand, demonstrating for the first time that PIN domains are active as endonucleases. In accordance with this, expression of SMG6 in Drosophila lowers mRNA half-lives in vivo significantly, an effect that is not observed with SMG5.69 Finally, activity

13 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

was only observed for single-stranded RNA demonstrating that PIN domains are functionally distinct from RNase H, which is specific for DNA-RNA hetero-duplexes.70 PIN domains associated with gene silencing. The chromodomain protein Chp1 belongs to the RNA-induced initiation of transcriptional gene silencing (RITS) complex, which is involved in physically tethering small non-coding RNAs to chromatin in fission yeast (S. pombe). Chp1 contains a PIN domain in its C-terminus, deletion of which leads to increased levels of transcription without altering RNA polymerase II occupancy on chromatin, suggesting that the Chp1 PIN domain is involved in post-transcriptional processing of subtelomeric transcripts.63 The crystal structure of the C-terminal half of Chp1 including of the PIN domain and two other domains (SPOC and Domain II) was determined in complex with the N-terminal domain of Tas3, another multi-domain protein in the RITS complex (Figure 6A). The Chp1 PIN domain sits at one end of a very elongated protein complex and only makes tenuous interactions with the preceding domain in Chp1, Domain II, which displays a Rossmann-like fold.63 The fold of the Chp1 PIN domain is more similar to human SMG6/hEST1A than its bacterial/archaeal counterparts, suggesting that eukaryotic PIN domains may constitute a structurally unique group separate from the prokaryotic domains. It is also noteworthy that both proteins, Chp1 and SMG6/hEST1A, have been implicated in telomere maintenance in fission yeast and humans, respectively, suggesting that PIN domains have other roles than RNA cleavage in eukaryotes.49 When structurally aligned with canonical, bacterial PIN domains, the Chp1 domain active site appears to consist of a single aspartate and a glutamate residue and thus lacking two of the conserved aspartate residues, consistent with the observed lack of endoribonuclease activity in vitro [Fig. 6(B)].63 Together, this suggests that the Chp1 PIN domain could have another, more architectural role, such as mediating protein-protein interactions. PIN domains associated with RNA degradation. In some instances, eukaryotic PIN domains have been shown to have a role in recruitment of protein partners to larger complexes, rather than function as nucleases. One example is DIS3 (yeast Rrp44p), which is a central active component of the eukaryotic exosome [Fig. 6(C)].52 Deletion of the Rrp44p PIN domain is lethal in the yeast S. cerevisiae but mutations which only abolish the PIN domain activity do not have the same effect suggesting that the Rrp44p PIN domain is required for formation of the exosome complex.52 It has furthermore been shown that Rrp44p/DIS3 is essential for the structural integrity of the core exosome by being involved in

14 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 14 of 41

Page 15 of 41

Protein Science

protein-protein interactions with Rrp41p, Rrp43p, and Rrp45p.71 Intriguingly, Rrp44p has an intact PIN domain active site including all of the conserved, acidic residues (D91, E120, D171, and D198) and the protein is active as a 3’-5’ exoribonuclease both in vitro and in vivo [Fig. 6(C), 6(D)].52,53 In another study, the Rrp44p PIN domain was shown to be able cleave tRNA precursors in vivo, and mutation in the domain causes accumulation of a significant amounts of tRNA precursors in the cell, suggesting a specific function for Rrp44p within the scope of the more general RNA degrading role of the exosome.60 The observed specificity for tRNA is thus intriguingly reminiscent of the bacterial VapC toxins and might point to an ancestral function of Rrp44p as a combined 3'-5' exonuclease and tRNA endonuclease, with the latter function similar to what is found among bacterial VapC toxin PIN domains. Swt1 is another PIN domain protein associated with mRNA metabolism. No structure of Swt1 is available but the protein has been genetically linked to the TRanscription-EXport (TREX) complex72 and the putative PIN domain active site mutation D135N has been shown to cause accumulation of poly-adenylated RNA in the nucleus suggesting that the protein in this case is an active endonuclease and plays a role in RNA surveillance.54 In summary, it thus appears that eukaryotic PIN domains differ from their bacterial counterparts by 1) not always being active, 2) targeting a wide range of RNA substrates, and 3) often forming a part of a larger protein. However, it remains possible that a specific region of these target RNAs adopt a common structure which is recognised by the proteins during cleavage. Structure determination of a eukaryotic PIN domain bound to its target RNA, such as a tRNA-like stem loop,32 is therefore a critical next step to understand how specificity, if it exists, is achieved at the molecular level. Eukaryotic PIN domains targeting specific RNAs PIN domains associated with inflammatory and immune response. Some eukaryotic PIN domains display a high target specificity similarly to what has been observed for the bacterial VapC toxins, such as human protein MCP-1-induced protein 1 (MCPIP1, Zc3h12a), an immune response regulator that is induced upon pathogen invasion through the Toll-like receptor pathway.73 Deletion of MCPIP1 in mice causes specific accumulation of the mRNAs encoding IL-6 (interleukin-6) and IL-12β (interleukin 12β) and the protein was shown to down-regulate the immune response by specifically cleaving the two mRNAs in their 3' UTR.73 Curiously, the protein has also been implicated in micro RNA (miRNA) biogenesis by targeting a terminal loop in some pre-miRNAs.74 MCPIP1 contains a PIN domain in its N-

15 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

terminus (residues 100-300), just upstream of a CCCH zinc-finger domain, which is known to bind directly to the 3' UTR region of targeted transcripts.75 Like other PIN domains, MCPIP1 is more active when incubated with Mn2+ than Mg2+ and a single mutation (D141N) in human MCPIP1 completely abolishes RNase activity in vivo.73 In vitro, the isolated PIN domain cleaves

32

P-labeled IL-6 3'-UTR RNA in the presence of Mg2+, however, both the

zinc-finger domain but in particular the 111 N-terminal residues preceding the PIN domain appear to significantly stimulate RNase activity.64 The crystal structure of the human MCPIP1 PIN domain revealed a classical fold that superimposes well with both human SMG6 and phage T5 D15 5'-exonuclease [Fig. 7(A)].62,76 Curiously, MCPIP1 contains a positively charged β hairpin extension, which is inserted after β3 of the central PIN domain β sheet in a similar position to a positively charged arch in T5 5'-exonuclease known to be important for RNA specificity [Fig. 7(A)].76 Supporting that this region might be involved in targeting the protein to RNA, mutation of one of several basic residues in the hairpin results in a loss of RNase activity in vivo.64 The PIN domain active site was located by determining structures of Mg2+-soaked protein, which has revealed that the residues D141, D225, D226, and D244 are centrally involved and mutation of any of these renders the protein inactive in vivo [Fig. 7(A)].64 Unfortunately, the zinc-finger domain is completely disordered in the MCPIP structure, so not much can be concluded about its possible role in RNA recruitment. However, structures of the mouse homologue of MCPIP1, Regnase-1, the core of the Nterminal domain (residues 45-89), the core PIN domain (134-295), the zinc-finger domain (301-326), and a small domain from the C-terminus (544-596) were each determined independently by x-ray crystallography and NMR [Fig. 7(B)].67 Like its human orthologue, the isolated PIN domain displays a limited RNase activity in vitro, which is stimulated by intramolecular interaction with the N-terminal domain. However, in contrast to human MCPIP1, the mouse PIN domain was found as a dimer in the crystal structure, an interaction which depends on the before-mentioned β hairpin extension.67 This dimer has the same overall dimensions as the ones found for bacteria and archaeal PIN domains but with a different packing [Fig. 7(C)]. Together these data suggest that in this case, the PIN domain dimer is activated by the N-terminal domain while the zinc-finger domain increases RNA affinity. PIN domains associated with ribosomal biogenesis. Maturation of eukaryotic ribosomal RNA from the initial transcripts is an intricate process that requires the precise action of several types of nucleases.77 One such protein is Nob1p, a PIN domain endonuclease 16 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 16 of 41

Page 17 of 41

Protein Science

responsible for cleaving the D-loop of 20S pre-rRNA and generating the mature 3'-end of 18S rRNA.57,58 The Nob1p PIN domain is active as an endonuclease both in vivo and in vitro, and mutation of one of the conserved aspartate residues within its PIN domain abolishes RNase activity in vivo.56,59 No structure of the eukaryotic Nob1 is available, however, the ribosomal RNA maturation machinery is highly conserved, and homologous proteins can be identified in fungi, plants, protists, and many archaea.59 Nob1p from the thermophilic archaeon Pyrococcus horikoshii (PhNob1) is a minimalist ribosome maturation enzyme consisting of only the PIN domain (1-117) and a zinc ribbon domain (129-159) and shares 67% sequence similarity with the eukaryotic proteins in these regions. The structure of PhNob1 revealed a classical PIN domain fold consisting of a parallel five-stranded β-sheet, six α-helices and three aspartic acid residues in the active site, D12, D82, and D100 [Fig. 7(D)].59 In vitro, PhNob1 is able to specifically cleave pre-rRNA mini-substrates of 50-70 nucleotides that contain the natural cleavage context in an exposed loop structure in a Mn2+dependent manner. Like other eukaryotic PIN domains, no activity has been observed in the presence of Mg2+. Mapping of the RNA-interacting residues in PhNob1 using NMR revealed that all contacts to RNA are located close to the active site of the PIN domain [Fig. 7(D)].59 The zinc ribbon domain did not interact with RNA in this case but was shown to be required for the initial targeting of the nuclease domain to helix 40 of the small subunit pre-rRNA. This helix is placed at some distance from the cleavage site consistent with the location of the zinc ribbon domain at the end of a long linker in the crystal structure.59 Thus, in this case, the observed specificity of the Nob1 PIN domain towards a single site in pre-rRNA is not the result of the affinity of the PIN domain alone, but is dependent on another domain conferring additional RNA affinity. Ribosomal biogenesis in the yeast S. cerevisiae is known to depend on two other PIN domain endonucleases, Utp23p and Utp24p.77 Both proteins form part of the nucleolar 90S preribosome (also called the small subunit processome), however, while Utp24p is required for several pre-rRNA cleavage events, the PIN domain of Utp23p is degenerate and inactive as a endonuclease and this protein is therefore thought to be important for release of the snR30 snoRNA during 90S pre-ribosomal maturation.78,79 Utp23p is a 254-residue protein consisting of an N-terminal PIN domain (residues 1-159) followed by a region of low-complexity, which is likely structurally disordered.66 The structure of Utp23p revealed a PIN domain fold consisting of six β strands, which are all parallel except the last (β6) [Fig. 7(E)]. The Utp23p PIN domain contains eight α helices, comprising the six canonical helices plus one additional 17 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Page 18 of 41

helix in either end of the sequence. Of these, the first additional helix (α1) contains several conserved, basic residues suggesting an ability to interact with RNA, although this has not been shown experimentally. Only two of the conserved acidic residues in the active are conserved (Asp31 and Asp123) but surprisingly, these appear not to be required for the function of Utp23p in vivo.66 Most remarkably, however, Utp23p contains a unique zincfinger domain formed by three Cys and one His residue originating from both helices and loops (i.e. α2-α7 and β1-β5) of the core PIN domain structure.66 All three Cys residues are absolutely required for pre-ribosomal processing and thus, yeast growth, in vivo. Together, these data suggest that the nuclease activity of Utp23p is not required and has degenerated while another function as an RNA interaction protein involving the zinc-finger and the additional helices (α1 and α8), has taken over. This unique combination of a PIN domain and a zinc-finger motif is conserved in Utp24p, which is even more intriguing as this protein has a complete active site. CONSERVATION OF THE PIN DOMAIN ACROSS ALL DOMAINS OF LIFE The

core

PIN

domain

motif

consists

of

alternating

β-strands

and

α-helices

(β1α1β2α2β3α3β4α4β5) that fold into a compact structure consisting of a central five-stranded, parallel β-sheet with two helices on either side. This topology is thus similar to the classical Rossmann fold (β1α1β2α2β3), which is well known to bind nucleotides and nucleotide-based co-factors in a range of enzymes.80 Studies of protein evolution have further identified the PIN domain as a member of the HUP (HIGH-signature proteins, UspA, and PP-ATPase) superfamily, a sub-class of Rossman-like folds that has at least two helices on either side of a five-stranded, parallel β-sheet [Fig. 8(A), left].81,82 Most PIN domains contain additional helices outside the core HUP motif and a common topology is (β1α1'α1β2α2'α2β3α3'α3β4α4β5), i.e. with one additional helix inserted immediately after each of the first four β strands [α1', α2', and α3', Fig. 1(C), 8(A), right). In the FLAP endonucleases and PIN domain toxins, these first two helical insertions form a lid (or "clip"), which is thought to control interactions with RNA substrates as well as access to the active site [Fig. 8(A), right].81 The conserved, acidic residues that make up the active site cavity are all located at the C-terminal end of the βstrands, or early in the helices immediately following the strands. The first conserved, acidic residue, which is invariably an Asp [Asp4/Asp5, yellow in Fig. 8(B), 8(C)], occurs at the end of β1 before the first helix. This residue is critical for nuclease activity of the PIN domain as it coordinates one of the metal ions in the active site thought to be directly involved in catalysis

18 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 19 of 41

Protein Science

[Fig. 8(B)].81 This is followed by another important residue, which always occurs after one turn of the first helix (α1) and often is an Asn [Asn7, Fig. 8(C), pink]. However, in many PIN domains, particularly the bacterial toxins, this residue is a Ser instead [Ser6, Fig. 8(B)]. In either case, this residue is thought to be involved in controlling access to the active site by structural modulation of lid region. In the FLAP endonuclease family and some PIN-domain like proteins, this residue is located two residues after the initial Asp, while in the canonical PIN domains, it is located only one residue later. This subtle distinction has been used to classify the former as a unique group, called the N4BP1, YacP-like Nuclease (NYN) domain.81 One example of a NYN domain is VPA0982 from Vibrio parahaemolyticus (PDB 2QIP, Tan et al., Midwest Center for Structural Genomics, unpublished). The third residue of the active site (2nd acidic residue) occurs in the helix (α2) following β2. It is always a Glu [Glu42, Fig. 8(B), 8(C), light green] and is structurally the most distant of the core acidic residues. Its side chain is held in place by the residue located between the first two active site residues, which is often a Thr [Thr5/Thr6, Fig. 8(B), 8(C), light blue]. The fourth residue (3rd acidic residue) is always an Asp [Asp96/Asp104, Fig. 8(B), 8(C), dark green] and it is located on the other side of the active site at the end of α3. This Asp appears to have an important role in bridging the two metal ions but it varies between structures which ion is most closely coordinated.41,46 Finally, the active site contains one or sometimes two additional aspartic acid residues, which both occur in the loop following β4 and in the beginning following helix [Asp114/Asp122, Fig. 8(B), 8(C), orange and Asp116/Ser125, red]. Together, these form one side of the active site, opposite from the glutamate (2nd acidic residue) and are closely involved in coordination of the second metal ion. Another potentially important residue in this region is located two residues before the first of the two Asp, and is often either a serine or threonine [Thr120, Fig. 8C, cyan]. This residue has been described as being involved in organising the acidic residues of the active site through hydrogen bond interactions while for T4 RNase H, it was proposed to act as a proton donor during catalysis.6 In the first archaeal PIN domain structure, this residue (a threonine) was also described as a candidate for catalysis but initially considered too buried to allow direct involvement.14 Nevertheless, the idea of a catalytically important residue is consistent with the presence of a histidine residue at this location in a subgroup of structures, including several VapC PIN domain toxins from M. tuberculosis [His112, Fig. 8(B), cyan].41,42 Additional structural studies of PIN domains in complex with their target RNA molecules as well as careful

19 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

biochemical and mutational studies are currently necessary to complete our understanding of the catalytic mechanism. CONCLUDING REMARKS After years of research, the PIN domain remains as enigmatic as it is plentiful. Found in all domains of life, we observe this compact protein fold as both part-taker in enzymatic reactions of central RNA molecules of the cell and as protein interaction partner in large, multi-domain complexes. Many structures have been determined, both of PIN domains as isolated toxin nucleases and as part of larger, eukaryotic proteins, but surprisingly, no substrate-bound form has yet been characterised. This may be a result of relatively weak interactions between the PIN domain and its target RNA and thus difficulties in formation of a stable protein-RNA complex in vitro. Nevertheless, such a structure now appears more important than ever to allow us to understand fundamental principles of recognition and catalysis. The nature of the PIN domain active site with its conserved, acidic residues, means that it must have very low, if any, intrinsic affinity for negatively charged RNA. Following the same argument, the divalent metal ions known to occupy one or two positions between protein and RNA are thus very likely to be critical, not only for catalysis, but also RNA binding itself, through a charge reversal process in which negatively charged residues in the PIN domain attract positively charged ions, which then in turn are able to attract negatively charged RNA. To stabilise a protein-RNA complex for structural studies, changes must be made, like for any enzyme, either to the active site or the RNA substrate to prevent cleavage and subsequent product dissociation. Mutation of one or more of the acidic residues in the active site to alanine would be the first choice, but such removal of a negatively charged residue in the protein is likely to affect metal ion binding and thus, RNA binding. Therefore, stabilisation of a PIN domain protein-RNA complex must likely be achieved through other means and remains a conundrum that the field must tackle in order to elucidate the underlying principles of recognition and cleavage and finally uncover the secrets of the PIN domain. ACKNOWLEDGEMENTS This work was supported by the Danish National Research Foundation grant DNRF120 (Centre for Bacterial Stress Response and Persistence, BASP).

20 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 20 of 41

Page 21 of 41

Protein Science

REFERENCES 1.

Wall D, Kaiser D (1999) Type IV pili and cell motility. Mol Microbiol 32:1-10.

2.

Makarova KS, Aravind L, Galperin MY, Grishin NV, Tatusov RL, Wolf YI, Koonin EV (1999) Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell. Genome Res 9:608628.

3.

Clissold PM, Ponting CP (2000) PIN domains in nonsense-mediated mRNA decay and RNAi. Curr Biol 10:R888-890.

4.

Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K and others (2010) The Pfam protein families database. Nucleic Acids Res 38:D211-222.

5.

Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631-637.

6.

Mueser TC, Nossal NG, Hyde CC (1996) Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell 85:1101-1112.

7.

Nowotny M, Gaidamakov SA, Crouch RJ, Yang W (2005) Crystal structures of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal-dependent catalysis. Cell 121:1005-1016.

8.

Lyamichev V, Brow MA, Dahlberg JE (1993) Structure-specific endonucleolytic cleavage of nucleic acids by eubacterial DNA polymerases. Science 260:778-783.

9.

Sillitoe I, Lewis TE, Cuff A, Das S, Ashford P, Dawson NL, Furnham N, Laskowski RA, Lee D, Lees JG and others (2015) CATH: comprehensive structural and functional annotations for genome sequences. Nucleic Acids Res 43:D376-381.

10.

Bunker RD, McKenzie JL, Baker EN, Arcus VL (2008) Crystal structure of PAE0151 from Pyrobaculum aerophilum, a PIN-domain (VapC) protein from a toxin-antitoxin operon. Proteins 72:510-518.

11.

Anantharaman V, Aravind L (2003) New connections in the prokaryotic toxinantitoxin network: relationship with the eukaryotic nonsense-mediated RNA decay system. Genome Biol 4:R81.

12.

Cooper CR, Daugherty AJ, Tachdjian S, Blum PH, Kelly RM (2009) Role of vapBC toxin-antitoxin loci in the thermal stress response of Sulfolobus solfataricus. Biochem Soc Trans 37:123-126. 21 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

13.

You XY, Liu C, Wang SY, Jiang CY, Shah SA, Prangishvili D, She Q, Liu SJ, Garrett RA (2011) Genomic analysis of Acidianus hospitalis W1 a host for studying crenarchaeal virus and plasmid life cycles. Extremophiles 15:487-497.

14.

Arcus VL, Backbro K, Roos A, Daniel EL, Baker EN (2004) Distant structural homology leads to the functional characterization of an archaeal PIN domain as an exonuclease. J Biol Chem 279:16471-16478.

15.

Xuan J, Song X, Chen C, Wang J, Feng Y (2013) A PilT N-terminus domain protein SSO1118 from hyperthemophilic archaeon Sulfolobus solfataricus P2. J Biomol NMR 57:363-368.

16.

Levin I, Schwarzenbacher R, Page R, Abdubek P, Ambing E, Biorac T, Brinen LS, Campbell J, Canaves JM, Chiu HJ and others (2004) Crystal structure of a PIN (PilT N-terminus) domain (AF0591) from Archaeoglobus fulgidus at 1.90 A resolution. Proteins 56:404-408.

17.

Jeyakanthan J, Inagaki E, Kuroishi C, Tahirov TH (2005) Structure of PIN-domain protein PH0500 from Pyrococcus horikoshii. Acta Cryst F61:463-468.

18.

Bendtsen KL, Brodersen DE (2017) Higher-order structure in bacterial VapBC toxinantitoxin complexes. Subcell Biochem 83:381-412.

19.

Pandey DP, Gerdes K (2005) Toxin-antitoxin loci are highly abundant in free-living but lost from host-associated prokaryotes. Nucleic Acids Res 33:966-976.

20.

Gerdes K, Maisonneuve E (2012) Bacterial persistence and toxin-antitoxin loci. Annu Rev Microbiol 66:103-123.

21.

Maisonneuve E, Gerdes K (2014) Molecular mechanisms underlying bacterial persisters. Cell 157:539-548.

22.

Page R, Peti W (2016) Toxin-antitoxin systems in bacterial growth arrest and persistence. Nat Chem Biol 12:208-214.

23.

Maisonneuve E, Castro-Camargo M, Gerdes K (2013) (p)ppGpp controls bacterial persistence by stochastic induction of toxin-antitoxin activity. Cell 154:1140-1150.

24.

Winther KS, Gerdes K (2012) Regulation of enteric vapBC transcription: induction by VapC toxin dimer-breaking. Nucleic Acids Res 40:4347-4357.

25.

Shan Y, Brown Gandt A, Rowe SE, Deisinger JP, Conlon BP, Lewis K (2017) ATPdependent persister formation in Escherichia coli. MBio 8.

26.

Gerdes K, Christensen SK, Lobner-Olesen A (2005) Prokaryotic toxin-antitoxin stress response loci. Nat Rev Microbiol 3:371-382.

22 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 22 of 41

Page 23 of 41

Protein Science

27.

Shao Y, Harrison EM, Bi D, Tai C, He X, Ou HY, Rajakumar K, Deng Z (2011) TADB: a web-based resource for Type 2 toxin-antitoxin loci in bacteria and archaea. Nucleic Acids Res 39:D606-611.

28.

Neubauer C, Gao YG, Andersen KR, Dunham CM, Kelley AC, Hentschel J, Gerdes K, Ramakrishnan V, Brodersen DE (2009) The structural basis for mRNA recognition and cleavage by the ribosome-dependent endonuclease RelE. Cell 139:1084-1095.

29.

Fasani RA, Savageau MA (2013) Molecular mechanisms of multiple toxin-antitoxin systems are coordinated to govern the persister phenotype. Proc Natl Acad Sci USA 110:E2528-2537.

30.

Cruz JW, Sharp JD, Hoffer ED, Maehigashi T, Vvedenskaya IO, Konkimalla A, Husson RN, Nickels BE, Dunham CM, Woychik NA (2015) Growth-regulating Mycobacterium tuberculosis VapC-mt4 toxin is an isoacceptor-specific tRNase. Nat Commun 6:7480.

31.

Winther K, Tree JJ, Tollervey D, Gerdes K (2016) VapCs of Mycobacterium tuberculosis cleave RNAs essential for translation. Nucleic Acids Res 44:9860-9871.

32.

Winther KS, Brodersen DE, Brown AK, Gerdes K (2013) VapC20 of Mycobacterium tuberculosis cleaves the sarcin-ricin loop of 23S rRNA. Nat Commun 4:2796.

33.

Winther KS, Gerdes K (2011) Enteric virulence associated protein VapC inhibits translation by cleavage of initiator tRNA. Proc Natl Acad Sci USA 108:7403-7407.

34.

Moazed D, Robertson JM, Noller HF (1988) Interaction of elongation factors EF-G and EF-Tu with a conserved loop in 23S RNA. Nature 334:362-364.

35.

Ignatov DV, Salina EG, Fursov MV, Skvortsov TA, Azhikina TL, Kaprelyants AS (2015) Dormant non-culturable Mycobacterium tuberculosis retains stable lowabundant mRNA. BMC Genomics 16:954.

36.

Xu K, Dedic E, Brodersen DE (2016) Structural analysis of the active site architecture of the VapC toxin from Shigella flexneri. Proteins 84:892-899.

37.

Dienemann C, Boggild A, Winther KS, Gerdes K, Brodersen DE (2011) Crystal structure of the VapBC toxin-antitoxin complex from Shigella flexneri reveals a hetero-octameric DNA-binding assembly. J Mol Biol 414:713-722.

38.

Mattison K, Wilbur JS, So M, Brennan RG (2006) Structure of FitAB from Neisseria gonorrhoeae bound to DNA reveals a tetramer of toxin-antitoxin heterodimers containing pin domains and ribbon-helix-helix motifs. J Biol Chem 281:37942-37951.

39.

Bendtsen KL, Xu K, Luckmann M, Winther KS, Shah SA, Pedersen CNS, Brodersen DE (2016) Toxin inhibition in C. crescentus VapBC1 is mediated by a flexible 23 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

pseudo- palindromic protein motif and modulated by DNA binding. Nucleic Acids Res, in press. 40.

Miallau L, Faller M, Chiang J, Arbing M, Guo F, Cascio D, Eisenberg D (2009) Structure and proposed activity of a member of the VapBC family of toxin-antitoxin systems. VapBC-5 from Mycobacterium tuberculosis. J Biol Chem 284:276-283.

41.

Min AB, Miallau L, Sawaya MR, Habel J, Cascio D, Eisenberg D (2012) The crystal structure of the Rv0301-Rv0300 VapBC-3 toxin-antitoxin complex from M. tuberculosis reveals a Mg(2)(+) ion in the active site and a putative RNA-binding site. Protein Sci 21:1754-1767.

42.

Das U, Pogenberg V, Subhramanyam UK, Wilmanns M, Gourinath S, Srinivasan A (2014) Crystal structure of the VapBC-15 complex from Mycobacterium tuberculosis reveals a two-metal ion dependent PIN-domain ribonuclease and a variable mode of toxin-antitoxin assembly. J Struct Biol 188:249-258.

43.

Lee IG, Lee SJ, Chae S, Lee KY, Kim JH, Lee BJ (2015) Structural and functional studies of the Mycobacterium tuberculosis VapBC30 toxin-antitoxin system: implications for the design of novel antimicrobial peptides. Nucleic Acids Res 43:7624-7637.

44.

Jardim P, Santos IC, Barbosa JA, de Freitas SM, Valadares NF (2016) Crystal structure of VapC21 from Mycobacterium tuberculosis at 1.31 A resolution. Biochem Biophys Res Commun 478:1370-1375.

45.

Ramage HR, Connolly LE, Cox JS (2009) Comprehensive functional analysis of Mycobacterium tuberculosis toxin-antitoxin systems: implications for pathogenesis, stress responses, and evolution. PLoS Genet 5:e1000767.

46.

Mate MJ, Vincentelli R, Foos N, Raoult D, Cambillau C, Ortiz-Lombardia M (2012) Crystal structure of the DNA-bound VapBC2 antitoxin/toxin pair from Rickettsia felis. Nucleic Acids Res 40:3245-3258.

47.

Reichenbach P, Hoss M, Azzalin CM, Nabholz M, Bucher P, Lingner J (2003) A human homolog of yeast Est1 associates with telomerase and uncaps chromosome ends when overexpressed. Curr Biol 13:568-574.

48.

Snow BE, Erdmann N, Cruickshank J, Goldman H, Gill RM, Robinson MO, Harrington L (2003) Functional conservation of the telomerase protein Est1p in humans. Curr Biol 13:698-704.

49.

Takeshita D, Zenno S, Lee WC, Saigo K, Tanokura M (2007) Crystal structure of the PIN domain of human telomerase-associated protein EST1A. Proteins 68:980-989. 24 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 24 of 41

Page 25 of 41

Protein Science

50.

Lebreton A, Tomecki R, Dziembowski A, Seraphin B (2008) Endonucleolytic RNA cleavage by a eukaryotic exosome. Nature 456:993-996.

51.

Eberle AB, Lykke-Andersen S, Muhlemann O, Jensen TH (2009) SMG6 promotes endonucleolytic cleavage of nonsense mRNA in human cells. Nat Struct Mol Biol 16:49-55.

52.

Schneider C, Leung E, Brown J, Tollervey D (2009) The N-terminal PIN domain of the exosome subunit Rrp44 harbors endonuclease activity and tethers Rrp44 to the yeast core exosome. Nucleic Acids Res 37:1127-1140.

53.

Schaeffer D, Tsanova B, Barbas A, Reis FP, Dastidar EG, Sanchez-Rotunno M, Arraiano CM, van Hoof A (2009) The exosome contains domains with specific endoribonuclease, exoribonuclease and cytoplasmic mRNA decay activities. Nat Struct Mol Biol 16:56-62.

54.

Skruzny M, Schneider C, Racz A, Weng J, Tollervey D, Hurt E (2009) An endoribonuclease functionally linked to perinuclear mRNP quality control associates with the nuclear pore complexes. PLoS Biol 7:e8.

55.

Makino DL, Baumgartner M, Conti E (2013) Crystal structure of an RNA-bound 11subunit eukaryotic exosome complex. Nature 495:70-75.

56.

Fatica A, Tollervey D, Dlakic M (2004) PIN domain of Nob1p is required for D-site cleavage in 20S pre-rRNA. RNA 10:1698-1701.

57.

Lamanna AC, Karbstein K (2009) Nob1 binds the single-stranded cleavage site D at the 3'-end of 18S rRNA with its PIN domain. Proc Natl Acad Sci USA 106:1425914264.

58.

Pertschy B, Schneider C, Gnadig M, Schafer T, Tollervey D, Hurt E (2009) RNA helicase Prp43 and its co-factor Pfa1 promote 20 to 18 S rRNA processing catalyzed by the endonuclease Nob1. J Biol Chem 284:35079-35091.

59.

Veith T, Martin R, Wurm JP, Weis BL, Duchardt-Ferner E, Safferthal C, Hennig R, Mirus O, Bohnsack MT, Wohnert J and others (2012) Structural and functional analysis of the archaeal endonuclease Nob1. Nucleic Acids Res 40:3259-3274.

60.

Gudipati RK, Xu Z, Lebreton A, Seraphin B, Steinmetz LM, Jacquier A, Libri D (2012) Extensive degradation of RNA precursors by the exosome in wild-type cells. Mol Cell 48:409-421.

61.

Bonneau F, Basquin J, Ebert J, Lorentzen E, Conti E (2009) The yeast exosome functions as a macromolecular cage to channel RNA substrates for degradation. Cell 139:547-559. 25 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

62.

Glavan F, Behm-Ansmant I, Izaurralde E, Conti E (2006) Structures of the PIN domains of SMG6 and SMG5 reveal a nuclease within the mRNA surveillance complex. EMBO J 25:5117-5125.

63.

Schalch T, Job G, Shanker S, Partridge JF, Joshua-Tor L (2011) The Chp1-Tas3 core is a multifunctional platform critical for gene silencing by RITS. Nat Struct Mol Biol 18:1351-1357.

64.

Xu J, Peng W, Sun Y, Wang X, Xu Y, Li X, Gao G, Rao Z (2012) Structural study of MCPIP1 N-terminal conserved domain reveals a PIN-like RNase. Nucleic Acids Res 40:6957-6965.

65.

Howard MJ, Lim WH, Fierke CA, Koutmos M (2012) Mitochondrial ribonuclease P structure provides insight into the evolution of catalytic strategies for precursor-tRNA 5' processing. Proc Natl Acad Sci USA 109:16149-16154.

66.

Lu J, Sun M, Ye K (2013) Structural and functional analysis of Utp23, a yeast ribosome synthesis factor with degenerate PIN domain. RNA 19:1815-1824.

67.

Yokogawa M, Tsushima T, Noda NN, Kumeta H, Enokizono Y, Yamashita K, Standley DM, Takeuchi O, Akira S, Inagaki F (2016) Structural basis for the regulation of enzymatic activity of Regnase-1 by domain-domain interactions. Sci Rep 6:22324.

68.

Conti E, Izaurralde E (2005) Nonsense-mediated mRNA decay: molecular insights and mechanistic variations across species. Curr Opin Cell Biol 17:316-325.

69.

Fukuhara N, Ebert J, Unterholzner L, Lindner D, Izaurralde E, Conti E (2005) SMG7 is a 14-3-3-like adaptor in the nonsense-mediated mRNA decay pathway. Mol Cell 17:537-547.

70.

Hollingsworth HC, Nossal NG (1991) Bacteriophage T4 encodes an RNase H which removes RNA primers made by the T4 DNA replication system in vitro. J Biol Chem 266:1888-1897.

71.

Wang HW, Wang J, Ding F, Callahan K, Bratkowski MA, Butler JS, Nogales E, Ke A (2007) Architecture of the yeast Rrp44 exosome complex suggests routes of RNA recruitment for 3' end processing. Proc Natl Acad Sci USA 104:16844-16849.

72.

Rother S, Clausing E, Kieser A, Strasser K (2006) Swt1, a novel yeast protein, functions in transcription. J Biol Chem 281:36518-36525.

73.

Matsushita K, Takeuchi O, Standley DM, Kumagai Y, Kawagoe T, Miyake T, Satoh T, Kato H, Tsujimura T, Nakamura H and others (2009) Zc3h12a is an RNase

26 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 26 of 41

Page 27 of 41

Protein Science

essential for controlling immune responses by regulating mRNA decay. Nature 458:1185-1190. 74.

Mizgalska D, Wegrzyn P, Murzyn K, Kasza A, Koj A, Jura J, Jarzab B, Jura J (2009) Interleukin-1-inducible MCPIP protein has structural and functional properties of RNase and participates in degradation of IL-1beta mRNA. FEBS J 276:7386-7399.

75.

Lai WS, Kennington EA, Blackshear PJ (2002) Interactions of CCCH zinc finger proteins with mRNA: non-binding tristetraprolin mutants exert an inhibitory effect on degradation of AU-rich element-containing mRNAs. J Biol Chem 277:9606-9613.

76.

Ceska TA, Sayers JR, Stier G, Suck D (1996) A helical arch allowing single-stranded DNA to thread through T5 5'-exonuclease. Nature 382:90-93.

77.

Tomecki R, Dziembowski A (2010) Novel endoribonucleases as central players in various pathways of eukaryotic RNA metabolism. RNA 16:1692-1724.

78.

Bleichert F, Granneman S, Osheim YN, Beyer AL, Baserga SJ (2006) The PINc domain protein Utp24, a putative nuclease, is required for the early cleavage steps in 18S rRNA maturation. Proc Natl Acad Sci USA 103:9464-9469.

79.

Hoareau-Aveilla C, Fayet-Lebaron E, Jady BE, Henras AK, Kiss T (2012) Utp23p is required for dissociation of snR30 small nucleolar RNP from preribosomal particles. Nucleic Acids Res 40:3641-3652.

80.

Rao ST, Rossmann MG (1973) Comparison of super-secondary structures in proteins. J Mol Biol 76:241-256.

81.

Anantharaman V, Aravind L (2006) The NYN domains: novel predicted RNAses with a PIN domain-like fold. RNA Biol 3:18-27.

82.

Aravind L, Anantharaman V, Koonin EV (2002) Monophyly of class I aminoacyl tRNA synthetase, USPA, ETFP, photolyase, and PP-ATPase nucleotide-binding domains: implications for protein evolution in the RNA. Proteins 48:1-14.

83.

Sink R, Kotnik M, Zega A, Barreteau H, Gobec S, Blanot D, Dessen A, ContrerasMartel C (2016) Crystallographic study of peptidoglycan biosynthesis enzyme MurD: Domain movement revisited. PLoS One 11:e0152075.

27 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

FIGURE LEGENDS Figure 1. Overview of the PIN domain-like superfamily. A. A simplified view of the classification and subdivision of the PIN domain-like superfamily, CATH 3.40.50.1010, into structural clusters and functional families (FunFams).9 B. Structural superposition of members of the PIN domain-like superfamily showing the core secondary structure elements with colours (α-helices, light blue, β-strands, dark blue, loop regions, grey). C. Topology of the PIN domain with secondary structure elements indicated. The canonical secondary structure elements are numbered α1, α2, α3... and β1, β2, β3..., while the additional helices are numbered α1', α2', and α3'. Figure 2. Archaeal PIN domain structures. A. Crystal structure of P. aerophilum PAE2754 (PDB 1V8O) with β-strands in dark blue and α-helices in lighter blue.14 N and C termini as well as conserved acidic residues of the active site are shown in sticks. B. Crystal structure of T4 RNase H, structurally aligned to the PIN domain in A.6 Termini and conserved active site residues are highlighted. The dashed line represents a part of the structure that was not resolved in the crystal structure. C. Structure-based sequence alignment of two PIN domain proteins (PAE0151 and PAE2754) with T4 RNase H. Conserved active site residues are shown with red colours and the conserved threonine near the active site is shown in green. D. Crystal structure of P. aerophilum PAE0151 (PDB 2FE1), structurally aligned to PAE2754 in A. Active site residues and the bound Mn2+ ion are shown in detail. E. The homodimer observed in the crystal structure of PAE0151 represents the canonical PIN domain dimer. The monomers are shown in two shades of grey with their active sites highlighted in detail. Figure 3. Structure of the FitAB complex. A. Overview of type II toxin-antitoxin regulation and expression. The antitoxin (A) and toxin (T) are transcribed from the same promoter (P) into a bicistronic mRNA that is translated into two proteins, A and T. Under normal cellular conditions, these associate into a stable TA complex, which, during stress, breaks up by proteolytic degradation of the antitoxin. This produces the active toxin, which functions as a specific tRNase/rRNase. B. Structure of the heterooctameric FitAB complex bound to DNA (PDB 2H1O, DNA not shown).38 One FitA2B2 heterotetramer is highlighted in colour with the FitB PIN domains in blue/yellow and the FitA antitoxins in red. The helices involved in dimer formation (α3 and α5) are highlighted in yellow and orange, respectively. C. Details of the interactions of the FitA C-terminus with the FitB (PIN domain) active site. Helices α1, α3 and α4 of the toxin required for antitoxin binding are highlighted in 28 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 28 of 41

Page 29 of 41

Protein Science

yellow and the FitA C-terminus is shown with sticks. α3AT is the third helix of the antitoxin. D. Close-up view of the FitB active site with the conserved acidic residues highlighted with blue sticks. Arg68 in the FitA C-terminus that specifically interacts with the FitB active site, is shown with red sticks. Figure 4. PIN domain proteins from M. tuberculosis. A. Crystal structure of M. tuberculosis VapC-5 (PDB 3DBO).40 The PIN domain is shown as cartoon (β-strands in dark blue and α-helices in lighter blue) with a semi-transparent surface representation on top. The "clip" domain (also known as "lid" domain) is highlighted in yellow and the active site as well as VapB-5 C-terminal tail are shown with grey and red sticks, respectively. B. The two different conformations of the PIN domain found in the structure of M. tuberculosis VapC-3 (PDB 3H87).41 Both domains are structurally aligned to those in panel A show how in one molecule, the active site is bound to Mg2+ (left) while in another molecule, it is blocked by Arg73 from another VapB-3 antitoxin (green). C. Details of the active site found in M. tuberculosis VapC-15 (PDB 4CHG) with the VapB-15 antitoxin shown in red.42 Both a Mn2+ and a Mg2+ are bound and stabilised by Glu67 from the antitoxin (red). The inset schematic shows the observed 2:1 inhibition in which one VapB-15 antitoxin is able to interact with two VapC-15 active sites. D. Details of the active site found in M. tuberculosis VapC-30 (PDB 4XGQ) with the VapB-30 antitoxin shown in red.43 A single Mg2+ ion is bound and the active site interacts, although weakly, with the antitoxin in the distal mode as shown in the inset. Figure 5. PIN domains involved in nonsense-mediated decay. A. The crystal structure of the PIN domain from human SMG5 (hEST1B, PDB 2HWY)62 corresponding to residues 855-923 of the full length protein. Only one active site residue is conserved (Asp860). Flexible parts of the structure are shown with dashed lines. B. Crystal structure of the human SMG6 (hEST1B, PDB 2HWW)49,62 corresponding to residues 1239-1418 of the full length protein. SMG6 has a complete PIN domain active site (Asp1353, Asp1392, and Asp1251) and an extended structure which is highlighted in orange. Flexible regions are shown with dashed lines. C. Close-up view of the SMG5 active site. D. Close-up of the SMG6 active site (green) superimposed on the active site of T4 RNase H (blue, PDB 1TFR).6 Conserved, acidic residues in the two proteins are labelled and the two Mg2+ ions in the RNase H structure are shown with green spheres. Figure 6. Eukaryotic PIN domain targeting a wide range of RNAs. A. Overview of the Chp1/Tas3 structure (PDB 3TIX) with the Chp1 Spoc, linker, and Domain II regions in light 29 John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

grey and the Tas3 N-terminal domain (NTD) in dark grey.63 The Chp1 PIN domain is shown in teal. B. Close-up of the Chp1 PIN domain active site (teal) overlaid on the active site of P. aerophilum PAE2754 with corresponding active site residues marked. C. Overview of the S. cerevisiae Rrp44p/Dis3 structure (PDB 2WP8) with the RNase II domain in grey (RNB) and the PIN domain in teal.61 D. Structural superposition of the Dis3 PIN domain (teal) and the P. aerophilum PAE2754 PIN domain (blue). Relevant active site residues are marked. Figure 7. Eukaryotic PIN domains targeting specific RNAs. A. Left, crystal structure of the human MCPIP1 PIN domain (PDB 3V32) with conserved active site residues and basic β hairpin (blue) indicated;64 right, in the same orientation, the crystal structure of bacteriophage T5 exonuclease (PDB 1EXN).76 B. Crystal structure of mouse Regnase-1 (PDB 5H9V) with the core PIN domain in purple and the extended, basic region shown in blue.67 Active site residues are indicated and shown with sticks. C. Dimer of Regnase-1 found in the crystal packing of this protein.67 D. Structure of P. horikoshii Nob1 (PhNob1, PDB 2LCQ), with the core PIN domain in light cyan and the associated zinc ribbon domain in green cyan.59 Active site residues (light cyan), RNA interacting residues (dark red), and the cysteines involved in zinc binding are shown with sticks. The zinc ion is shown with a grey sphere. E. Structure of yeast Utp23p (PDB 4MJ7).66 The core PIN domain is shown in blue with the active site residues indicated and the two additional helices in red with basic residues potentially involved in RNA binding highlighted. The zinc finger is shown with orange sticks. Figure 8. A conserved domain and active site. A. Left, structure of E. coli MurD ligase (PDB 5A5F), an example of the HUP fold, which is a minimal version of the PIN domain fold.83 The bound molecule of uridine-5'-diphosphate-N-acetylmuramoyl-l-alanine (UMA) is shown with sticks; right, the PIN domain of M. tuberculosis VapC-15 (PDB 4CHG) aligned to the HUP fold showing that the active site of these Rossmann-like folds are located in similar places at the C-terminal end of the parallel β-strands.42 The core helices and strands of the motif (which correspond to the HUP domain) are coloured blue and the "lid" region is indicated with a dashed circle. B. Active site of M. tuberculosis VapC-15 (PDB 4CHG) highlighting several conserved residues. The bound Mg2+ and Mn2+ ions are shown with green and purple spheres, respectively.42 C. Active site of N. gonorrhoeae FitB (PDB 2H1C) shown in the same orientation as in panel B.38 Apart from the conserved, acidic residues, this PIN domain has an Asn (7, pink) where VapC-15 has a Ser (6), a Thr (120, cyan) where VapC-15 has a His (112), and a Ser (125, red) where VapC-15 has an Asp (116).

30 John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 30 of 41

Protein Science

Figure 1. Overview of the PIN domain-like superfamily. A. A simplified view of the classification and subdivision of the PIN domain-like superfamily, CATH 3.40.50.1010, into structural clusters and functional families (FunFams).9 B. Structural superposition of members of the PIN domain-like superfamily showing the core secondary structure elements with colours (α-helices, light blue, β-strands, dark blue, loop regions, grey). C. Topology of the PIN domain with secondary structure elements indicated. The canonical secondary structure elements are numbered α1, α2, α3... and β1, β2, β3..., while the additional helices are numbered α1', α2', and α3'. 104x58mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 32 of 41

Page 33 of 41

Protein Science

Figure 2. Archaeal PIN domain structures. A. Crystal structure of P. aerophilum PAE2754 (PDB 1V8O) with βstrands in dark blue and α-helices in lighter blue.14 N and C termini as well as conserved acidic residues of the active site are shown in sticks. B. Crystal structure of T4 RNase H, structurally aligned to the PIN domain in A.6 Termini and conserved active site residues are highlighted. The dashed line represents a part of the structure that was not resolved in the crystal structure. C. Structure-based sequence alignment of two PIN domain proteins (PAE0151 and PAE2754) with T4 RNase H. Conserved active site residues are shown with red colours and the conserved threonine near the active site is shown in green. D. Crystal structure of P. aerophilum PAE0151 (PDB 2FE1), structurally aligned to PAE2754 in A. Active site residues and the bound Mn2+ ion are shown in detail. E. The homodimer observed in the crystal structure of PAE0151 represents the canonical PIN domain dimer. The monomers are shown in two shades of grey with their active sites highlighted in detail. 199x221mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Figure 3. Structure of the FitAB complex. A. Overview of type II toxin-antitoxin regulation and expression. The antitoxin (A) and toxin (T) are transcribed from the same promoter (P) into a bicistronic mRNA that is translated into two proteins, A and T. Under normal cellular conditions, these associate into a stable TA complex, which, during stress, breaks up by proteolytic degradation of the antitoxin. This produces the active toxin, which functions as a specific tRNase/rRNase. B. Structure of the heterooctameric FitAB complex bound to DNA (PDB 2H1O, DNA not shown).38 One FitA2B2 heterotetramer is highlighted in colour with the FitB PIN domains in blue/yellow and the FitA antitoxins in red. The helices involved in dimer formation (α3 and α5) are highlighted in yellow and orange, respectively. C. Details of the interactions of the FitA Cterminus with the FitB (PIN domain) active site. Helices α1, α3 and α4 of the toxin required for antitoxin binding are highlighted in yellow and the FitA C-terminus is shown with sticks. α3AT is the third helix of the antitoxin. D. Close-up view of the FitB active site with the conserved acidic residues highlighted with blue sticks. Arg68 in the FitA C-terminus that specifically interacts with the FitB active site, is shown with red sticks. 170x169mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 34 of 41

Page 35 of 41

Protein Science

Figure 4. PIN domain proteins from M. tuberculosis. A. Crystal structure of M. tuberculosis VapC-5 (PDB 3DBO).40 The PIN domain is shown as cartoon (β-strands in dark blue and α-helices in lighter blue) with a semi-transparent surface representation on top. The "clip" domain (also known as "lid" domain) is highlighted in yellow and the active site as well as VapB-5 C-terminal tail are shown with grey and red sticks, respectively. B. The two different conformations of the PIN domain found in the structure of M. tuberculosis VapC-3 (PDB 3H87).41 Both domains are structurally aligned to those in panel A show how in one molecule, the active site is bound to Mg2+ (left) while in another molecule, it is blocked by Arg73 from another VapB-3 antitoxin (green). C. Details of the active site found in M. tuberculosis VapC-15 (PDB 4CHG) with the VapB-15 antitoxin shown in red.42 Both a Mn2+ and a Mg2+ are bound and stabilised by Glu67 from the antitoxin (red). The inset schematic shows the observed 2:1 inhibition in which one VapB-15 antitoxin is able to interact with two VapC-15 active sites. D. Details of the active site found in M. tuberculosis VapC-30 (PDB 4XGQ) with the VapB-30 antitoxin shown in red.43 A single Mg2+ ion is bound and the active site interacts, although weakly, with the antitoxin in the distal mode as shown in the inset. 168x161mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Figure 5. PIN domains involved in nonsense-mediated decay. A. The crystal structure of the PIN domain from human SMG5 (hEST1B, PDB 2HWY)62 corresponding to residues 855-923 of the full length protein. Only one active site residue is conserved (Asp860). Flexible parts of the structure are shown with dashed lines. B. Crystal structure of the human SMG6 (hEST1B, PDB 2HWW)49,62 corresponding to residues 12391418 of the full length protein. SMG6 has a complete PIN domain active site (Asp1353, Asp1392, and Asp1251) and an extended structure which is highlighted in orange. Flexible regions are shown with dashed lines. C. Close-up view of the SMG5 active site. D. Close-up of the SMG6 active site (green) superimposed on the active site of T4 RNase H (blue, PDB 1TFR).6 Conserved, acidic residues in the two proteins are labelled and the two Mg2+ ions in the RNase H structure are shown with green spheres. 155x174mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 36 of 41

Page 37 of 41

Protein Science

Figure 6. Eukaryotic PIN domain targeting a wide range of RNAs. A. Overview of the Chp1/Tas3 structure (PDB 3TIX) with the Chp1 Spoc, linker, and Domain II regions in light grey and the Tas3 N-terminal domain (NTD) in dark grey.63 The Chp1 PIN domain is shown in teal. B. Close-up of the Chp1 PIN domain active site (teal) overlaid on the active site of P. aerophilum PAE2754 with corresponding active site residues marked. C. Overview of the S. cerevisiae Rrp44p/Dis3 structure (PDB 2WP8) with the RNase II domain in grey (RNB) and the PIN domain in teal.61 D. Structural superposition of the Dis3 PIN domain (teal) and the P. aerophilum PAE2754 PIN domain (blue). Relevant active site residues are marked. 145x149mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Protein Science

Figure 7. Eukaryotic PIN domains targeting specific RNAs. A. Left, crystal structure of the human MCPIP1 PIN domain (PDB 3V32) with conserved active site residues and basic β hairpin (blue) indicated;64 right, in the same orientation, the crystal structure of bacteriophage T5 exonuclease (PDB 1EXN).76 B. Crystal structure of mouse Regnase-1 (PDB 5H9V) with the core PIN domain in purple and the extended, basic region shown in blue.67 Active site residues are indicated and shown with sticks. C. Dimer of Regnase-1 found in the crystal packing of this protein.67 D. Structure of P. horikoshii Nob1 (PhNob1, PDB 2LCQ), with the core PIN domain in light cyan and the associated zinc ribbon domain in green cyan.59 Active site residues (light cyan), RNA interacting residues (dark red), and the cysteines involved in zinc binding are shown with sticks. The zinc ion is shown with a grey sphere. E. Structure of yeast Utp23p (PDB 4MJ7).66 The core PIN domain is shown in blue with the active site residues indicated and the two additional helices in red with basic residues potentially involved in RNA binding highlighted. The zinc finger is shown with orange sticks.

John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 38 of 41

Protein Science

Figure 8. A conserved domain and active site. A. Left, structure of E. coli MurD ligase (PDB 5A5F), an example of the HUP fold, which is a minimal version of the PIN domain fold.83 The bound molecule of uridine-5'-diphosphate-N-acetylmuramoyl-l-alanine (UMA) is shown with sticks; right, the PIN domain of M. tuberculosis VapC-15 (PDB 4CHG) aligned to the HUP fold showing that the active site of these Rossmannlike folds are located in similar places at the C-terminal end of the parallel β-strands.42 The core helices and strands of the motif (which correspond to the HUP domain) are coloured blue and the "lid" region is indicated with a dashed circle. B. Active site of M. tuberculosis VapC-15 (PDB 4CHG) highlighting several conserved residues. The bound Mg2+ and Mn2+ ions are shown with green and purple spheres, respectively.42 C. Active site of N. gonorrhoeae FitB (PDB 2H1C) shown in the same orientation as in panel B.38 Apart from the conserved, acidic residues, this PIN domain has an Asn (7, pink) where VapC-15 has a Ser (6), a Thr (120, cyan) where VapC-15 has a His (112), and a Ser (125, red) where VapC-15 has an Asp (116). 148x138mm (300 x 300 DPI)

John Wiley & Sons This article is protected by copyright. All rights reserved.

Page 40 of 41

Structural conservation of the PIN domain active site across all domains of life.

The PIN (PilT N-terminus) domain is a compact RNA-binding protein domain present in all domains of life. This 120-residue domain consists of a central...
3MB Sizes 0 Downloads 8 Views