Biochimica et Biophysica Acta, 1076 (1991) 225-232


© 1991 Elsevier Science Publishers B.V. (Biomedical Division) 0167-4838/91/$03.50 ADONIS 016748389100052F BBAPRO3382:5

Sequence similarities within the family of dihydrolipoamide acyltransferases and discovery of a previously unidentified fungal enzyme

George C. Russell and John R. Guest

Krebs Institute, Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield (U.K.)

(Received 12 July 1990)

Keywords: 2-Oxo acid dehydrogenase complex; Dihydrolipoamide acyltransferase; Mitochondrial ribosomal protein; Protein sequence similarity

A composite protein sequence database was searched for amino acid sequences similar to the C-terminal domain of the dihydrolipoamide acetyltransferase subunit (E2p) of the pyruvate dehydrogenase complex of Escherichia coli. Nine sequences with extensive similarity were found, of which eight were E2 subunits. The other was for a putative mitochondrial ribosomal protein, MRP3, from Neurospora crassa. Alignment of the MRP3 and E2 sequences showed that the similarity extends through the entire MRP3 sequence and that MRP3 is most closely related to the E2p subunit of the pyruvate dehydrogenase complex from Saccharomyces cerevisiae, with 54% identical residues and a further 36% that are conservatively substituted. Other features of the MRP3 gene and protein are also consistent with it being the acyltransferase subunit of a 2-oxo acid dehydrogenase complex. A multiple alignment of 13 E2 sequences indicated that 120 (34%) of 353 equivalenced residues are identical or show some degree of conservation. It also identified residues that are potentially important for the structure, catalytic activity and substrate-specificity of the acyltransferases.

Introduction

The 2-oxo acid dehydrogenase complexes are composed of multiple copies of three enzymic subunits which catalyse the oxidative decarboxylation of the 2-oxo acid, with the formation of acyl-CoA, CO2 and NADH. Central to the structure and function of these multienzyme complexes are the dihydrolipoamide acyltransferase subunits (E2). They form the structural cores of the complexes, provide binding sites for the decarboxylase (E1) and dihydrolipoamide dehydrogenase subunits (E3), carry the lipoyl cofactors which convey reaction intermediates between active sites and catalyse the transfer of acyl groups from lipoamide to CoA [1,2].
Abbreviations: PDH, pyruvate dehydrogenase; ODH, 2-oxoglutarate dehydrogenase.
Correspondence: J.R. Guest, Krebs Institute, Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield S10 2TN, U.K.

The dihydrolipoamide acetyltransferase subunits (E2p) of the pyruvate dehydrogenase (PDH) multienzyme complex of Escherichia coli possess a highly segmented structure, consisting of a number of independently folding domains separated by short stretches of polypeptide chain with a high degree of conformational mobility [1-3] (Fig. 1). The N-terminal half of the E2p polypeptide chain comprises three tandemly repeated lipoyl domains of about 80 residues, each containing a lipoylatable lysine residue. The lipoyl domains are connected to a 50-residue subunit-binding domain, required for binding the E3 component. The remaining 240 C-terminal residues represent the core-forming catalytic domain which contains the acetyltransferase activity and determinants for self-assembly and E1 binding. The interdomain sequences are unusually rich in alanine, proline and charged residues and are thought to be responsible for the 1H-NMR-detected polypeptide chain mobility. The domain sequences are sufficiently well-conserved in evolution for analogous segments to be clearly identified in the acyltransferase subunits (E2) of all of the 2-oxo acid dehydrogenase complexes so far examined. Thus the E2 subunits of the pyruvate (E2p), 2-oxoglutarate (E2o) and branched-chain 2-oxo acid (E2b) dehydrogenase complexes represent a family of enzymes sharing the same arrangement of domains and linker sequences (Fig. 1). The only variations are in the number of lipoyl domains per E2 chain, and in the apparent lengths and compositions of the interdomain linker sequences [1,2]. In this paper E2 sequences of mammalian, fungal and bacterial origin have been compared in order to identify residues which might be important in the structure and catalytic mechanism of the dihydrolipoamide acyltransferases, and a putative ribosomal protein has been assigned to this family of enzymes.
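The comparison announced here rests on scoring each aligned position by how widely a residue is shared, using the three classes defined in the legend to Fig. 3 (identical in all sequences; identical in at least two-thirds; otherwise conservatively substituted). The following is a minimal Python sketch of such a per-column tally, not the procedure actually used with CLUSTAL. The toy alignment, the function names, the treatment of gapped columns as unscored and the conservative-substitution groups are all assumptions made for illustration; the paper does not state which substitutions it counted as conservative.

```python
from collections import Counter

# Illustrative conservative-substitution groups (an assumption, not taken from the paper).
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                       set("NQ"), set("ST"), set("AG")]

# Hypothetical toy alignment: six sequences, five columns.
TOY_ALIGNMENT = ["GKDLA", "GKDIS", "GKEVT", "GKDLP", "GRDIQ", "GKEMW"]


def classify_column(residues, groups=CONSERVATIVE_GROUPS):
    """Return '*', a mismatch count, '.', or ' ' for one alignment column."""
    n = len(residues)
    if "-" in residues:                 # gapped columns are left unscored (assumption)
        return " "
    _, count = Counter(residues).most_common(1)[0]
    if count == n:                      # identical in all sequences
        return "*"
    if 3 * count >= 2 * n:              # identical in at least two-thirds
        return str(n - count)           # number of mismatched sequences
    if any(all(r in g for r in residues) for g in groups):
        return "."                      # fewer identities, but conservative
    return " "


def annotate(alignment):
    """Build the annotation line for an alignment of equal-length strings."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return "".join(classify_column(col) for col in zip(*alignment))


if __name__ == "__main__":
    line = annotate(TOY_ALIGNMENT)
    print(repr(line))
    print("conserved columns:", sum(ch != " " for ch in line))
```

With 13 sequences the two-thirds threshold corresponds to at least 9 identical residues, that is, at most 4 mismatches, which agrees with the range 1-4 used in the legend to Fig. 3.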

Methods

Sequence analysis was carried out using the Science and Engineering Research Council SEQNET facility at Daresbury. All of the E2 sequences in the composite OWL protein sequence database [4] were detected using the searching program SWEEP with the highly conserved 240-residue C-terminal domain of E2p from E. coli [3] as the 'probe' sequence. The E2o sequences of Bacillus subtilis [5] and Azotobacter vinelandii [6], the human placental E2p sequence [7,8], the C-terminal region of human E2b [9], and the N-terminal region of Bacillus stearothermophilus E2p [10] were added manually. The sequences were analysed initially using the multiple alignment program CLUSTAL [11]. This program generates a dendrogram by cluster analysis of pairwise sequence similarity scores and produces a multiple alignment of the clustered sequences. The alignment was then refined to take into account variable features, such as leader peptides on the six eukaryotic E2 sequences; multiple lipoyl domains in some of the E2p sequences; and the interdomain linkers which show little sequence similarity [1,2]. Domain boundaries in the alignment were defined mainly by sequence comparison, since similarity is much reduced in the interdomain linker regions, but information from proteolytic or biochemical studies was also used, where available [10,12-14].

Results and Discussion

Identification of a new member of the family of dihydrolipoamide acyltransferases

Using the sequence of the catalytic domain of the dihydrolipoamide acetyltransferase (E2p) of E. coli as a probe to search the OWL protein sequence database, a sequence with high similarity designated MRP3 (mitochondrial ribosomal protein 3) was detected in addition to eight previously identified E2 sequences. The MRP3 sequence had been derived from a cDNA clone isolated by immunological screening of a Neurospora crassa cDNA library using an antiserum raised against N. crassa mitochondrial ribosomal proteins [15]. Comparing the amino acid sequences of MRP3 and E. coli E2p showed that MRP3 contains regions analogous to the lipoyl, subunit-binding and catalytic domains present in all of the E2 sequences examined (Fig. 1). Furthermore, the putative domains of MRP3 are flanked by short stretches of polypeptide chain rich in alanine, proline, threonine and charged amino acids, similar to the interdomain linkers of the known E2 subunits. A dendrogram (Fig. 2), based on sequence comparisons between MRP3 and twelve E2 sequences of mammalian, fungal and bacterial origin [3,5-10,15-22], generated by CLUSTAL, indicated that the MRP3 sequence is most similar to the E2p sequence of Saccharomyces cerevisiae.

Fig. 1. Schematic representation of the domain organizations of dihydrolipoamide acyltransferase subunits (E2). Conserved domains are shown as boxes and interdomain linkers are indicated by zig-zag lines. [Schematic not reproduced. Column labels: lipoyl domains; subunit-binding and catalytic domains. Rows as printed: E2pEco, E2pAvi; E2pHUM; E2pRAT; E2oAvi, E2oBsu, E2oEco, E2bPpu, E2bBOV, E2bHUM, E2pSce, MRP3; E2pBst.] The sources and designations of the sequences are as follows: E2bBOV, bovine [17]; E2bHUM, human [9,16]; E2bPpu, Pseudomonas putida [19]; E2pHUMl, human liver [22]; E2pHUMp, human placenta [7,8]; E2pRAT, rat (derived from an incomplete cDNA clone) [7,8]; E2pSce, Saccharomyces cerevisiae [18]; MRP3, Neurospora crassa mitochondrial ribosomal protein 3 [15]; E2pAvi, Azotobacter vinelandii [20]; E2pEco, Escherichia coli [3]; E2pBst, Bacillus stearothermophilus (211 amino-terminal residues) [10]; E2oBsu, Bacillus subtilis [5]; E2oAvi, Azotobacter vinelandii [6]; and E2oEco, Escherichia coli [21].

Fig. 2. Dendrogram produced by the CLUSTAL program based on the extent of pairwise similarities between E2 sequences. The sequence nomenclature is given in the legend to Fig. 1. The position of the incomplete E2pBst sequence is based on comparisons with equivalent segments of the other proteins. [Dendrogram not reproduced. Leaves as printed, top to bottom: E2pHUM, E2pRAT, E2pSce, MRP3, E2oBsu, E2oAvi, E2oEco, E2bBOV, E2bHUM, E2bPpu, E2pBst, E2pAvi, E2pEco; the root is labelled 'Ancestral E2'.]

In the alignment of MRP3 and the yeast E2p (E2pSce) sequences, some 240 (54%) of 446 equivalenced residues are identical and a further 160 (36%) are conservatively substituted (Fig. 3). The main differences are net deletions of three residues in linker 3 of E2pSce and 26 residues in linker 4 of MRP3. Both sequences have leader peptides which are processed at the same position upon entering the mitochondrion [15,18]. Interestingly, the leader sequences are not as well conserved as the mature polypeptides, having only 8 identities in 25 aligned residues (Fig. 3).

When compared with another mitochondrial ribosomal protein (MRP15) isolated by the same method [15], MRP3 was anomalous with respect to its codon usage, expression and distribution. The codon usage of the MRP3 gene corresponds to that of an efficiently expressed gene, unlike the MRP15 gene. The MRP3 mRNA was approximately 3-fold more abundant than that of MRP15. The MRP3 protein was 30 to 50-fold more abundant than other mitochondrial ribosomal proteins, and only 2-3% of the total MRP3 protein in the mitochondria was associated with the ribosomes, while all of the MRP15 polypeptide was ribosome-bound. The MRP3 protein also exhibited an anomalously low mobility in SDS-PAGE, which is typical of E2 subunits and is probably due to the interdomain linker sequences [23]. These observations are consistent with the MRP3 cDNA clone of N. crassa encoding a dihydrolipoamide acyltransferase component of a 2-oxo acid dehydrogenase complex. This cDNA clone was presumably isolated because of the abundance of the MRP3 mRNA and protein.
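The 54% and 36% figures quoted for MRP3 and E2pSce are counts over equivalenced (ungapped) positions of a pairwise alignment. As a hedged illustration of how such counts can be obtained, the sketch below tallies identical and conservatively substituted positions for two pre-aligned strings. The function names, the toy sequences and the conservative-substitution groups are assumptions for illustration only; the criteria actually applied by the authors are not given in the text.

```python
# Illustrative conservative-substitution groups (an assumption, not taken from the paper).
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                       set("NQ"), set("ST"), set("AG")]


def same_group(a, b):
    """True if two different residues fall in the same illustrative group."""
    return any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def pairwise_similarity(seq_a, seq_b, gap="-"):
    """Return (equivalenced, identical, conservative) for two aligned strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = conservative = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:        # positions with a gap are not equivalenced
            continue
        compared += 1
        if a == b:
            identical += 1
        elif same_group(a, b):
            conservative += 1
    return compared, identical, conservative


def percent(count, total):
    """Percentage of `count` in `total`, or 0.0 for an empty comparison."""
    return 100.0 * count / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Fig. 3.
    n, ident, cons = pairwise_similarity("MKVLA-DE", "MRVIAGDD")
    print(n, ident, cons)
    # The counts reported in the text reproduce the quoted percentages.
    print(round(percent(240, 446)), round(percent(160, 446)))
```

Applied to the counts reported above, 240 and 160 of 446 equivalenced residues round to 54% and 36% respectively.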

Multiple sequence alignment for the acyltransferase family

General features. The multiple alignment shown in Fig. 3 is composed of three distinct sections: the lipoyl domains (20 sequences of approx. 82 residues); the subunit-binding domains (14 sequences of 36 residues); and the catalytic domains (13 sequences of about 235 residues). The leader sequences and interdomain linkers were aligned only where there is a high degree of similarity. In the alignment, the residues at 20 positions are identical in all of the sequences (* in Fig. 3). These include the lipoyl-lysine residues, and putative active site residues (e.g., His-602 of E2pEco) [1,24,25]. The residues at 60 positions are identical in at least two-thirds of the sequences, the number of mismatches at each position being indicated below the alignment in Fig. 3. At a further 40 positions there are fewer identities but the residues are substituted conservatively. In all, 120 (34%) of the 353 aligned residues show some degree of conservation. These presumably include residues which are important for the structural integrity, subunit interactions and catalytic activity of the E2 components. The most conserved residues are potential targets for site-directed mutagenesis, which may, in the absence of a molecular structure, shed some light on their function. Two human E2p sequences are included in the multiple alignment (Fig. 3). They derive from liver and placental cDNA clones, and their sequences differ at 45 positions, 38 being in the leader peptide, the others being at non-conserved positions in the mature E2 polypeptides.

Fig. 3. Multiple alignment for dihydrolipoamide acyltransferase sequences. [Alignment not reproduced; the sequence rows did not survive extraction. Sections as printed: leader peptides; lipoyl domain 1; linker 1; lipoyl domain 2; linker 2; lipoyl domain 3; linker 3; subunit-binding domain; linker 4; catalytic domain.] The alignment reflects the domain organization of the E2 polypeptides shown in Fig. 1. The leader sequences and interdomain linkers are included at their correct positions but not aligned. Conserved positions are indicated as follows: asterisk (*), identical in all sequences; numbers (1-4), identical residues in at least two-thirds of the sequences (values indicate the number of sequences with mismatched residues); full-stop (.), residues identical in less than two-thirds of the aligned sequences but otherwise conservatively substituted. Note that the lipoyl domains were aligned as a single group and then split up for the insertion of the interdomain linker sequences. Residues in the catalytic domain which are identical or functionally conserved within a single type of acyltransferase are boxed and marked above the alignment as follows: E2o-specific, ▽▼; E2b-specific, ○●; and E2p-specific, [symbol illegible]. Filled symbols indicate identical residues, open symbols indicate functionally conserved residues. Substrate-specific residues which are also found in the MRP3 sequence are marked M above the alignment. Residues in the human placental E2p sequence (E2pHUMp) which differ from their counterparts in human liver E2p (E2pHUMl) are underlined.

Lipoyl domains. The lipoyl domains of the acyltransferases perform essential roles in the reaction mechanism of the 2-oxo acid dehydrogenase complexes, carrying substrate and reducing equivalents between the active sites of the complexes. There are identical residues at only two positions in all of the lipoyl domains: the lipoyl-lysine residue (Lys-40 in E2pEco); and a glycine residue (Gly-24 in E2pEco) which could perform a critical structural role (Fig. 3). Most of the highly conserved positions contain charged or polar residues, while the majority of positions where conservative substitutions are found contain hydrophobic residues, suggesting that these positions occupy the interior of the folded domain [26]. This is consistent with hydrophobic interactions being important for maintaining the three-dimensional structure of the domain, while hydrogen bonding or charge interactions are responsible for the transient protein-protein recognition necessary for catalysis. Free lipoic acid or lipoamide function as substrates in the E2- and E3-catalysed reactions, but the E1 subunits seem to require the cognate lipoyl domain [27]. The E1 specificity is presumably based on contacts with surface residues on the lipoyl domains, whereas interactions between the lipoyl domains and the E2 and E3 active sites may be less dependent on specific residues. This implies that conservation between the lipoyl domains reflects structural constraints rather than those imposed by the E1-specific interactions. It is likely that the post-translational lipoylation of the folded lipoyl domains is also dependent on contacts with unconserved residues, because the bovine E2b subunit is not lipoylated in E. coli [28].
Thus, the incompletely conserved residues flanking the lipoyl-lysine could be important in defining the E1 and lipoylation specificities.

Subunit-binding domain. Here a segment of approximately 40 residues is conserved in all of the E2 subunits, although it was not originally recognised in the E2b sequence of Pseudomonas putida [19]. It represents one of the most highly conserved parts of the E2 subunits, with 17 of the 36 residues exhibiting some degree of conservation, but its function seems to vary. It has been implicated in E3-binding in the E. coli PDH and ODH complexes, and in binding of the E1 and E3 subunits in the A. vinelandii and B. stearothermophilus PDH complexes [10,12-14]. In the eukaryotic complexes an additional lipoic acid-containing subunit, component X, has been implicated in binding most of the E3 subunits, and the subunit-binding domain of E2p is involved in binding the E1 subunits [29]. The majority of the conserved residues in this domain are hydrophobic or polar residues, suggesting that all of the subunit-binding domains have a similar structure. Since the E1 and E3 subunits can be resolved from the complexes by high ionic strength or high pH, it is possible that differences in the patterns of charged residues in the subunit-binding domains determine binding specificity.

Catalytic domain. The carboxy-terminal domain of the E2 polypeptides contains the determinants for self-assembly, acyltransferase activity and E1-binding. In previous comparisons of the primary and predicted secondary structures of the E. coli E2 catalytic domains and the chloramphenicol acetyltransferases (CAT), a remote but significant homology was detected, and it was suggested that the acyltransferase activities of the E2 subunits might be mechanistically similar to the histidine-mediated general base catalysis proposed for CAT [1,24,30]. An H.-DG motif in the active site of CAT occurs at equivalent positions in E2pEco and E2oEco [24] and a sequence which includes this motif, R D DHK-NG, is conserved in all of the E2 subunits aligned in Fig. 3. Consistent with the role attributed to this histidine residue, its replacement by tyrosine in CAT [31] and cysteine in E2pEco [25] has been shown to abolish acetyltransferase activity. However, this contrasts with a recent report that alanine and asparagine replacements have no effect on the catalytic activity of the E2pSce core [32]. Clearly further work is needed to resolve this disparity. Several other residues in the CAT sequence have been assigned catalytic or structural roles by in vitro mutagenesis [33,34]. In particular, Ser-148 has been shown to act in catalysis by stabilizing a tetrahedral intermediate, and Asp-199 appears to be involved in an important salt-bridge with Arg-18. Potential counterparts of these residues can be identified in the multiple alignment (e.g., Ser-550 and Asp-606 of E2pEco, Fig. 3), and their importance is presently under investigation in the E2p of E. coli.

The other conserved residues in the E2 catalytic domain could perform essential structural roles or be involved in cofactor binding. It is unlikely that any of the conserved sites confer substrate specificity, since acetyl-, succinyl- and branched-chain 2-oxo acyl-transferases are included in the alignment. Substrate-specific residues, tentatively defined as those which are identical or functionally conserved in a single class of acyltransferase, are indicated above the alignment in Fig. 3. The E2o sequences have 25 residues in common (▽ and ▼ in Fig. 3), whereas there are 9 E2b-specific residues (○ and ● in Fig. 3), and 3 E2p-specific residues (Fig. 3). Most of the E2p-, E2b- and E2o-specific residues occupy different positions in the respective catalytic domains, but five of the E2b-specific residues occupy the same positions as E2o-specific residues (Fig. 3), suggesting that the corresponding acylated lipoyl domains may make contact with some of the same regions of the E2b and E2o catalytic domains. The number of apparently substrate-specific residues in the E2 catalytic domains clearly depends on the number of sequences of each type (E2p, E2b or E2o) which are compared, and how closely the sequences are related. Thus the prokaryotic E2o sequences have more residues in common than the E2b or E2p sequences, which are of both prokaryotic and eukaryotic origin. Presumably residues which are conserved in a single class of prokaryotic and eukaryotic acyltransferase are those which are most likely to confer specificity. The MRP3 sequence contains the three putative E2p-diagnostic residues, but only one each of the E2o- and E2b-specific residues (marked M above the alignment in Fig. 3), implying that MRP3 is probably an E2p. An experimental analysis of the substrate specificity of MRP3 would provide a good test of the validity of the specificity assignments made above.

Interdomain linker sequences. The linker sequences of the E2 subunits of the E. coli PDH and ODH complexes were the first to be characterized genetically and biochemically. Their unusual composition, 1H-NMR-detected conformational mobility and proteolytic sensitivity [1,2] are consistent with the linker peptides having extended structures that serve as flexible hinges between the folded domains of the E2 subunits. A similar extended and flexible conformation has been proposed for sequences in other proteins that are rich in alanine + proline (see Ref. 35 for a brief review). In the assembled E2 sequences, there is considerable variation in both length (17-66 residues) and composition of the linker sequences. Those rich in alanine + proline, or alanine + proline + charged residues, are the most common, but a smaller number which are rich in proline, charged residues, or proline + charged residues, are also evident (Fig. 3). When present, alanine and proline residues are generally clustered either at the ends of the linkers (as in linker 2 of E2pRAT and linker 3 of E2pEco), or centrally (as in linker 4 of E2pEco and E2pAvi). Charged residues are also clustered, mainly at the extremities of the linkers, although E2oBsu has a notable group of glutamine residues in the middle of linker 4 (Fig. 3). The variations in length and composition of the interdomain linkers suggest that they may be under little selective pressure, except to maintain the conformational mobility required for E2 function. Experiments in which the natural linker sequences are replaced by peptides of known sequence and predicted secondary structure may help to clarify the role of the interdomain segments in the catalytic cycle of the E2 subunits.
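The linker descriptions above (rich in alanine, proline and charged residues; clustered at the ends or centrally) can be made concrete with a simple composition count. The sketch below is an illustration only: the function names and the toy linker are hypothetical, the choice of Asp, Glu, Lys and Arg as the charged set is an assumption, and splitting the linker into thirds is just one convenient way to compare its ends with its centre.

```python
# Residues counted as charged (an assumption: Asp, Glu, Lys and Arg only).
CHARGED = set("DEKR")


def linker_composition(linker):
    """Return length and fractional Ala, Pro and charged content of a linker."""
    seq = linker.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("linker sequence is empty")
    return {
        "length": n,
        "ala": seq.count("A") / n,
        "pro": seq.count("P") / n,
        "charged": sum(1 for r in seq if r in CHARGED) / n,
    }


def ala_pro_by_thirds(linker):
    """Fraction of Ala + Pro in the first, middle and last third of a linker."""
    seq = linker.upper()
    n = len(seq)
    bounds = [0, n // 3, (2 * n) // 3, n]
    fractions = []
    for start, end in zip(bounds, bounds[1:]):
        segment = seq[start:end]
        ala_pro = sum(1 for r in segment if r in "AP")
        fractions.append(ala_pro / len(segment) if segment else 0.0)
    return fractions


if __name__ == "__main__":
    # Hypothetical linker with Ala/Pro at both ends and charged residues centrally.
    toy_linker = "APAPAAEEKKDTAAPAPA"
    print(linker_composition(toy_linker))
    print(ala_pro_by_thirds(toy_linker))
```

For a real linker taken from Fig. 3, a high Ala + Pro fraction in the outer thirds would correspond to the "ends" pattern described above, and a high fraction in the middle third to the "central" pattern.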

Acknowledgements

This work was supported by a project grant from the Science and Engineering Research Council to J.R.G. Sequence analysis was carried out using the SERC SEQNET computing facility at Daresbury.

Note added in proof (Received 22 November 1990)

The sequence of the Bacillus subtilis dihydrolipoamide acetyltransferase has been published recently [36].

References

1 Guest, J.R., Angier, S.J. and Russell, G.C. (1989) Ann. N.Y. Acad. Sci. 573, 76-99.
2 Perham, R.N. and Packman, L.C. (1989) Ann. N.Y. Acad. Sci. 573, 1-20.
3 Stephens, P.E., Darlison, M.G., Lewis, H.M. and Guest, J.R. (1983) Eur. J. Biochem. 133, 481-489.
4 Akrigg, D., Bleasby, A.J., Dix, N.I.M., Findlay, J.B.C., North, A.C.T., Parry-Smith, D., Wootton, J.C., Blundell, T.L., Gardner, S.P., Hayes, F., Islam, S., Sternberg, M.J.E., Thornton, J.M., Tickle, I.J. and Murray-Rust, P. (1988) Nature 335, 745-746.
5 Carlsson, P. and Hederstedt, L. (1989) J. Bacteriol. 171, 3667-3672.
6 Westphal, A.H. and de Kok, A. (1990) Eur. J. Biochem. 187, 235-239.
7 Fussey, S.P.M., Guest, J.R., James, O.F.W., Bassendine, M.F. and Yeaman, S.J. (1988) Proc. Natl. Acad. Sci. USA 85, 8654-8658.
8 Coppel, R.L., McNeilage, L.J., Surh, C.D., Van de Water, J., Spithill, T.W., Whittingham, S. and Gershwin, M.E. (1988) Proc. Natl. Acad. Sci. USA 85, 7317-7321.
9 Danner, D.J., Litwer, S., Herring, W.J. and Elsas, L.J. (1989) Ann. N.Y. Acad. Sci. 573, 369-377.
10 Packman, L.C., Borges, A. and Perham, R.N. (1988) Biochem. J. 252, 79-86.
11 Higgins, D.G. and Sharp, P.M. (1988) Gene 73, 237-244.
12 Packman, L.C. and Perham, R.N. (1987) Biochem. J. 242, 531-538.
13 Hanemaaijer, R., de Kok, A., Jollès, J. and Veeger, C. (1987) Eur. J. Biochem. 169, 245-252.
14 Packman, L.C. and Perham, R.N. (1986) FEBS Lett. 206, 193-198.
15 Kreader, C.A., Langer, C.S. and Heckman, J.E. (1989) J. Biol. Chem. 264, 317-327.
16 Lau, K.S., Griffin, T.A., Hu, C.-W.C. and Chuang, D.T. (1988) Biochemistry 27, 1972-1981.
17 Griffin, T.A., Lau, K.S. and Chuang, D.T. (1988) J. Biol. Chem. 263, 14008-14014.
18 Niu, X.-D., Browning, K.S., Behal, R.H. and Reed, L.J. (1988) Proc. Natl. Acad. Sci. USA 85, 7546-7550.
19 Burns, G., Brown, T., Hatter, K. and Sokatch, J.R. (1988) Eur. J. Biochem. 176, 165-169.
20 Hanemaaijer, R., Janssen, A., de Kok, A. and Veeger, C. (1988) Eur. J. Biochem. 174, 593-599.
21 Spencer, M.E., Darlison, M.G., Stephens, P.E., Duckenfield, I.K. and Guest, J.R. (1984) Eur. J. Biochem. 141, 361-374.
22 Thekkumkara, T.J., Ho, L., Wexler, I.D., Pons, G., Liu, T.-C. and Patel, M.S. (1988) FEBS Lett. 240, 45-48.
23 Guest, J.R., Lewis, H.M., Graham, L.D., Packman, L.C. and Perham, R.N. (1985) J. Mol. Biol. 185, 743-754.
24 Guest, J.R. (1987) FEMS Microbiol. Lett. 44, 417-422.
25 Russell, G.C. and Guest, J.R. (1990) Biochem. J. 269, 443-450.
26 Bowie, J.U., Reidhaar-Olson, J.F., Lim, W.A. and Sauer, R.T. (1990) Science 247, 1306-1310.
27 Graham, L.D., Packman, L.C. and Perham, R.N. (1989) Biochemistry 28, 1574-1581.
28 Griffin, T.A. and Chuang, D.T. (1989) Ann. N.Y. Acad. Sci. 573, 435-437.
29 Roche, T.E., Rahmatullah, M., Powers-Greenwood, S.L., Radke, G.A., Gopalakrishnan, S. and Chang, C.L. (1989) Ann. N.Y. Acad. Sci. 573, 66-75.
30 Kleanthous, C., Cullis, P.M. and Shaw, W.V. (1985) Biochemistry 24, 5307-5313.
31 Burns, D.K. and Crowl, R.M. (1987) in Protein Structure, Folding, and Design 2. UCLA Symposium of Molecular and Cellular Biology (Oxender, D.L., ed.), Vol. 69, pp. 375-384, A.R. Liss, New York.
32 Niu, X.-D. and Reed, L.J. (1990) FASEB J. 4, 1744.
33 Lewendon, A., Murray, I.A., Kleanthous, C., Cullis, P.M. and Shaw, W.V. (1988) Biochemistry 27, 7385-7390.
34 Lewendon, A., Murray, I.A., Shaw, W.V., Gibbs, M.R. and Leslie, A.G.W. (1990) Biochemistry 29, 2075-2080.
35 Abillon, E., Bremier, L. and Cardinaud, R. (1990) Biochim. Biophys. Acta 1037, 394-400.
36 Hemilä, H., Palva, A., Paulin, L., Arvidson, S. and Palva, I. (1990) J. Bacteriol. 172, 5052-5063.
