Research © 2016 by The American Society for Biochemistry and Molecular Biology, Inc. This paper is available on line at http://www.mcponline.org

Cell-free Determination of Binary Complexes That Comprise Extended Protein-Protein Interaction Networks of Yersinia pestis*

Sarah L. Keasey‡§, Mohan Natesan‡, Christine Pugh‡, Teddy Kamata‡, Stefan Wuchty¶**, and Robert G. Ulrich‡‖

Binary protein interactions form the basic building blocks of molecular networks and dynamic assemblies that control all cellular functions of bacteria. Although these protein interactions are a potential source of targets for the development of new antibiotics, few high-confidence data sets are available for the large proteomes of most pathogenic bacteria. We used a library of recombinant proteins from the plague bacterium Yersinia pestis to probe planar microarrays of immobilized proteins that represented ~85% (3552 proteins) of the bacterial proteome, resulting in >77,000 experimentally determined binary interactions. Moderate (KD ~μM) to high-affinity (KD ~nM) interactions were characterized for >1600 binary complexes by surface plasmon resonance imaging of microarrayed proteins. Core binary interactions that were in common with other gram-negative bacteria were identified from the results of both microarray methods. Clustering of proteins within the interaction network by function revealed statistically enriched complexes and pathways involved in replication, biosynthesis, virulence, metabolism, and other diverse biological processes. The interaction pathways included many proteins with no previously known function. Further, a large assembly of proteins linked to transcription and translation was contained within highly interconnected subregions of the network. The two-tiered microarray approach used here is an innovative method for detecting binary interactions, and the resulting data will serve as a critical resource for the analysis of protein interaction networks that function within an important human pathogen.

Molecular & Cellular Proteomics 15: 10.1074/mcp.M116.059337, 3220–3232, 2016.

From the ‡Molecular and Translational Sciences Division, U.S. Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702; §Biological Sciences Department, University of Maryland Baltimore County, Baltimore, Maryland 21250; ¶National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20892

Received February 24, 2016, and in revised form, July 11, 2016

Published, MCP Papers in Press, August 3, 2016, DOI 10.1074/mcp.M116.059337

Author Contributions: R.G.U. designed the study. S.L.K. and C.P. performed fluorescent microarray assays. M.N. performed SPRi experiments. S.L.K., M.N., C.P., and T.K. analyzed microarray data. S.W. and S.L.K. performed systems biology analyses. S.L.K. and R.G.U. wrote the paper with input from all authors.

Plague is an infectious disease of catastrophic historical epidemics that continues to be a contemporary concern to public health, because of the global distribution of the etiological agent, Yersinia pestis. Reoccurring outbreaks of plague are frequent in Madagascar (1), and episodic infections are reported annually in the United States (2). Spread by rodent fleas, bubonic plague can be successfully treated with antibiotics if diagnosed early, whereas pneumonic plague can progress rapidly to death within 24 h of infection. Although most isolates are considered sensitive to standard treatments, an antibiotic-resistant strain of Y. pestis was reported (3). The highly virulent plague bacillus evolved from the closely related species Yersinia pseudotuberculosis that usually causes a mild gastroenteritis in humans or a tuberculosis-like lung infection in animals (4). The chromosome of Y. pestis strain CO92 encodes ~3885 open reading frames (ORFs)¹, whereas an additional 181 are expressed by three plasmids. Although the theoretical proteome of Y. pestis can be predicted by analysis of genomic sequences, less than 40% of the ~4500 ORFs have been confirmed by direct detection of protein products (5). Further, the composition of the bacterial proteome varies in response to the fluctuating environmental signals that occur during the process of infection. Although the proteome describes the sum of all proteins within the organism, interaction networks are useful for representing the various intertwined roles of individual proteins that contribute to metabolism, signaling, structure, movement and other cellular pathways. The basic building blocks of proteomic interaction networks are binary protein-protein interactions (PPi). Proteome-scale PPi studies were described for E. coli (6–11), and a few gram-negative pathogens (12–15), although only limited data for Y. pestis and other highly virulent bacteria currently exist.
The loss of any interaction node that is a critical component of virulence, housekeeping and other cellular functions will result in greatly attenuated or avirulent bacteria (16). Consequently, PPi networks are potential targets for the development of new antimicrobials. Yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (AP-MS) are the most widely used experimental methods for high-throughput detection of PPi (6, 7, 17). Results from AP-MS studies favor the identification of stable multiprotein complexes with slow off-rates, whereas reporter gene activation in the Y2H assay also detects transient interactions that may have fast kinetics. Yet, results from Y2H and AP-MS studies cannot differentiate binary interactions from protein aggregates, and provide no information concerning kinetics of complex formation or stability. Further, there is often little agreement between independent studies (6, 7, 17), and false discovery or false negative rates are difficult to determine. As an alternative approach, we used a recombinant library of proteins representing 3552 ORFs that are encoded by the chromosome and plasmids of Y. pestis as a cell-free experimental platform to examine binary PPi and model interaction networks. The recombinant library included proteins that have no known function, and products from predicted ORFs that have no experimental proof of expression by Y. pestis. Based on the sequenced genomes of KIM and CO92 strains of Y. pestis, proteins were produced by in vitro translation, purified, and printed on microarrays. Solution-phase proteins were used to serially detect binding complexes with the protein microarrays and to measure interaction kinetics. We demonstrate that the experimental data can be used to build models of interaction networks for queries of specific functions.

¹ The abbreviations used are: ORF, open reading frame; AP, Affinity purification; Ci, Clustering coefficient; KD, Equilibrium dissociation constant; KEGG, Kyoto Encyclopedia of Genes and Genomes; KIM, Kurdistan Iran man; LPS, Lipopolysaccharide; PPi, Protein-protein interaction; RFA, Random forest algorithm; RFU, Relative fluorescence units; SD, Standard deviation; SPRi, Surface plasmon resonance imaging; Y2H, Yeast two-hybrid.

Molecular & Cellular Proteomics 15.10

Yersinia pestis Protein Interaction Networks

MATERIALS AND METHODS

Y. pestis Proteome Microarrays—The Y. pestis protein microarray was constructed as described previously (18). Briefly, 3968 Y. pestis ORFs cloned into pENTR221 were sequenced, fully characterized, and recombined into the expression vector pEXP7-DEST GST using Gateway cloning methods (Life Technologies, Carlsbad, CA). The Y. pestis proteins were expressed in vitro using the Expressway™ (Life Technologies) cell-free E. coli extract system and isolated by GST-based affinity purification. Proteins that passed quality control criteria of quantity, solubility, and stability (3552 proteins out of 4146 total ORFs), representing ~85% total proteome coverage, were printed in microarrays on glass slides coated with a thin film of nitrocellulose (Gentel Biosciences, Madison, WI). The microarrays passing additional quality control measures for printing, as previously described (18), were cryo-preserved (−20 °C) until use. Larger quantities of recombinant Y. pestis proteins were produced in E. coli BL21-AI cells (Life Technologies) for probes to determine PPi. The probe proteins were affinity purified on GSTrap HP columns by FPLC (AKTA, GE Healthcare Life Sciences, Pittsburgh, PA), and stored (−20 °C) in a final glycerol concentration of 25%. Insoluble proteins were recovered using Inclusion Body Solubilization Reagent (Thermo Scientific, Waltham, MA) followed by protein refolding using dialysis with a gradient of decreasing urea concentration, carried out at 4 °C. Protein purity and concentration were analyzed with an Agilent Bioanalyzer 2100 Protein 230 kit (Agilent Technologies, Santa Clara, CA). The PPi probe proteins were biotinylated with a maleimide-activated, sulfhydryl-reactive biotinylation reagent containing an extended spacer arm (EZ Link Biotin BMCC; Thermo Scientific). The
conditions used favored biotinylation of all four reduced Cys groups presented by the GST tag, facilitating optimal orientation of the analyte for measuring protein interactions. Biotinylation was quantified by comparison to a standard curve, using a dot blot method employing streptavidin-horseradish peroxidase conjugate and substrate. The biotinylated proteins were stored (−20 °C) in a final glycerol concentration of 25% by volume. The biotinylated GST Y. pestis recombinant proteins were conjugated to either Streptavidin Alexa Fluor 532 or 647 (Life Technologies) in a 1:1 molar ratio, using 2.5 μM of protein per 2.5 μM fluorescent dye. A Tecan HS Pro400 (Tecan, San Jose, CA) hybridization station was used to automate the interaction studies with the protein microarrays, and all procedures were performed at 22 °C. Dilutions and wash steps used PBS (pH 7.4), 5 mM MgCl2, 0.05% Triton X-100, 5% glycerol, 1% BSA, and 0.5 mM DTT. Briefly, the microarrays were blocked for 1 h in PBS (pH 7.4), 0.1% Tween-20, and 1% BSA. The interaction studies were performed in duplex by combining one probe protein that was conjugated to Alexa Fluor 532 and another probe protein conjugated to Alexa Fluor 635, and immediately transferring for equilibration (1 h, 22 °C) on the surface of the Y. pestis protein microarray. The microarrays were washed and dried before further processing.

Proteome Microarray Data Analysis—The microarrays were scanned with a confocal laser scanner (GenePix 4000B; Molecular Devices, Sunnyvale, CA) and analyzed using GenePix Pro 6.0 software. Raw pixel counts were generated by scanning simultaneously at 635 nm and 532 nm, using the highest photomultiplier tube gain setting that did not produce saturated signals, and a power setting of 100%. The scanner settings minimized background signals and were optimal for detecting fluorescence from specific PPi events.
Acquired data were analyzed separately for 635 nm and 532 nm wavelengths with the ProtoArray Prospector v5.1 program (Life Technologies) and quantile normalized. Outliers among the replicate protein features on individual arrays were identified using a modified Z-score (median absolute deviation >3.5) and removed from further analysis. The averaged and condensed normalized data were then log10-transformed. Based on the Gaussian distribution in histogram plots obtained for data from each wavelength scan, the significant binding partners were determined by standard deviations from the mean signal (~280 relative fluorescence units (RFU)) for each microarray, using Origin Pro (OriginLab, Boston, MA) v.8.5 software. Interactions were ranked according to the following: high intensity interaction, ≥3 S.D.; medium intensity interaction, 2–3 S.D.; low intensity interaction, 1–2 S.D.; insignificant signal, <1 S.D.

Interaction Kinetics—Binding interaction kinetics of binary complexes were determined by surface plasmon resonance imaging (SPRi) of protein microarrays, using a Flexchip instrument (GE Healthcare Life Sciences). The chip surface was cleaned by UV-Ozone treatment (UVO-Cleaner, Jelight, Irvine, CA) for 5 min. A self-assembled monolayer was formed by incubating a gold-coated slide surface with 11-mercaptoundecanoic acid (11-MUA) in ethanol for 2 h. The 11-MUA grafted slides were washed with ethanol, and activated with 200 nM 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC) and 50 nM N-hydroxy succinimide (NHS). NeutrAvidin (100 μg/ml) was coupled to the surface by incubation for 20 min, and surfaces were blocked with ethanolamine (1 M). The Y. pestis proteins were biotin labeled as described above, adjusted to concentrations of 150, 125, 100, 75, 50, 25 μg/ml in 1× PBS containing 20% glycerol, and printed in duplicate on the NeutrAvidin immobilized gold surface by using a BioOdyssey™ Calligrapher™ printer (Bio-Rad, Hercules, CA).
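As an illustration of the microarray data analysis described above (outlier removal by modified Z-score, followed by ranking of log10 signals by S.D. from the array mean), a minimal Python sketch is given here. It is not the ProtoArray Prospector or Origin Pro implementation. The 0.6745 scaling constant is the conventional form of the modified Z-score and is an assumption, as is the choice to count only signals above the mean; the function names are hypothetical.

```python
import statistics


def modified_z_scores(values):
    """Modified Z-score of each value: 0.6745 * (x - median) / MAD.

    The 0.6745 constant is the conventional scaling (an assumption here).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations are zero at the median; treat nothing as an outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def drop_outliers(values, cutoff=3.5):
    """Keep replicate features whose |modified Z-score| is <= cutoff."""
    scores = modified_z_scores(values)
    return [v for v, s in zip(values, scores) if abs(s) <= cutoff]


def intensity_class(log_signal, array_mean, array_sd):
    """Rank a log10 signal by its distance (in S.D.) above the array mean."""
    z = (log_signal - array_mean) / array_sd
    if z >= 3:
        return "high"
    if z >= 2:
        return "medium"
    if z >= 1:
        return "low"
    return "insignificant"
```

For example, `drop_outliers([100, 102, 98, 101, 500])` discards the value 500, and `intensity_class(3.0, 2.0, 0.5)` falls in the medium class because the signal lies exactly 2 S.D. above the mean.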
The 24 × 24 microarray was printed in contact mode using Stealth™ pin SMP-5B (Arrayit, Sunnyvale, CA) with 70% humidity, resulting in 160 μm spots. A gasket cover was placed on the slide over the microarray to create a microfluidic chamber, and SPRi data were collected. Surface areas of the printed proteins were assigned for analysis,
whereas reference spots were used for correction of bulk refractive index changes and instrumental drift. The microarray surface was blocked in Biacore Flexchip blocking buffer (GE Healthcare Life Sciences). Nonbiotinylated proteins (1 μM) were flowed across the surface in running buffer (0.01 M HEPES buffered saline, pH 7.4, 0.005% Tween-20), injected at a flow rate of 500 μl/min for 15 min, followed by a 15 min dissociation phase in running buffer. All binding events were performed at 25 °C. The microarray surfaces were regenerated after each experiment with 10 mM Gly-HCl pH 2.2 buffer for reuse up to seven times. To determine ka and kd values, a global fit analysis was carried out for a 1:1 interaction model with mass transport corrections, using Flexchip evaluation software Version 2.1 (GE Healthcare Life Sciences) for data analysis.

Interactomes—The sequence composition of protein binding partners was determined by mapping protein interactions as vectors of amino acid triplets. A set of 3,485 experimentally determined PPi from E. coli (8) was used as a positive interaction training set for the machine learning algorithm used to predict Y. pestis PPi in common with E. coli. Using gene ontology annotations with strong evidence codes (19), we collected proteins from E. coli that were found in the cytoplasm, outer membrane or periplasm. Out of this set of proteins, we generated 3536 non-interacting pairs of E. coli proteins, in which the paired proteins were located in different compartments, as a negative interaction training set. Grouping amino acids in seven classes (20), the frequencies of all 7 × 7 × 7 = 343 combinations of classes in a protein were calculated. Specifically, the frequency of a combination k over all 343 class combinations in a protein i of a given interaction was defined as

$$f_{ik} = \frac{n_{ik}}{\sum_{l=1}^{343} n_{il}},$$

where $n_{ik}$ was the occurrence of combination k in protein i, scanning over all consecutive amino acid triplets.
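The triplet-frequency encoding defined above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study. The seven class memberships follow the amino acid groups listed in the Fig. 3 schematic; skipping windows that contain a residue outside those groups is an assumption, and all identifiers are hypothetical.

```python
from itertools import product

# Seven amino acid classes, as listed in the Fig. 3 schematic.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}

# All 7 x 7 x 7 = 343 ordered class combinations, in a fixed order.
TRIADS = list(product(range(7), repeat=3))


def triad_frequencies(sequence):
    """Return the 343 frequencies f_ik = n_ik / sum_l n_il for one protein."""
    counts = dict.fromkeys(TRIADS, 0)
    # Scan over all consecutive amino acid triplets.
    for pos in range(len(sequence) - 2):
        window = sequence[pos:pos + 3]
        try:
            key = tuple(AA_TO_CLASS[aa] for aa in window)
        except KeyError:
            continue  # assumption: ignore windows with unclassified residues
        counts[key] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TRIADS]
```

For the toy sequence `"ANYLC"`, the three windows ANY, NYL, and YLC each map to a different class combination, so three entries of the vector equal 1/3 and the remaining 340 are zero.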
An interaction between a protein i and j was represented by a 343-dimensional vector where each vector unit held the frequency difference of a combination k, $\Delta_{ijk} = |f_{ik} - f_{jk}|$. A random forest method was used to evaluate the likelihood of the predicted interactions (21). For each of 1000 decision trees, $M = \sqrt{N}$ variables out of all $N = 343$ triplet combinations and two-thirds of all protein pairs were sampled. Pairs of proteins were classified as interacting if more than half of all trees considered them interacting.

Network Analysis—Topological analysis of the network of Y. pestis PPi and random graphs was performed using the Network Analyzer application for Cytoscape (22). Random graphs that were composed of the same number of observed nodes and edges were generated using the Network Analysis Tool (NeAT) (23). Clusters of highly intraconnected nodes within the network were identified using the Cytoscape application MCODE (Molecular Complex Detection) (24). Node proteins within the network were characterized according to KEGG pathways using the Database for Annotation, Visualization and Integrated Discovery (DAVID) (25), and pathway enrichment was determined using the entire Y. pestis proteome as background. Node proteins were also categorized by family according to the ontology-based classification system utilized in PANTHER (26).

Protein Abundance Estimations—E. coli BL21-AI cells were transformed with pEXP7-DEST harboring the GST-tagged recombinant proteins and cultured (37 °C) to mid-log phase in a Bioscreen C MBR (Growth Curves USA, Piscataway, NJ) with constant shaking. Protein expression was induced for 4 h with 0.2% arabinose, bacteria were harvested, pelleted, and frozen (−80 °C). The bacterial cell pellets were lysed in 50 μl of Bacterial Protein Extraction Reagent (B-PER, Thermo Scientific), and the soluble fraction was retained.
Glutathione-conjugated magnetic beads (Genscript USA, Inc., Piscataway, NJ) were used to capture the plasmid-expressed proteins for quantification. The magnetic beads (25 μl) were washed twice in 50 μl 1× PBS and added to 10 μl of serially diluted lysate supernatant (neat, 1:5, 1:25 in 1× PBS). Samples were incubated for 1 h at room temperature with
constant agitation. A magnetic plate was used to separate the GST-tagged protein bound magnetic beads from unbound supernatant material. The magnetic beads were washed twice in wash buffer (100 μl 1× PBS, 1% BSA) and incubated with rabbit anti-GST antibody (1 μg/ml) diluted in wash buffer. Beads were washed twice, and incubated for 1 h with an Alexa 488 conjugate of goat anti-rabbit IgG (1:1000 in wash buffer). Finally, beads were washed three times, fluorescence was read on a Safire II plate reader (MTX Lab Systems, Inc., Vienna, VA), and protein abundance was estimated in comparison to a standard curve established with a calibrated amount of GST.

Operon Organization and Ortholog Identification—Operon annotations for Y. pestis KIM10+ (UID: 187410) and E. coli K-12 substr. W3110 (UID: 316407) were obtained from organism-specific databases available through Biocyc (http://biocyc.org) (27). A comparative analysis of Y. pestis KIM10+ (UID: 184710), Y. pseudotuberculosis YPIII (UID: 502800), and E. coli K-12 substr. W3110 (UID: 316407) ORFs was performed using the Comparative Analysis Tool in the Biocyc database (27). Orthologs were identified as bidirectional best hits obtained from BLAST (28) outputs run pairwise between all three organisms (>10% identity, >40% similarity, e-value < 0.001). ORFs lacking ortholog hits were termed “unique.”

RESULTS

Binary Protein-Protein Interactions—We utilized a protein microarray (18) that comprised >85% of the predicted Y. pestis proteome to identify protein interaction partners (Fig. 1A). The microarrayed proteins were expressed in vitro as recombinant GST-fusions from DNA sequence-confirmed plasmid clones and printed on glass slides that were coated with a thin layer of nitrocellulose, as previously described (18). Because the proteomes of Y. pestis and E. coli share ~50% of proteins with >60% amino acid identity (28, 29), we examined two reported AP-MS data sets of E. coli PPi (6, 7) to select protein probes that were common to both species. There were 212 E. coli proteins that were involved in 239 interactions in common between the two E. coli AP-MS studies, and 179 of these proteins were also conserved within the predicted Y. pestis proteome (e-value <10⁻⁶ by BLASTp) (28). Genetic clones for each of the 179 Y. pestis orthologs were used to express proteins in E. coli for use as interaction probes, resulting in 118 purified proteins that were stable in solution. The probe proteins were expressed as GST-fusions, and biotinylated for conjugation to streptavidin fluorescent molecules (Fig. 1B). The microarray surface was incubated with a fixed concentration (300 nM) of biotin-labeled analytes, and interactions were detected by laser-scanning fluorescence (Fig. 1B). We used experimental conditions that favored detection of interactions with slow dissociation rates, similar to the stable protein complexes detected by AP-MS or ELISA. The fluorescent signals from each microarray exhibited a Gaussian distribution (Fig. 1C). A total of 60,604 provisional interactions (supplemental Tables S1–S3) were enumerated based on fluorescence intensities ≥1 standard deviation (S.D.) away from the mean signal of each individual microarray (Fig. 1C): ≥3 S.D. from mean signal as high-intensity interactions (n = 1612), 2–3 S.D. as medium (n = 9515), and 1–2 S.D. as low-intensity interactions (n = 49,477). Based on these criteria, the average signals for high, medium, and low-intensity interactions were 6612, 2047, and 939 RFU, respectively. To increase the diversity of recorded PPis, we randomly selected an additional 26 proteins from the high-signal intensity data set to probe the microarray, allowing us to detect 16,762 new binary interactions (77,366 total).

[Figure 1 image. Panel A, workflow: Y. pestis ORF; clone into GST-tagged vector; in vitro transcription/translation; GST affinity purification; QC; nitrocellulose microarray printing. Panel B: analyte probe conjugated to fluorescent tag; bind; detect interactions by fluorescence; control and specific PPi marked. Panel C: histogram of count (frequency) versus relative fluorescence units (log10), with regions labeled NS, Low (>1 SD, n = 66,733), Medium (>2 SD, n = 11,653), and High (>3 SD, n = 1,980).]

FIG. 1. A whole-proteome microarray for the determination of protein-protein interactions. A, Open reading frames (ORFs) from Y. pestis KIM10+ chromosomal and plasmid DNA were cloned into a vector containing a glutathione-S-transferase (GST) tag, sequence verified, and transcribed and translated in vitro. GST-tagged proteins were affinity purified, and quality control (QC) of each protein was evaluated based on protein stain and Western blotting using anti-GST antibody. Approximately 85% of the 4164 potential ORFs from Y. pestis were arrayed onto nitrocellulose-coated glass slides and visualized using a rabbit anti-GST antibody bound to Cy5-labeled anti-rabbit antibody (bottom). B, Specific protein-protein interaction (PPi) binding events were detected as fluorescence signal, originating from analyte proteins conjugated to fluorescent tags (red circles). Example control proteins, as well as a specific PPi, are shown (yellow and green boxes). C, Normalized fluorescent signals from each whole-proteome Y. pestis array exhibited a Gaussian distribution when plotted as a histogram and significant binding partners were determined based on standard deviation (S.D.) away from the mean signal of each individual array: ≥3 S.D. = high-intensity interaction (dark blue shading on Gaussian and pie graphs, n = 1980), between 2 and 3 S.D. = medium-intensity interaction (medium blue shading, n = 11,653), between 1 and 2 S.D. = low-intensity interaction (light blue shading, n = 66,733), <1 S.D. = insignificant signal (NS, unshaded).

Protein pairs were then randomly selected from each of the three binding affinity groups (low-high signal intensity) for examination of interaction kinetics. The GST-tagged Y. pestis proteins were biotinylated and printed onto a NeutrAvidin-coated gold surface (Fig. 2A). The unlabeled protein analytes were flowed across the microarray surface (Fig. 2B), and the resulting interaction kinetics were assessed by SPRi (Fig. 2C). A total of 1637 interactions were identified, and 444 were successfully modeled (supplemental Fig. S1) to obtain KD values ranging from 0.26–186 nM (ka = 2.78 × 10² to 4.95 × 10⁵ M⁻¹s⁻¹; kd = 1.4 × 10⁻⁶ to 3.23 × 10⁻³ s⁻¹) (supplemental Table S4). A 1:1 Langmuir binding model was used to evaluate kinetic constants, and interactions with more complex binding kinetics were not further considered. In the representative SPRi binding curves shown in Fig. 2C, binding interactions between the hypothetical protein y1097, a putative thioredoxin-like protein,
and the hypothetical protein y3300, a putative ligase, display a relatively fast association rate and moderate dissociation rate, resulting in a KD of 13.1 nM (Fig. 2C, top). In contrast, the complex of methyltransferase protein MenG and the transcription termination factor Rho (Fig. 2C, bottom) is formed by an intermediate-level association rate that is countered by a very slow dissociation rate, resulting in a more stable high affinity interaction (KD = 2.66 nM). Approximately 32% of the SPRi interactions were also observed in PPi detected with the higher-content microarray and labeled protein probes (Fig. 1). Furthermore, we detected 46% of the orthologous Y. pestis interactions that were predicted from the previously reported E. coli AP-MS studies (6, 7).

[Figure 2 image. Panel A, workflow: Y. pestis ORF; clone into GST-tagged vector; cell-based expression in E. coli; GST affinity purification or recovery from inclusion bodies; QC; biotinylation; SPRi biosensor printing. Panel B, schematic: analyte, biotinylated ligand, NeutrAvidin, modified gold surface, white light, polarizer, bandpass filter, CCD camera; interactions are detected as mass accumulation of analyte protein at the chip surface. Panel C, binding curves of response units versus time (min): y1097-y3300 (KD 0.0131 μM; ka 8,292 M⁻¹s⁻¹; kd 1.09 × 10⁻⁴ s⁻¹) and MenG-Rho (KD 0.00266 μM; ka 2,110 M⁻¹s⁻¹; kd 5.60 × 10⁻⁶ s⁻¹).]

FIG. 2. Kinetics of protein-protein interactions examined using surface plasmon resonance imaging. A, Interacting proteins (n = 96) representative of high, medium, and low fluorescence signal intensity were selected for kinetic analysis. Sequence-verified GST-tagged Y. pestis open reading frames (ORFs) were expressed recombinantly in E. coli and purified using GST-based affinity chromatography. Quality control (QC) of each protein was evaluated based on protein stain and Western blotting using anti-GST antibody. Proteins were biotinylated and printed (24 × 24 pattern) onto a gold-coated surface plasmon resonance imaging (SPRi) biosensor coupled with NeutrAvidin. An image of the printed biosensor is shown (bottom). B, Protein binding events were detected as resonance response units generated because of mass accumulation of analyte proteins (blue squares) on the NeutrAvidin-coupled chip surface containing captured biotinylated ligand proteins (triangles with yellow dots). Binary interaction events (blue square interacting with red triangle) at the chip surface induced a change in the angle of reflection of polarized light directed at the bottom of the chip, which was directly proportional to the amount of analyte protein bound (dark spots on gray background). C, SPRi detected interactions were fit to a 1:1 Langmuir binding model to obtain association (ka, red box) and dissociation (kd, blue box) rates, which enabled estimation of interaction equilibrium dissociation constants (KD = kd/ka). Representative interactions, with binding curves for four different ligand concentrations (100, 75, 50, and 25 μg/ml from top to bottom), are shown. The analyte concentration was held constant at 1 μM.

Operon Organization of Interacting Y. pestis Proteins—Because bacterial proteins that form functional units are often clustered into operons that are regulated by a single promoter, we examined the possibility that Y. pestis operons were enriched for binary interactions. Approximately 52% (2064 ORFs) of Y. pestis genes are located in 743 operons (supplemental Table S5), varying from 2–11 genes (average ~3 genes) per transcriptional unit, consistent with reported estimates for other bacteria (30). Examining 13,633 interactions between 1175 proteins (high and medium-intensity interactions), we found that only 0.3% of experimentally detected interacting pairs of proteins fell within the same transcriptional unit in Y. pestis (supplemental Table S6). We also examined the gene organization for interacting proteins of E. coli that were previously reported (8). Specifically, we observed that only 1.6% of the E. coli interactions fell within the same operon for ~5100 interactions with both partners mapping to an annotated transcription unit.

Quality Assessment of the Interaction Network—Because the proteomes of Y. pestis and E. coli share ~50% of proteins with >60% amino acid identity (28, 29), we reasoned that the composition of protein interactions of E. coli and Y. pestis is highly similar as well. As a consequence, we applied a machine learning approach to assess the quality of protein interactions in Y. pestis by training a random forest algorithm (21) (Fig. 3) where 3485 previously reported PPi from E. coli (8) were used as a positive interaction training set. Based on corresponding gene ontology (GO) terms with strong evidence codes (19), we compiled a negative interaction training set of 3536 noninteracting E. coli pairs of proteins that were experimentally located in different cellular compartments (Fig.
3, cytoplasm, outer membrane, or periplasm). Each pair of proteins was mapped as a vector of amino acid triplet frequencies (20) by grouping amino acids into one of seven groups based on physicochemical properties, and by considering any three contiguous amino acids as a unit. In this approach, amino acids within the same group likely involve synonymous mutations because of their similar characteristics. In particular, the classification method considers the dipoles and volumes of amino acid side chains, assuming that electrostatic and hydrophobic interactions direct PPi. Each amino acid triad can be differentiated based on the classes of amino acids it contains; therefore, important information concerning the PPi is retained. To determine the classifier's ability to distinguish interacting from non-interacting protein pairs, we trained our random forest with these training sets. Using classification thresholds as previously described (31), a receiver operating characteristic (ROC) curve was constructed based on a “held out” set of E. coli training data (Fig. 4A). When >50% of all trees classified a given interaction as positive, we obtained a true positive rate of 88.5% and a false positive rate of 1.6%, indicating high model performance.
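To make the threshold rule concrete, the following Python sketch computes the true positive and false positive rates obtained when a pair is called interacting once the fraction of positive tree votes exceeds a cutoff. It is a hedged illustration rather than the authors' evaluation code: the vote fractions and labels in the usage note are invented toy values, and the function name is hypothetical.

```python
def vote_rates(vote_fractions, labels, threshold=0.5):
    """Return (true positive rate, false positive rate) at one vote cutoff.

    vote_fractions: fraction of trees voting "interacting" for each pair.
    labels: True for known interacting pairs, False for non-interacting pairs.
    A pair is predicted interacting when its fraction is strictly above
    the threshold (more than half of all trees at the default cutoff).
    """
    tp = fp = fn = tn = 0
    for fraction, is_interacting in zip(vote_fractions, labels):
        predicted = fraction > threshold
        if is_interacting and predicted:
            tp += 1
        elif is_interacting:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative pair")
    return tp / (tp + fn), fp / (fp + tn)
```

With toy held-out data `vote_rates([0.9, 0.7, 0.4, 0.6, 0.2, 0.1], [True, True, True, False, False, False])`, two of three positives and one of three negatives exceed the 0.5 cutoff, giving rates of 2/3 and 1/3. Repeating the call over a range of thresholds yields the points of a ROC curve.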

positive training set

negative training set

~3,500 E. coli interacting pairs

~3,500 E. coli non-interacting pairs periplasm

prey

E

B A C D bait

G

cytoplasm ... fi

amino acid groups

FIG. 3. Machine learning algorithm for filtering experimentally detected Y. pestis protein-protein interactions. A random forest algorithm (RFA) that classifies interacting proteins as a function of amino acid sequences was trained on a set of experimentally determined interacting E. coli proteins. As a negative, noninteracting set we chose pairs of noncolocalized E. coli proteins (top right). In particular, amino acids were binned in seven classes based on physicochemical properties, representing each protein as a vector of amino acid triplet frequencies (20). Representing (non)interacting E. coli protein pairs as a vector of amino acid frequency differences, the RFA (21, 31) was trained to evaluate experimentally observed Y. pestis interactions. A protein pair was considered interacting if more than half of 1000 decision trees classified it as positive. Amino acid abbreviations: Ala, A; Gly, G; Val, V; Ile, I; Leu, L; Phe, F; Tyr, Y; Met, M; Thr, T; Ser, S; His, H; Asn, N; Gln, Q; Trp, W; Arg, R; Lys, K; Asp, D; Glu, E; Cys, C.

[Fig. 3, workflow panel: the seven amino acid groups are (A, G, V), (I, L, F, P), (Y, M, T, S), (H, N, Q, W), (R, K), (D, E), and (C). Each sequence (e.g., ... A N Y L C ...) is converted to triplet frequencies (fi, fj, fk, ...); the random forest algorithm is trained with the training sets and then used to evaluate candidate Y. pestis interactions. An interaction is confirmed if more than half of the trees classify the candidate proteins as interacting.]

We next determined the frequency distributions of the fraction of trees that confirmed the measured SPRi and fluorescent microarray interactions. SPRi results were confirmed for 78.4% of interactions (Fig. 4B). Furthermore, we measured confirmation rates of 68% for high-, 51.0% for medium-, and 46.7% for low-intensity interactions obtained from the fluorescent microarray results. Using the data sets with the highest level of confirmed interactions (>50% for SPRi and fluorescent microarray interactions), we assembled a network of Y. pestis interactions consisting of 314 proteins and a total of 2344 interactions (Fig. 4C). This web of interactions formed a single connected network component with no isolated nodes. Furthermore, the network features a minority of highly connected nodes, whereas the majority of nodes have few interaction partners, as exhibited by a node degree distribution that fits a power law (Fig. 4D) (24, 32). This architecture suggests that the underlying network is robust to perturbations and random loss of nonhub proteins (32). The network's average clustering coefficient Ci, defined as the ratio between the number of edges linking nodes adjacent to a given node i and the total possible number of edges among those nodes (33), is high compared with a random network that consists of the same number of nodes and edges (Ci, Y. pestis = 0.232; Ci, random = 0.043).
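The clustering coefficient defined above can be computed directly from an edge list. The sketch below is a plain-Python illustration of that definition; the helper names and the four-node toy network are hypothetical and are not taken from the Y. pestis data.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from (node, node) pairs; self-loops are ignored."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def clustering_coefficient(adjacency, node):
    """Edges among the neighbors of `node` divided by the possible number."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention: undefined cases count as zero
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adjacency, n) for n in adjacency) / len(adjacency)


# Toy network: a triangle (A, B, C) with one extra node D attached to A.
toy = build_adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")])
mean_c = average_clustering(toy)
```

Comparing this average with the value obtained for a random graph of the same number of nodes and edges gives the kind of contrast reported in the text (0.232 versus 0.043).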

[Fig. 4 panels: A, ROC curve (true positive rate vs. false positive rate); B, frequency vs. fraction of trees for SPRi (78%), FMA high (68%), FMA medium (51%), and FMA low (47%); C, network map (2,344 interactions between 314 proteins; radius: 3; Y. pestis Ci: 0.232; random Ci: 0.043); D, No. nodes (log10) vs. degree (log10) for Y. pestis and random networks, R2 = 0.93.]

FIG. 4. Application of a machine learning algorithm reveals a network of bacterial protein interactions that exhibits topological features of biological networks. A, A receiver operating characteristic (ROC) curve was constructed based on the performance of the trained RFA on a held out test set of E. coli protein pairs. When more than half of all trees classified a given interaction as positive, we obtained a TPR = 88.5% and a FPR = 1.6% (square). In (B), we determined the frequency distributions of the fraction of trees that confirmed measured SPRi and fluorescent microarray (FMA) interactions. The dashed line represents our classification threshold, at which more than half of all trees classified a given interaction as positive. In particular, we observed that SPRi and FMA high-intensity interactions largely were confirmed by our computational approach. C, The Y. pestis network of protein-protein interactions exhibits a high average clustering coefficient (Ci = 0.232) compared with a random network comprised of the same number of nodes and edges (Ci = 0.043), and has a small radius and characteristic path length. D, The node degree distribution of the network (filled circles) approximates a power-law distribution typical of scale-free networks, but not random networks (open squares).

We investigated the substructure of the interaction network by functional annotation clustering of the 314 proteins comprising the set of protein interactions. We identified 11 KEGG pathways (25) that were statistically enriched (Fisher exact p value ≤0.04) in the network by using the whole Y. pestis proteome as background (Fig. 5 and supplemental Table S6). The largest number of binary interaction proteins (n = 26) belonged to the ribosomal complex, confirming the direct physical contacts predicted within this functional module (Fig. 6A). Approximately 50% of the individual proteins comprising the large and small subunits of the ribosome are represented in the Y. pestis functional module (Fig. 6A and supplemental Table S7), where 14 proteins are involved in 26 direct connections. Of these interactions, approximately half are observed in a structure of the E. coli ribosome (PDB 4V48) (34), whereas the remaining interacting protein pairs may represent direct contacts made during the assembly pathway that are important for the stability of an intermediate state of the complex. Smaller modules were found for fatty acid biosynthesis (n = 3), purine (n = 7) and pyrimidine metabolism (n = 7), lipopolysaccharide (LPS) biosynthesis (n = 4), pyruvate metabolism (n = 4), aminoacyl-tRNA biosynthesis (n = 5), RNA degradation (n = 7), RNA polymerase (n = 3), mismatch repair (n = 4), and homologous recombination (n = 6). Intermodule connections were found for a subset of functionally associated groups of nodes (Fig. 6B–6F), and several previously reported interactions were recapitulated in the Y. pestis PPi network (Fig. 6B, 6D, 6E) (35–37). Of the annotated proteins within the network (n = 218), one-quarter could be mapped to a protein class (Fig. 7) based on biological function ontology classification (26). The majority of proteins belong to a single protein class, but there are several multiclass proteins as well.

[Fig. 5 interactome map, pathway labels: pyrimidine metabolism; pyruvate metabolism; purine metabolism; RNA polymerase; homologous recombination; fatty acid biosynthesis; ribosome; RNA degradation; LPS biosynthesis; mismatch repair; aminoacyl-tRNA biosynthesis; orphan.]

FIG. 5. Statistically enriched biological pathways and protein assemblies of the Y. pestis interactome. Eleven KEGG pathways (labeled in the outer edge of the interactome map) were identified as statistically significant (25) in the network of Y. pestis protein interactions compared with the entire Y. pestis proteome as background (Fisher exact test, p value <0.04; squares: node proteins belonging to a pathway, color corresponds to pathway grouping; gray circles: annotated proteins not associated with a statistically significant pathway; royal blue circles: orphan proteins).
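The pathway enrichment test can be illustrated with a short sketch of a one-sided Fisher exact test, written here as a hypergeometric tail probability. The function name and the numbers in the usage line are hypothetical, and the annotation tool cited as (25) may apply a modified version of the test, so this shows the general idea rather than the exact procedure used.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher exact p value, P(X >= k).

    N: proteins in the background proteome
    K: background proteins annotated to the pathway
    n: proteins in the interaction network
    k: network proteins annotated to the pathway
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities of k or more pathway members.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example: 2 of 3 sampled proteins fall in a 4-member pathway
# drawn from a background of 10 proteins.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
```

Because the sums use exact integer arithmetic, the same function remains usable for proteome-scale values of N without overflow.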

[Fig. 6 panel labels: A, Ribosome (S2, S4, S5, S6, S13, L2, L4, L6, L9, L10, L11, L17, L19, L33); B–F, aminoacyl-tRNA biosynthesis, fatty acid synthesis, pyruvate metabolism, RNA degradation, and homologous recombination, with node labels fabG, accA, accB, pheS, pheT, lysS, aceE, hfq, pnp, rho, hsp 70, ruvA, ruvB, dnaQ, and holE.]

FIG. 6. Intermodule connections are prevalent within groups of functionally clustered Y. pestis proteins. A, Fourteen out of 26 ribosomal proteins within the Y. pestis network participate in direct interactions. Approximately half of the small and large ribosomal subunits are represented in the Y. pestis network. B–F, Intermodule connections for various functional groups are shown, several of which have been previously reported (dashed lines).

FIG. 7. Diverse protein families are represented within the network of Y. pestis proteins. A total of 314 proteins comprise the Y. pestis interaction network, of which ~70% are annotated, and 25% were successfully mapped to a protein family, based on biological functional ontology classifications (26). The functionally classified proteins belong to diverse biological categories and highlight the importance of protein-protein interactions in various metabolic and cellular pathways.

[Fig. 7, y-axis: No. proteins.]

[Fig. 7, x-axis (Protein class): ribosomal; orphan; nucleic acid binding; oxidoreductase; hydrolase; transferase; ligase; enzyme modulator; transporter; cation transporter; chaperone; lyase; translation factor. Additional class labels: ATP-binding cassette transporter, acetyltransferase, anion channel, acyltransferase, deaminase, dehydrogenase, calcium-binding protein, helicase, esterase, chaperonin, kinase, glycosyltransferase, endoribonuclease, nucleotidyltransferase, G-protein modulator, protease, transcription factor, isomerase, protein kinase, translation initiation factor, methyltransferase, RNA helicase, nuclease, RNA methyltransferase, reductase, storage protein, transfer/carrier protein, transaminase.]

Orphan and Hypothetical Proteins Within the Y. pestis PPi Network

Approximately 30% of ORFs represented in the network encode orphan proteins with no known biochemical function, or hypothetical proteins predicted solely by gene annotation algorithms that identified similarities with another hypothetical or expressed protein (Figs. 5 and 7). The abundance of orphans in the PPi network (supplemental Table S8), as well as the large number of direct physical contacts, suggests that many of these proteins are essential to the integrity of the network, although direct interactions between unique orphan proteins were not common. We observed that 16% (n = 679) of the total predicted Y. pestis proteome consists of unique orphan ORFs found only in this pathogen, based on comparisons of genomes of Y. pestis KIM10+, Y. pseudotuberculosis YPIII, and E. coli K-12 substr. W3110. We note that many of the ORFs used to synthesize products for inclusion in the microarrays encoded proteins that have never been shown to exist in nature. However, 22 orphan proteins that were previously identified as targets of antibody responses to Y. pestis (supplemental Table S9) (18) were associated with fluorescently detected interactions. We further note that orphan proteins recognized by antibodies from acutely infected rhesus macaques were involved in more connections within the Y. pestis PPi network than all other antibody-target orphan proteins, though the significance of this observation is not clear.

Modular Topology of the Y. pestis PPi Network and Multiassembly Connections

We identified clusters of highly intraconnected nodes by clustering coefficients using the MCODE algorithm (24). Heavily weighted nodes comprised clusters in the network that may represent macromolecular protein assemblies or functional modules (Fig. 8). Although biological functions of orphan proteins are not clear (Fig. 8A), their prevalence within the clusters provides further evidence that these proteins are essential to the network integrity. For example, clusters comprising 264 interactions between 24 proteins (Fig. 8A), and 155 interactions between 28 proteins (Fig. 8B), contain annotated node proteins with functions involved in either transcription or translation. Network connections between transcription and translation clusters as well as downstream biochemical pathways were also evident, suggesting that there are direct interactions between the macromolecular assemblies involved in these essential cellular processes.

[Fig. 8: network diagrams of clusters A and B. Node labels include annotated proteins (e.g., 50S ribosomal proteins L9, L11, and L17; 30S ribosomal protein S6; RNA-binding protein Hfq; transcription antitermination protein NusB; transcriptional regulator BolA; elongation factor Ts; lysyl-tRNA synthetase; threonyl-tRNA synthetase; DNA-directed RNA polymerase subunit α; molecular chaperone DnaK; Holliday junction DNA helicase RuvB) and orphan loci identified by y-numbers (e.g., y0280, y1012, y3300).]

FIG. 8. Highly interconnected clusters represent potential connections between macromolecular assemblies. Two clusters of highly intra-connected node proteins (A and B) were identified within the Y. pestis network (24). Biologically relevant enrichment could not be identified within the clusters (25) because of a large number of poorly annotated orphan proteins (gray nodes). Clusters A and B contain a large number of proteins involved in either transcription or translation, which may represent binary connections within these molecular assemblies.

Relationships Between Cellular Proteins, Interactions, and Network Connections

It has been suggested that expanded surface areas and conformational flexibilities of intrinsically disordered segments of proteins permit interactions with multiple binding partners (38). To explore the potential contribution of intrinsically disordered proteins to Y. pestis PPi, we assessed the secondary structure and network connections of a statistical sampling of proteins (44 randomly selected) within the larger network of 13,633 interactions between 1175 proteins, which includes core interactions and those that may be unique to Y. pestis. We identified intrinsically disordered regions within the proteins (39), ranging from completely structured to 33% disordered. The number of interaction partners within the network (supplemental Fig. S2) from this limited data sampling did not correlate with disorder of proteins. However, these results are only a preliminary snapshot, and further detailed studies will be required to draw a definitive conclusion.

The primary determinants of binary complex formation within the bacterium are interaction affinities, protein concentrations, compartmentalization within the cell, and protein crowding (40). Total protein concentrations are controlled by rates of transcription, translation, or degradation, and are regulated by the cell in response to environmental signals. To assess perturbations of the Y. pestis PPi network, the steady-state levels of select Y. pestis proteins were measured and the potential effects of changes in transcription were simulated. The 171 Y. pestis ORFs examined were individually expressed in E. coli from plasmids under the uniform control of a T7 promoter, allowing us to measure protein concentrations that were independent of most bacterial regulatory controls except intrinsic rates of translation. The plasmid-transformed host cells were cultured at 37 °C to mid-log phase, and concentrations of Y. pestis protein were measured by a quantitative immunoassay. The protein abundance levels produced under these conditions were normally distributed (Fig. 9A, horizontal histogram), and apparent protein mass (range of 32–195 kDa) did not appear to affect protein abundance in cells. The Y. pestis ORFs we examined were involved in 316 binary interactions, and 90% were found to have protein levels that exceeded the experimentally determined KD.

To approximate the altered protein concentrations that occur during infection, we used mRNA transcription data from a previously reported study of Y. pestis KIM5 replicating in macrophages (41). We compared protein abundance levels with transcription levels of Y. pestis mRNA (Fig. 9A, vertical histogram) during cell infection, assuming that the average signal intensity of mRNA from control bacteria cultured without cells correlated with the abundance of transcripts within the population of bacteria. Based on the comparison illustrated in Fig. 9A, overall levels of gene-specific mRNA appeared to be independent of protein levels. A comparison of steady-state protein abundance levels and changes in mRNA transcript levels 1.5 h and 8 h after infection of cells (Fig. 9B) indicated that the majority of proteins examined exhibited <2-fold change in transcription levels regardless of the time point post-infection, in comparison to bacteria cultured without host cells. These slight variations are unlikely to significantly perturb overall protein abundance levels, and therefore may have little effect on potential PPi.

In contrast, several proteins exhibit >2-fold changes in transcription levels. For example, the mRNA encoding the cold shock-like protein CspI (y0224) exhibited a ~7-fold increase between 1.5 and 8 h post-infection (Fig. 9C, red). Similarly, the ribosomal protein L11 methyltransferase (PrmA) shown in Fig. 9C (blue) exhibited a 6.8-fold increase in transcript levels by 8 h but only a 1.2-fold increase at 1.5 h post-infection, suggesting that binary interactions for both CspI and PrmA may increase later in infection. Conversely, a subset of proteins exhibited decreases in mRNA levels between the 1.5 h and 8 h post-infection time points, relative to transcript levels of Y. pestis cultured without host cells. For example, the y0862 locus encodes a transcriptional regulator of sugar metabolism and exhibits an 8-fold increase in transcript levels at 1.5 h after infection, but only a 3-fold increase at 8 h (Fig. 9C, green), suggesting that y0862 protein abundance, and also PPi, may decrease at later time points post-infection. Overall, time-dependent changes after infection were observed for ~10% of proteins, although the impact of these changes in mRNA transcription and protein abundance on PPi will require additional study. We also noted that transcriptional changes were not biased toward lower abundance proteins (Fig. 9B), and no correlations were observed between protein or transcript abundance and interaction affinities (supplemental Fig. S1B and S1C).

[Fig. 9 axes: A, mRNA (log10 intensity) vs. protein per cell (log10 attomole), with marginal frequency histograms; B and C, fold change in RNA vs. protein per cell (log10 attomole); panel C labels: y0862, CspI, PrmA.]

FIG. 9. Comparison of Y. pestis protein abundance and mRNA levels. A, Protein abundance levels of 171 Y. pestis open reading frames, expressed individually in E. coli under the control of a T7 promoter, were normally distributed (horizontal histogram), whereas mRNA transcript levels from Y. pestis cultured in the absence of cells (41) approximated an exponential distribution (vertical histogram). There was no correlation between protein abundance and mRNA transcript levels. B, Comparison of steady-state abundance levels for proteins expressed in Y. pestis alone and transcription levels 1.5 h (open circles) and 8 h (closed circles) after infection of macrophages with Y. pestis (41). Potential time-dependent changes in abundance were evident for ~10% of the proteins examined. C, Representative differentially regulated transcripts are shown. Up-regulated transcripts at 8 h post-infection compared with 1.5 h post-infection: CspI (red) and PrmA (blue), suggesting increased protein abundance at the later time point and increased likelihood of PPi. Down-regulated transcript at 8 h post-infection: y0862 (green).

DISCUSSION

We report the first large-scale analysis of binary protein interactions for the highly pathogenic bacterium Y. pestis. A microarray encompassing 85% (3,552 proteins) of the bacterial proteome facilitated the cell-free detection of >77,000 binary complexes and yielded kinetic interaction data for >1600 of them. Networks formed by Y. pestis binary interactions were enriched for regulatory and metabolic pathways that are essential for cell maintenance and survival. We also noted direct connections between ribosomes and RNA polymerase, functional protein clusters, and other multisubunit assemblies. Approximately 30% of proteins within the Y. pestis network were orphan gene products (8), suggesting prospective functions for many of these previously uncharacterized and diverse proteins. The networks inferred from these newly described binary protein interactions are roadmaps of dynamic cellular processes that will be useful for systems biology studies, and they provide an extensive number of targets for the development of antimicrobials. New molecular targets for therapeutic interventions are needed to counter the growing threats caused by antibiotic resistance among gram-negative pathogens and the potential dissemination of highly pathogenic bacteria by disease outbreaks or through potential acts of bioterrorism.

In contrast to results from cell-based studies, binary interactions that are measured with isolated proteins are more likely to register formation of true complexes (42). Measuring PPi in vitro is the simplest way to detect binary complexes while excluding multimeric complexes, trapping of proteins by macromolecular aggregates, and the protein crowding effects that occur within cells. It also overcomes the limitations introduced by fusion of bacterial proteins to heterologous anchor proteins and by localization of proteins to nonnatural cellular compartments (43). Stable interactions (i.e., interactions with slow dissociation rates) are readily detected by the microarray method used here, whereas interactions with fast dissociation rates are more difficult to capture. Although estimating false discovery rates for our data is complicated by the lack of a gold standard PPi data set for Y. pestis, as well as by the novelty of the experimental methods, there was 32% agreement between the two microarray methods utilized in our experiments, exceeding the ~5% of interactions in common between previously reported AP-MS and Y2H studies (6, 7, 17). We suggest that the majority of the interactions detected in our study are physiologically relevant. Although high-affinity interactions were also reported for proteins that did not coexist within the same Yersinia species (42), these nonphysiological complexes are likely to occur only as rare events.

Formation and stability of the reported binary complexes may be controlled by accessibility or fluctuations in the abundance of either protein component within the bacterium. However, the cellular mechanisms that regulate protein levels specifically for control of binary complexes are not clear. Our analysis suggests that transcription levels play a minor role in regulating protein abundance. Taniguchi et al. also reported no correlation between protein and mRNA copy numbers for any given gene in a study of both mRNA and proteins at the single-bacterium level with single-molecule sensitivity (44). We observed that only 0.3% of Y. pestis binary interactions fell within the same operon, compared with a similarly low frequency of 1.6% for E. coli interactions. We further note a recent report indicating that operon organization facilitates cotranslational association of oligomeric interactions between bacterial luciferase subunits LuxA and LuxB that are encoded by the luxCDABE operon (45). A focused study of operons employing the methods described here may be useful to identify additional proteins that assemble into translation-controlled complexes.

Reconstructing the networks of protein interactions within cells illuminates the inner workings of cellular machinery and allows us to gain an understanding of complex diseases. New techniques are also needed to detect the dynamic changes in PPi resulting from the spatial and temporal changes that occur inside cells. The network of Y. pestis PPis that we present provides a framework for understanding dynamic biological processes, contains subnetworks that may represent macromolecular assemblies or functional modules, and presents opportunities for selection of protein targets for further evaluation in a hypothesis-driven manner. For example, the experimentally determined protein interactome from Y. pestis may be a rich source of targets for development of new antimicrobials. Most antibiotics that are directed toward commonly exploited biochemical targets, including chloramphenicol and beta-lactams (46), are becoming ineffective because of the spread of resistance. There is a constant need for synthesis of new antibiotic variants that can circumvent common microbial resistance pathways, while future efforts must consider alternative targets that may be less prone to drive selection of resistance. Beta-lactam antibiotics (ampicillin, meropenem, imipenem), fluoroquinolones (ciprofloxacin, moxifloxacin), or aminoglycosides (streptomycin, gentamicin) are commonly used to treat Y. pestis infections (47). Aminoglycosides disrupt protein synthesis by binding to the 30S ribosomal subunit, whereas fluoroquinolones inhibit DNA gyrase and topoisomerase IV; both thereby inhibit cell growth (48).

The experimentally determined protein interactome from Y. pestis may be a rich source of novel targets for development of antimicrobials. In particular, we conclude that network hubs and individual bacterial protein interactions provide opportunities for targeted disruption of essential cell functions or virulence mechanisms. In the assembled network, we observed that functionally important proteins, including potential drug targets, exhibit topological importance. For example, the RNA-binding protein Hfq is known to facilitate small RNA-mRNA interactions in gram-negative bacteria, and is a post-transcriptional regulator of the stress response (49). Further, Hfq is required for virulence in Y. pseudotuberculosis (50), and Y. pestis strains are strongly dependent on Hfq for growth at 37 °C (51). Within the PPi network, Hfq is a hub protein that interacts with 14 other proteins, including five orphans, offering several potential targets for antimicrobial intervention. Thus, the results presented here provide opportunities for study of virulence pathways and the development of new antimicrobial strategies.

* This work was supported in part by a grant from the Defense Threat Reduction Agency (RGU) and by appointment of SLK to the Research Participation Program for the U.S. Army Medical Research and Materiel Command, administered through an agreement between the U.S. Department of Energy and the USAMRMC. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army or the National Institutes of Health. DOI 10.1074/mcp.M116.059337.

□S This article contains supplemental material.

‖ To whom correspondence should be addressed: Molecular and Translational Sciences, USAMRIID, 1425 Porter Street, Frederick MD 21702. Tel.: 301-619-4232; Fax: 301-619-8334; E-mail: rulrich@bhsai.org.

** Present address: Department of Computer Science, University of Miami, Coral Gables, FL.

REFERENCES

1. World Health Organization. (2014) Plague-Madagascar. (WHO Global Alert and Response, Disease Outbreak News). Geneva, Switzerland. Retrieved from http://www.who.int/csr/don/21-november-2014-plague/en/
2. Kwit, N., Nelson, C., Kugeler, K., Peterson, J., Plante, L., Yaglom, H., Kramer, V., Schwartz, B., House, J., Colton, L., Feldpausch, A., Drenzek, C., Baumbach, J., DiMenna, M., Fisher, E., Debess, E., Buttke, D., Weinburke, M., Percy, C., Schriefer, M., Gage, K., and Mead, P. (2015) Human plague – United States, 2015. M. M. W. R. 64, 918–919
3. Galimand, M., Guiyoule, A., Gerbaud, G., Rasoamanana, B., Chanteau, S., Carniel, E., and Courvalin, P. (1997) Multidrug resistance in Yersinia pestis mediated by a transferable plasmid. N. Engl. J. Med. 337, 677–680
4. Chain, P. S., Carniel, E., Larimer, F. W., Lamerdin, J., Stoutland, P. O., Regala, W. M., Georgescu, A. M., Vergez, L. M., Land, M. L., Motin, V. L., Brubaker, R. R., Fowler, J., Hinnebusch, J., Marceau, M., Medigue, C., Simonet, M., Chenal-Francisque, V., Souza, B., Dacheux, D., Elliott, J. M., Derbise, A., Hauser, L. J., and Garcia, E. (2004) Insights into the evolution of Yersinia pestis through whole-genome comparison with Yersinia pseudotuberculosis. Proc. Natl. Acad. Sci. U.S.A. 101, 13826–13831
5. Carniel, E., and Hinnebusch, B. J. (2012) Yersinia: Systems Biology and Control. Norfolk, UK: Caister Academic Press.
6. Arifuzzaman, M., Maeda, M., Itoh, A., Nishikata, K., Takita, C., Saito, R., Ara, T., Nakahigashi, K., Huang, H. C., Hirai, A., Tsuzuki, K., Nakamura, S., Altaf-Ul-Amin, M., Oshima, T., Baba, T., Yamamoto, N., Kawamura, T., Ioka-Nakamichi, T., Kitagawa, M., Tomita, M., Kanaya, S., Wada, C., and Mori, H. (2006) Large-scale identification of protein-protein interaction of Escherichia coli K-12. Genome Res. 16, 686–691
7. Butland, G., Peregrín-Alvarez, J. M., Li, J., Yang, W., Yang, X., Canadien, V., Starostine, A., Richards, D., Beattie, B., Krogan, N., Davey, M., Parkinson, J., Greenblatt, J., and Emili, A. (2005) Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 433, 531–537
8. Hu, P., Janga, S. C., Babu, M., Díaz-Mejía, J. J., Butland, G., Yang, W., Pogoutse, O., Guo, X., Phanse, S., Wong, P., Chandran, S., Christopoulos, C., Nazarians-Armavil, A., Nasseri, N. K., Musso, G., Ali, M., Nazemof, N., Eroukova, V., Golshani, A., Paccanaro, A., Greenblatt, J. F., Moreno-Hagelsieb, G., and Emili, A. (2009) Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol. 7, e96
9. Rajagopala, S. V., Sikorski, P., Kumar, A., Mosca, R., Vlasblom, J., Arnold, R., Franca-Koh, J., Pakala, S. B., Phanse, S., Ceol, A., Häuser, R., Siszler, G., Wuchty, S., Emili, A., Babu, M., Aloy, P., Pieper, R., and Uetz, P. (2014) The binary protein-protein interaction landscape of Escherichia coli. Nat. Biotech. 32, 285–290
10. Peregrin-Alvarez, J. M., Xiong, X., Su, C., and Parkinson, J. (2009) The modular organization of protein interactions in Escherichia coli. PLoS Comp. Biol. 5, e1000523
11. Gully, D., and Bouveret, E. (2006) A protein network for phospholipid synthesis uncovered by a variant of the tandem affinity purification method in Escherichia coli. Proteomics 6, 282–293
12. Parrish, J. R., Yu, J., Liu, G., Hines, J. A., Chan, J. E., Mangiola, B. A., Zhang, H., Pacifico, S., Fotouhi, F., DiRita, V. J., Ideker, T., Andrews, P., and Finley, R. L. (2007) A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biol. 8, R130
13. Rain, J. C., Selig, L., De Reuse, H., Battaglia, V., Reverdy, C., Simon, S., Lenzen, G., Petel, F., Wojcik, J., Schächter, V., Chemama, Y., Labigne, A., and Legrain, P. (2001) The protein-protein interaction map of Helicobacter pylori. Nature 409, 211–215
14. Stingl, K., Schauer, K., Ecobichon, C., Labigne, A., Lenormand, P., Rousselle, J. C., Namane, A., and de Reuse, H. (2008) In vivo interactome of Helicobacter pylori urease revealed by tandem affinity purification. Mol. Cell. Proteomics 7, 2429–2441
15. Shimoda, Y., Shinpo, S., Kohara, M., Nakamura, Y., Tabata, S., and Sato, S. (2008) A large scale analysis of protein-protein interactions in the nitrogen-fixing bacterium Mesorhizobium loti. DNA Res. 15, 13–23
16. Desroy, N., Denis, A., Oliveira, C., Atamanyuk, D., Briet, S., Faivre, F., LeFralliec, G., Bonvin, Y., Oxoby, M., Escaich, S., Floquet, S., Drocourt, E., Vongsouthi, V., Durant, L., Moreau, F., Verhey, T. B., Lee, T. W., Junop, M. S., and Gerusz, V. (2013) Novel HldE-K inhibitors leading to attenuated Gram negative bacterial virulence. J. Med. Chem. 56, 1418–1430
17. Ito, T., Chiba, T., Ozawa, R., Yoshida, M., Hattori, M., and Sakaki, Y. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. U.S.A. 98, 4569–4574
18. Keasey, S. L., Schmid, K. E., Lee, M. S., Meegan, J., Tomas, P., Minto, M., Tikhonov, A. P., Schweitzer, B., and Ulrich, R. G. (2009) Extensive antibody cross-reactivity among infectious gram-negative bacteria revealed by proteome microarray analysis. Mol. Cell. Proteomics 8, 924–935
19. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M., and Sherlock, G. (2000) Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29. Accessed September 2012
20. Shen, J., Zhang, J., Luo, X., Zhu, W., Yu, K., Chen, K., Li, Y., and Jiang, H. (2007) Predicting protein-protein interactions based only on sequences information. Proc. Natl. Acad. Sci. U.S.A. 104, 4337–4341
21. Breiman, L. (2001) Random forests. Mach. Learn. 45, 5–32
22. Saito, R., Smoot, M. E., Ono, K., Ruscheinski, J., Wang, P. L., Lotia, S., Pico, A. R., Bader, G. D., and Ideker, T. (2012) A travel guide to Cytoscape plugins. Nat. Methods 9, 1069–1076
23. Brohee, S., Faust, K., Lima-Mendez, G., Sand, O., Janky, R., Vanderstocken, G., Deville, Y., and van Helden, J. (2008) NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways. Nucleic Acids Res. 36, W444–W451
24. Bader, G. D., and Hogue, C. W. (2003) An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4, 2
25. Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2009) Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nat. Protoc.
4, 44 –57 Mi, H., Poudel, S., Muruganujan, A., Casagrande, J. T., and Thomas, P. D. (2016) PANTHER version 10: expanded protein families and functions, and analysis tools. Nucleic Acids Res. 44, D336 –D42 Caspi, R., Altman, T., Billington, R., Dreher, K., Foerster, H., Fulcher, C. A., Holland, T. A., Keseler, I. M., Kothari, A., Kubo, A., Krummenacker, M., Latendresse, M., Mueller, L. A., Ong, Q., Paley, S., Subhraveti, P., Weaver, D. S., Weerasinghe, D., Zhang, P., and Karp, P. D. (2014) The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res. 42, D459 –D471 Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic Local Alignment Search Tool. J. Mol. Biol. 215, 403– 410 Deng, W., Burland, V., Plunkett, G., Boutin, A., Mayhew, G. F., Liss, P., Perna, N. T., Rose, D. J., Mau, B., Zhou, S., Schwartz, D. C., Fetherston, J. D., Lindler, L. E., Brubaker, R. R., Plano, G. V., Straley, S. C., McDonough, K. A., Nilles, M. L., Matson, J. S., Blattner, F. R., and Perry, R. D. (2002) Genome sequence of Yersinia pestis KIM. J. Bacteriol. 184, 4601– 4611 Brouwer, R. W. W., Kuipers, O. P., and van Hijum, S. A. F. T. (2008) The relative value of operon predictions. Brief. Bioinform. 9, 367–375 Wuchty, S. (2011) Computational prediction of host-parasite protein interactions between Plasmodium falciparum and Homo sapiens. PloS One 6, e26960 Barabasi, A. L. (2009) Scale-free networks: a decade and beyond. Science 325, 412– 413 Watts, D. J., and Strogatz, S. H. (1998) Collective dynamics of ‘small-world’ networks. Nature 393, 440 – 442 Gao, H., Sengupta, J., Valle, M., Korostelev, A., Eswar, N., Stagg, S. M., Van Roey, P., Agrawal, R. K., Harvey, S. C., Sali, A., Chapman, M. S., and Frank, J. (2003) Study of the structural dynamics of the E. coli 70S ribosome using real-space refinement. Cell 113, 789 – 801 Beyer, D., Kroll, H. 
P., Endermann, R., Schiffer, G., Siegel, S., Bauser, M., Pohlmann, J., Brands, M., Ziegelbauer, K., Haebich, D., Eymann, C., and Bro¨tz-Oesterhelt, H. (2004) New class of bacterial phenylalanyl-tRNA synthetase inhibitors with high potency and broad-spectrum activity. Antimicrob. Agents Chemother. 48, 525–532 Davies, A. A., and West, S. C. (1998) Formation of RuvABC Holliday junction complexes in vitro. Curr. Biol. 8, 725–727 Freiberg, C., Pohlmann, J., Nell, P. G., Endermann, R., Schuhmacher, J., Newton, B., Otteneder, M., Lampe, T., Ha¨bich, D., and Ziegelbauer, K.

3231

Yersinia pestis Protein Interaction Networks

38.

39.

40.

41.

42.

43.

44.

(2006) Novel bacterial acetyl coenzyme A carboxylase inhibitors with antibiotic efficacy in vivo. Antimicrob. Agents Chemother. 50, 2707–2712 Babu, M. M., van der Lee, R., de Groot, N. S., and Gsponer, J. (2011) Intrinsically disordered proteins: regulation and disease. Curr. Opin. Struct. Biol. 21, 432– 440 Buchan, D. W. A., Minneci, F., Nugent, T. C. O., Bryson, K., and Jones, D. T. (2013) Scalable web services for the PSIPRED Protein Analysis Workbench. Nucleic Acids Res. 41, W340 –W348 Ozbabacan, S. E. A., Engin, H. B., Gursory, A., and Keskim, O. (2011) Transient protein-protein interactions. Protein Eng. Des. Sel. 24, 635– 648 Fukuto, H. S., Svetlanov, A., Palmer, L. E., Karzai, A. W., and Bliska, J. B. (2010) Global gene expression profiling of Yersinia pestis replicating inside macrophages reveals the roles of a putative stress-induced operon in regulating type III secretion and intracellular cell division. Infect. Immun. 78, 3700 –3715 Swietnicki, W., O’Brien, S., Holman, K., Cherry, S., Brueggemann, E., Tropea, J. E., Hines, H. B., Waugh, D. S., and Ulrich, R. G. (2004) Novel protein-protein interactions of the Yersinia pestis type III secretion system elucidated with a matrix analysis by surface plasmon resonance and mass spectrometry. J. Biol. Chem. 279, 38693–38700 Bruckner, A., Polge, C., Lentze, N., Auerback, D., and Schlattner, U. (2009) Yeast Two-Hybrid, a Powerful Tool for Systems Biology. Int. J. Mol. Sci. 10, 2763–2788 Taniguchi, Y., Choi, P. J., Li, G. W., Chen, H., Babu, M., Hearn, J., Emili, A.,

3232

45.

46.

47.

48. 49.

50.

51.

and Xie, X. S. (2010) Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538 Shieh, Y., Minguez, P., Bork, P., Auburger, J. J., and Guilbride, L. (2015) Operon structure and cotranslational subunit association direct protein assembly in bacteria. Science 350, 678 – 680 Colombo, C. J., Mount, C. A., and Popa, C. A. (2008) Critical care medicine at Walter Reed Army Medical Center in support of the global war on terrorism. Crit. Care Med. 36, S388 –394 Louie, A., Vanscoy, B., Liu, W., Kulawy, R., Brown, D., Heine, H. S., and Drusano, G. L. (2011) Comparative efficacies of candidate antibiotics against Yersinia pestis in an in vitro pharmacodynamic model. Antimicrob. Agents Chemother. 55, 2623–2628 Lambert, T. (2012) Antibiotics that affect the ribosome. Rev. Sci. Tech. 31, 57– 64 Yonekura, K., Watanabe, M., Kageyama, Y., Hirata, K., Yamamoto, M., and Maki-Yonekura, S. (2013) Post-transcriptional regulator Hfq binds catalase HPII: Crystal structure of the complex. PLoS ONE 8, e78216 Schiano, C. A., Bellows, L. E., and Lathem, W. W. (2010) The small RNA chaperone Hfq is required for the virulence of Yersinia pseudotuberculosis. Infect. Immun. 78, 2034 –2044 Bai, G., Bolubov, A., Smith, E. A., and McDonough, K. A. (2010) The importance of the small RNA chaperone Hfq for growth of epidemic Yersinia pestis, but not Yersinia pseudotuberculosis, with implications for plague biology. J. Bacteriol. 192, 4239 – 4245
