Biomed Environ Sci, 2014; 27(4): 250-258


Original Article Rapid and High-throughput Identification of Recombinant Bacteria with Mass Spectrometry Assay*


XIAO Di 1,2, TAO Xiao Xia 1,2, WANG Peng 3, LIU Guo Dong 4, GONG Ya Nan 1,2, ZHANG Hui Fang 1,2, WANG Hai Bin 5, and ZHANG Jian Zhong 1,2,#

1. State Key Laboratory for Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China; 2. Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Hangzhou 310003, Zhejiang, China; 3. Provincial Key Laboratory for Plague Control and Prevention, Yunnan Provincial Institute for Endemic Diseases Control and Prevention, Dali 671000, Yunnan, China; 4. Beijing Municipal Center for Disease Control and Prevention, Beijing 102206, China; 5. Chaoyang District Center for Disease Control and Prevention, Beijing 100021, China

Abstract

Objective To construct a rapid and high-throughput assay for identifying recombinant bacteria based on mass spectrometry.

Methods Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) was used to identify 12 recombinant proteins (10 of Yersinia pestis, 1 of Campylobacter jejuni, and 1 of Helicobacter pylori). A classification model for the various phases of recombinant bacteria was established, optimized, and validated using the MALDI-TOF MS-ClinProTools system. Differences in the peptide mass spectra were analyzed using Biotyper and FlexAnalysis software.

Results Models of GA, SNN, and QC were established. After parameter optimization, the GA recognition model showed good classification capability: RC=100%, mean CVA=98.7% (96.4% in phase 1, 100% in phase 2, 98.4% in phase 3, and 100% in phase 4), and PPV=95%. This model can be used to classify the bacteria and their recombinants, and requires only 3.7×10^3 cells for analysis. The total time from protein extraction to reporting the result is only 10 min for one sample. Furthermore, the assay can automatically detect and test 96 samples concurrently. A total of 48 specific peaks (9, 16, 9, and 14 for the four phases, respectively) were found in the various phases of recombinant bacteria.

Conclusion MALDI-TOF MS can be used as a fast, accurate, and high-throughput method to identify recombinant bacteria, which provides new ideas not only for recombinant bacteria but also for the identification of mutant strains and bioterrorism pathogens.

Key words: Recombinant bacteria; MALDI-TOF MS; ClinProTools; Rapid identification; Specific peaks

doi: 10.3967/bes2014.048

www.besjournal.com (full text)

CN: 11-2816/Q

ISSN: 0895-3988 Copyright ©2014 by China CDC

*This work was supported by the National Key Program for Infectious Disease of China (Contract No. 2013ZX10004216-002), Key Projects in the National Science & Technology Pillar Program during the 12th Five-year Plan Period (Contract No. 2012BAI06B02), and the Science and Technology Program of Zhejiang Province Public Technology Social Development Project (No. 2010C33035).

#Correspondence should be addressed to ZHANG Jian Zhong, professor, PhD, Tel: 86-10-58900707, E-mail: [email protected]

Biographical note of the first author: XIAO Di, female, born in 1973, master, associate professor, mainly engaged in infectious disease proteomics research.

Received: March 18, 2013; Accepted: May 26, 2013

INTRODUCTION

During the past 3 decades, more than 40 emerging infectious diseases have been reported worldwide, many of them caused by bacterial pathogens. Natural evolution and genetic recombination of microorganisms are the intrinsic factors that result in the emergence of new pathogens. At the same time, with the rapid development and wide use of biotechnology, the construction of new recombinant pathogens by humans has become possible. Therefore, effective rapid detection assays must be established to identify new pathogens and bioterrorism-related recombinant bacteria that might appear in the future. On the other hand, for existing or emerging infectious diseases, the development of cost-effective vaccines has been the focus of research. DNA recombinant vaccines, which are produced by cloning and expressing the effective immunogens of pathogenic microorganisms, are cheap and suitable for large-scale production, so they have become the focus of vaccine research and development[1-2]. Currently, the analysis of recombinant bacteria by SDS-PAGE and western blot is slow (approximately 7-10 days) and low throughput [...]

[...] were detected between m/z 2 000 and 15 000 Da. Good reproducibility was obtained in spectra from two induced expressions and different sample spots (Figure 2A).

Classification Model
Models of GA, SNN, and QC were established using the spectra of Set 1. After optimizing the parameters (the maximal number of peaks in the model, the maximal number of generations, and k-nearest neighbor classification), the GA model was optimized to generate Model 3-50-30 (Table 3), which

Table 2. Sample Group for Model Development

Phase | Group Set 1 | Group Set 2
1 | CE | CE, CagL-PCE-F
2 | PCE*,@, PCE*,&, PCE#,@, PCE#,& | CagL-PCE-V, PCE*,@, PCE*,&, PCE#,@, PCE#,&
3 | Pla*, Caf1M*,@, Caf1M*,&, PMT054*,@, PMT054*,&, YPO3141*, YPO2301* | YPO0388*, YPO1293*, YPO2174*, CagL*
4 | Pla#, Caf1M#,@, Caf1M#,&, PMT054#,@, PMT054#,&, YPO3141#, YPO2301# | YPO0388#, YPO1293#, YPO2174#, CagL#, PorA#

Note. Set 1: samples used to establish the model; Set 2: samples used to validate the model; phase: the phase of cloning and expression; *: before induction; #: after induction; @: plasmid PGEX4t-1; &: plasmid PET32a (+). CE: competent E. coli BL21; PCE: competent E. coli BL21 strain transformed with the empty vectors. CagL-PCE-F: CagL-RE strain that cannot be cultured in media containing antibiotics; CagL-PCE-V: CagL-RE strain that can be cultured in media but is colony PCR-negative.

had RC=100% and CVA=98.7% (96.4% in phase 1, 100% in phase 2, 98.4% in phase 3, and 100% in phase 4). Sixteen samples (160 spectra) at different phases of cloning and expression were used to validate the classification model, and the model validation accuracy rate was 95% (Table 4). The GA model based on RE strains that express Yersinia pestis proteins has the same classification and recognition capabilities for RE strains that express Campylobacter jejuni and Helicobacter pylori proteins.

Detection Limit
For E. coli, 1 OD=1×10^9 cells/mL. The average OD of RE was 12.4 in a volume of 300 μL. The volume analyzed by MS was 1/100 of the total volume. MALDI-TOF MS was able to detect protein samples diluted 10^3-fold. Thus, the minimum detection limit for RE was 3.7×10^3 cells (Figure 2B).
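The cell-count arithmetic behind a detection limit of this kind can be sketched as below. This is an illustrative calculation only, not the authors' worksheet: the function name `cells_on_target` and the way the inputs are chained (total cells from OD and volume, then the fraction analyzed, then the dilution factor) are assumptions. Under this reading, a 10^3-fold dilution corresponds to roughly 3.7×10^4 cells and a 10^4-fold dilution to roughly 3.7×10^3 cells, which spans the range of approximately 4×10^3-4×10^4 CFUs given in the Discussion.

```python
def cells_on_target(od, volume_ml, fraction_loaded, dilution_fold,
                    cells_per_od_ml=1e9):
    """Estimate the number of cells represented in one MS measurement.

    od               : optical density of the culture (1 OD taken as 1e9 cells/mL)
    volume_ml        : culture volume in mL
    fraction_loaded  : fraction of the total volume analyzed by MS
    dilution_fold    : serial dilution factor applied before measurement
    """
    total_cells = od * cells_per_od_ml * volume_ml
    return total_cells * fraction_loaded / dilution_fold


# Inputs taken from the text: OD 12.4, 300 uL, 1/100 of the volume analyzed.
at_1e3 = cells_on_target(12.4, 0.3, 0.01, 1e3)
at_1e4 = cells_on_target(12.4, 0.3, 0.01, 1e4)

print(f"10^3-fold dilution: {at_1e3:.1e} cells")
print(f"10^4-fold dilution: {at_1e4:.1e} cells")
```

Keeping the dilution factor as a parameter makes it easy to see how the estimate shifts by one order of magnitude per tenfold dilution step.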

Figure 2. MALDI-TOF MS analyses of serial dilutions of RE used to determine the detection sensitivity and reproducibility of the RE-Pla spectra. A (A1-A4): the reproducibility of the RE-Pla spectra; B (B1-B4): the detection sensitivity of RE-Pla by MALDI-TOF MS. Only the enlarged MS spectra from 2 000 to 10 000 Da (A) and 2 000 to 8 000 Da (B) are shown.

The Stage-specific Peaks of Cloning and Expression
A total of 48 specific peaks for the four phases of cloning and expression were obtained by analysis with MSP (Table 5). These specific peaks are markers that can be used to differentiate the various phases of expression induction. Induction, transformation with an empty plasmid, use of a different vector plasmid, and insertion of a target DNA fragment can all result in different peptide maps (data not shown).

DISCUSSION

An ideal classification method should be sensitive, specific, simple in sample preparation, easy to perform, and not dependent on specialized background knowledge. In this study, we established a rapid assay to identify recombinant bacteria using MS and conducted a systematic assessment and analysis of the assay.

Table 3. Classification Model at Different Phases of Cloning and Expression

Model Name | Cross Validation (%) | Recognition Capability (%)
3-50-5 | 97.31 | 100.00
3-50-10 | 97.81 | 100.00
3-50-20 | 98.91 | 99.36
3-50-30 | 98.70 | 100.00
5-50-10 | 96.72 | 100.00
5-50-20 | 92.99 | 100.00
5-50-30 | 92.10 | 100.00
7-50-10 | 91.54 | 100.00
7-50-20 | 94.53 | 99.64
7-50-30 | 94.79 | 99.64
3-60-30 | 97.88 | 100.00
3-70-30 | 97.68 | 100.00
5-60-10 | 96.12 | 100.00
5-70-10 | 95.69 | 100.00
7-60-10 | 98.33 | 100.00
7-70-10 | 95.85 | 100.00
SNN | 76.33 | 99.36
QC | 94.78 | 95.86

Note. The algorithms of the SNN and QC models are Supervised Neural Network and Quick Classifier, respectively. The algorithms of the other models are all Genetic Algorithm. Model 3-50-30 was the best model.

Table 4. Results of Classification Based on the Model 3-50-30

The Phase in Theory/The Phase Actually | The Number of Correctly Classified
1/1 | 10
1/1 | 10
2/3 | 9
2/2 | 10
2/2 | 10
2/2 | 10
2/2 | 10
3/3 | 10
3/3 | 10
3/3 | 10
3/2 | 8
4/4 | 10
4/2 | 9
4/4 | 10
4/3 | 9
4/3 | 9

Note. The number of spectra for each sample was 10. In this table, the positive predictive value (PPV) is 95%.
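One way to read Table 3 is as a model-selection problem. The sketch below is a hedged illustration, not the procedure used in ClinProTools or by the authors: it assumes a simple rule (keep only models with 100% recognition capability, then take the highest cross-validation value), and the names `TABLE3` and `pick_model` are hypothetical. With the values listed in Table 3, this rule selects Model 3-50-30.

```python
# (cross validation %, recognition capability %) as listed in Table 3.
TABLE3 = {
    "3-50-5": (97.31, 100.00), "3-50-10": (97.81, 100.00),
    "3-50-20": (98.91, 99.36), "3-50-30": (98.70, 100.00),
    "5-50-10": (96.72, 100.00), "5-50-20": (92.99, 100.00),
    "5-50-30": (92.10, 100.00), "7-50-10": (91.54, 100.00),
    "7-50-20": (94.53, 99.64), "7-50-30": (94.79, 99.64),
    "3-60-30": (97.88, 100.00), "3-70-30": (97.68, 100.00),
    "5-60-10": (96.12, 100.00), "5-70-10": (95.69, 100.00),
    "7-60-10": (98.33, 100.00), "7-70-10": (95.85, 100.00),
    "SNN": (76.33, 99.36), "QC": (94.78, 95.86),
}


def pick_model(table, min_rc=100.0):
    """Assumed rule: require RC >= min_rc, then maximize cross validation."""
    eligible = {name: cv for name, (cv, rc) in table.items() if rc >= min_rc}
    return max(eligible, key=eligible.get)


best = pick_model(TABLE3)
print(best, TABLE3[best])
```

Note that Model 3-50-20 has a slightly higher cross-validation value (98.91%) but a recognition capability below 100%, so it is excluded under the assumed rule.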

Table 5. Specific Peptides at the 4 Phases of Cloning and Expression Identified by MSP

Phase in Set 1 | Specific Peptides (Da)
1 | 4 213, 6 467, 6 596, 6 684, 8 404, 8 815, 9 265, 9 581, 11 779
2 | 3 655, 3 692, 5 062, 5 312, 6 436, 7 122, 7 318, 7 916, 7 951, 8 398, 8 834, 9 038, 9 441, 9 595, 9 688, 10 345
3 | 3 260, 3 739, 5 075, 5 594, 6 391, 6 560, 8 779, 8 821, 9 431
4 | 3 071, 3 106, 3 378, 3 619, 3 916, 4 058, 4 849, 5 069, 5 199, 5 747, 6 488, 6 825, 7 657, 9 194

Note. The specific peptides of the CE, PCE, and RE strains before and after induction provided the basis for establishing the classification model; identification of these peptides will be carried out in future work.
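To show how phase-specific peaks such as those in Table 5 could serve as markers, the sketch below scores an observed peak list against each phase by counting reference peaks matched within a mass tolerance. This is a naive illustration and not the GA model built in ClinProTools: the 5 Da tolerance, the function names, and the example peak list are all hypothetical.

```python
# Phase-specific peaks (Da) as listed in Table 5.
PHASE_PEAKS = {
    1: [4213, 6467, 6596, 6684, 8404, 8815, 9265, 9581, 11779],
    2: [3655, 3692, 5062, 5312, 6436, 7122, 7318, 7916, 7951, 8398,
        8834, 9038, 9441, 9595, 9688, 10345],
    3: [3260, 3739, 5075, 5594, 6391, 6560, 8779, 8821, 9431],
    4: [3071, 3106, 3378, 3619, 3916, 4058, 4849, 5069, 5199, 5747,
        6488, 6825, 7657, 9194],
}


def count_matches(observed, reference, tol_da=5.0):
    """Count reference peaks that have an observed peak within tol_da."""
    return sum(any(abs(o - r) <= tol_da for o in observed) for r in reference)


def score_phases(observed, tol_da=5.0):
    """Fraction of each phase's specific peaks found in the observed list."""
    return {phase: count_matches(observed, peaks, tol_da) / len(peaks)
            for phase, peaks in PHASE_PEAKS.items()}


# Hypothetical observed peak list (Da), made up for illustration.
observed = [3071.8, 3106.2, 3378.0, 5069.5, 9194.0]
scores = score_phases(observed)
best_phase = max(scores, key=scores.get)
print(scores, "-> best matching phase:", best_phase)
```

Several peaks in different phases lie only a few Da apart (for example 5 062, 5 069, and 5 075), so the choice of tolerance matters in any matching scheme of this kind.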

Model validation showed that Model 3-50-30 had good classification capability, with a recognition accuracy rate of 95% in distinguishing the different recombinants during cloning and expression from 160 different spectra (none of the validation set samples was in the modeling sample set). This is a general model that can equally classify and identify RE expressing target proteins from Yersinia pestis, Campylobacter jejuni, and Helicobacter pylori. In general, competent E. coli cannot be cultured in media containing antibiotics. Therefore, only E. coli clones transformed with the empty plasmid or with a plasmid carrying an inserted DNA fragment can be cultured. Recombinants in all four phases can be identified using the mass spectrometry-based classification method established in this study. After a plasmid with an inserted DNA fragment was transferred into competent E. coli, the bacteria were enriched, proteins were extracted using the ethanol/formic acid method, and spectra were obtained by MALDI-TOF MS and then classified and identified by GA model 3-50-30. If the sample was classified as phase 1, the vector plasmid had not been transferred into the competent E. coli. If the sample was classified as phase 2, the plasmid had been transferred into the competent E. coli, but the exogenous target DNA fragment was not present in the plasmid. If the sample was classified as phase 3, it was a positive recombinant and could be used to induce target protein expression, after which it was again classified and identified by model 3-50-30. If the colony after induction was classified as phase 3, induction and expression were unsuccessful; if it was classified as phase 4, cloning and expression were successful. This process bypasses the procedures used for positive clone identification, including resistance screening, colony PCR, bacterial enrichment, and plasmid PCR (which require approximately 3 d).
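The interpretation rules in the preceding paragraph can be written as a small lookup. The sketch below only restates the rules given in the text; the function name `interpret_phase` and the fallback string for combinations the text does not discuss are assumptions, and the phase label itself would come from the classification model, which is not reproduced here.

```python
def interpret_phase(phase: int, after_induction: bool) -> str:
    """Map a predicted phase (1-4) to the interpretation given in the text."""
    if not after_induction:
        before = {
            1: "vector plasmid not transferred into competent E. coli",
            2: "plasmid transferred, but target DNA fragment absent",
            3: "positive recombinant; proceed to induce expression",
        }
        return before.get(phase, "not described in the text")
    after = {
        3: "induction and expression unsuccessful",
        4: "cloning and expression successful",
    }
    return after.get(phase, "not described in the text")


for phase, induced in [(1, False), (2, False), (3, False), (3, True), (4, True)]:
    print(phase, induced, "->", interpret_phase(phase, induced))
```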
For high-throughput induction of expression, the advantages are particularly obvious: 96 samples can be tested concurrently, and positive clones can be verified within only 2 h. After the target fragment is transformed into E. coli BL21, expression can be induced directly on solid media by adding IPTG, which allows simultaneous identification of PCE, RE without target protein expression, and RE with target protein overexpression. The phase-specific peaks indicated that the insertion of exogenous gene fragments into E. coli causes significant changes in the peptide expression map. The peptide maps varied greatly before and after induction of the recombinant E. coli. These changes are the basis of classification using a mathematical model.

Analyzing the errors in classification and identification, we found that the misrecognized spectra had quite different intensities from the modeled data. For example, the peak intensity of the misclassified spectrum for sample 11 of Set 2 (CagL*) was 3 000, whereas the model spectra had peak intensities >12 000. This discrepancy resulted in a recognition error (Figure 3). The results suggest that spectra with different peak intensities may have different numbers of peaks. Consistency in spectrum intensity is a key factor for improving the performance of the model. However, this consistency should be achieved by using the same number of shots and superposing the same number of spectra, rather than by simply adjusting the laser intensity.

One of the factors that results in the emergence of new virulent pathogens is the mutation of pathogens through natural selection. At the same time, with the rapid development and wide use of biotechnology, the construction of recombinant pathogens by humans has become possible. The September 11 terrorist attacks in the United States sounded the alarm[18]. Another way to construct recombinant pathogens is to obtain pathogen mutants with strong toxicity, pathogenicity, and infectivity from natural pathogens by mutation and selection techniques. In addition, recombinant pathogens with new pathogenicity can be constructed by importing natural virulence genes and drug resistance genes into the target microorganism.
However, it is difficult to identify these recombinant pathogens using currently available methods, such as phenotypic characterization, biochemical reactions, immunological methods, and molecular diagnostic techniques (16S rDNA, real-time PCR, MLST, RFLP, RAPD).

Proteins between 2 000 and 15 000 Da that are extracted with ethanol/formic acid are mostly positively charged ribosomal proteins[13]. Due to their relatively high abundances and identical copy numbers in cells, ribosomal proteins give stable MS signals under different culture conditions. Ribosomal proteins have been suggested as candidate marker proteins for bacterial identification[13,19-20]. In this study, the ribosomal protein expression patterns in recombinant E. coli varied greatly in the different phases of induced expression. These differences are not due to changes in the type of exogenous gene but are the same for any protein expressed in any strain, which is the theoretical basis for using ribosomal peptide spectra to establish the identification model for recombinant microorganisms.

Figure 3. Differences between spectra caused by different peak intensities. A and B: the spectra of RE-CagL before induction; the maximum intensity of A is 12 000 and that of B is 3 000. A1 and B1 are enlarged views of the portions circled in A and B.

Sensitivity is an important indicator of a detection assay. Using a series of dilutions, we determined the RE recognition and detection limit. Correct RE identification required a minimum of approximately 4×10^3-4×10^4 colony forming units (CFUs). A normal bacterial colony would meet this requirement for MS. This limit cannot be matched by the conventional process for the positive identification of clones, which includes colony PCR, bacterial enrichment (5×10^9 CFUs), plasmid extraction, and plasmid PCR.

The expression strain used in this study was competent E. coli BL21. Different expression systems require different expression vectors[21-23]. Individual RE constructs may have different peptide expression profiles. However, the method established in this study could be applicable to any strain. The classification and recognition capability may differ for models built using different bacterial strains.

However, there are some limitations in our study. One is that the method depends on mass spectrometry; the other is that the sample must be a pure bacterial culture. Nevertheless, the testing cost of MS is low (1 yuan RMB per sample) and the throughput is high (96 samples/2 h), suggesting that it is a potentially good

method for the identification of recombinant pathogens and mutant strains.

Author contributions: ZHANG Jian Zhong and XIAO Di contributed to study design, data acquisition and analysis, and manuscript drafting. TAO Xiao Xia, WANG Peng, LIU Guo Dong, GONG Ya Nan, ZHANG Hui Fang, and WANG Hai Bin assisted with gene cloning and expression.

REFERENCES

1. Plotkin SA. Vaccines: the fourth century. Clin Vaccine Immunol, 2009; 16, 1709-19.
2. Soler E and Houdebine LM. Preparation of recombinant vaccines. Biotechnol Annu Rev, 2007; 13, 65-94.
3. Shi B, Zeng L, Song H, et al. Cloning and Expression of Aspergillus tamarii FS132 Lipase Gene in Pichia pastoris. Int J Mol Sci, 2010; 11, 2373-82.
4. Chang H, Cheng A, Wang M, et al. Cloning, expression and characterization of gE protein of duck plague virus. Virol J, 2010; 7, 120.
5. Chomnawang MT, Nabnuengsap J, Kittiworakarn J, et al. Expression and immunoprotective property of a 39-kDa PlpB protein of Pasteurella multocida. J Vet Med Sci, 2009; 71, 1479-85.
6. Cao Y, Wang Y, Luo H, et al. Molecular cloning and expression of a novel protease-resistant GH-36 alpha-galactosidase from Rhizopus sp. F78 ACCC 30795. J Microbiol Biotechnol, 2009; 19, 295-300.
7. Pan Y, Xia H, Lü P, et al. Molecular cloning, expression and characterization of Bmserpin-2 gene from Bombyx mori. Acta Biochim Pol, 2009; 56, 671-7.
8. Tanabe M and Iverson TM. Expression, purification and preliminary X-ray analysis of the Neisseria meningitidis outer membrane protein PorB. Acta Crystallogr Sect F Struct Biol Cryst Commun, 2009; 65, 996-1000.
9. Hsieh SY, Tseng CL, Lee YS, et al. Highly efficient classification and identification of human pathogenic bacteria by MALDI-TOF MS. Mol Cell Proteomics, 2008; 7, 448-56.
10. Friedrichs C, Rodloff AC, Chhatwal GS, et al. Rapid identification of viridans streptococci by mass spectrometric discrimination. J Clin Microbiol, 2007; 45, 2392-7.
11. Edwards-Jones V, Claydon MA, Evason DJ, et al. Rapid discrimination between methicillin-sensitive and methicillin-resistant Staphylococcus aureus by intact cell mass spectrometry. J Med Microbiol, 2000; 49, 295-300.
12. Kumar MP, Vairamani M, Raju RP, et al. Rapid discrimination between strains of beta haemolytic streptococci by intact cell mass spectrometry. Indian J Med Res, 2004; 119, 283-8.
13. Mellmann A, Cloud J, Maier T, et al. Evaluation of matrix-assisted laser desorption ionization-time-of-flight mass spectrometry in comparison to 16S rRNA gene sequencing for species identification of nonfermenting bacteria. J Clin Microbiol, 2008; 46, 1946-54.
14. Xiao D, Meng FL, He LH, et al. Analysis of the urinary peptidome associated with Helicobacter pylori infection. World J Gastroenterol, 2011; 17, 618-24.
15. Ketterlinus R, Hsieh SY, Teng SH, et al. Fishing for biomarkers: analyzing mass spectrometry data with the new ClinProTools software. Biotechniques, 2005; Suppl, 37-40.
16. Liu W, Gao X, Cai Q, et al. Identification of novel serum biomarkers for gastric cancer by magnetic bead. Front Biosci, 2010; 2, 961-71.
17. Xiao D, Zhao F, Lv M, et al. Rapid identification of microorganisms isolated from throat swab specimens of community-acquired pneumonia patients by two MALDI-TOF MS systems. Diagn Microbiol Infect Dis, 2012; 73, 301-7.
18. Laxminarayan S and Kun LG. Combating bioterrorism with bioengineering. IEEE Eng Med Biol Mag, 2002; 21, 23-7.
19. Suh MJ, Hamburg DM, Gregory ST, et al. Extending ribosomal protein identifications to unsequenced bacterial strains using matrix-assisted laser desorption/ionization mass spectrometry. Proteomics, 2005; 5, 4818-31.
20. Sun L, Teramoto K, Sato H, et al. Characterization of ribosomal proteins as biomarkers for matrix-assisted laser desorption/ionization mass spectral identification of Lactobacillus plantarum. Rapid Commun Mass Spectrom, 2006; 20, 3789-98.
21. Smith KP, Kumar S, and Varela MF. Identification, cloning, and functional characterization of EmrD-3, a putative multidrug efflux pump of the major facilitator superfamily from Vibrio cholerae O395. Arch Microbiol, 2009; 191, 903-11.
22. Sánchez-Venegas JR, Navarrete A, Dinamarca J, et al. Cloning and constitutive expression of Deschampsia antarctica Cu/Zn superoxide dismutase in Pichia pastoris. BMC Res Notes, 2009; 2, 207.
23. Canales M, Lastra JM, Naranjo V, et al. Expression of recombinant Rhipicephalus (Boophilus) microplus, R. annulatus and R. decoloratus Bm86 orthologs as secreted proteins in Pichia pastoris. BMC Biotechnol, 2008; 8, 14.
