Accepted Article

Received Date: 05-Feb-2014
Revised Date: 10-Apr-2014
Accepted Date: 22-May-2014
Article type: Research Letter
Editor: Michael Galperin

DBAASP: Database of Antimicrobial Activity and Structure of Peptides

Giorgi Gogoladze1, Maia Grigolava1, Boris Vishnepolsky1, Mindia Chubinidze1, Patrice Duroux2, Marie-Paule Lefranc2, Malak Pirtskhalava1*.

1 Laboratory of Bioinformatics. Ivane Beritashvili Center of Experimental Biomedicine. 14 Gotua Str., Tbilisi 0160, Georgia.

2 IMGT®, the international ImMunoGeneTics information system®, Université Montpellier 2, Institut de Génétique Humaine, UPR CNRS 1142, 34000 Montpellier, France.

* Corresponding author T: +995 574 16 23 97; E-mail: [email protected]

ABSTRACT

The Database of Antimicrobial Activity and Structure of Peptides (DBAASP) is a manually curated database of peptides whose antimicrobial activity against particular targets has been evaluated experimentally. The database is a repository of complete information on: the chemical structure of peptides; target species; the target object of the cell; peptide antimicrobial/hemolytic/cytotoxic activities; and the experimental conditions under which the activities were measured. The DBAASP search page allows users to search for peptides by structural characteristics, complexity type (monomer, dimer and two-peptide), source, synthesis type (ribosomal, nonribosomal and synthetic) and target species. The database prediction algorithm provides a tool for the rational design of new antimicrobial peptides. DBAASP is accessible at http://www.biomedicine.org.ge/dbaasp/

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/1574-6968.12489 This article is protected by copyright. All rights reserved.

Keywords: antimicrobial peptide, database, antimicrobial activity, hemolytic/cytotoxic activity

INTRODUCTION

Antimicrobial peptides are part of the innate immune defense system against microbial infections. They are found in different kingdoms of living organisms, including humans (Wang, 2008). These peptides are broad-spectrum antibiotic substances that can be used as therapeutic agents for the treatment of diseases caused by bacteria, fungi, viruses, parasites and cancerous cells. Antimicrobial peptides fall into three classes according to their origin. The first class comprises natural ribosomally synthesized peptides, which are present in all organisms (Brogden, 2005); more than one thousand of them have been identified. The second class comprises natural nonribosomally synthesized peptides produced in bacteria (Caboche, 2010). The third class comprises non-natural synthetic peptides. Many antimicrobial peptides have no regular secondary structure in solution and acquire their functional form after interacting with the lipid bilayer. Peptide amphipathicity is required for antimicrobial activity, with hydrophilic and hydrophobic amino acids aligned on separate sides of the structured peptide. According to their primary and secondary structures, antimicrobial peptides can be divided into: i) linear α-helical peptides, ii) peptides with β-sheet, iii) peptides with β-hairpin, and iv) peptides rich in particular amino acids (Reddy, 2004).

Because microbial resistance to conventional antibiotics is increasing, new agents are needed for therapeutic purposes. Artificial antimicrobial peptides have been considered as potential drugs against infectious diseases. Making such drugs requires knowledge of the relationship between peptide structure and antimicrobial activity. Peptide structure-activity relationships should be studied in order to reveal the physicochemical parameters responsible for antimicrobial activity and a high therapeutic index. Such a study needs full information on the chemical structure and activity of peptides. Information about the target object of the cell (lipid bilayer, membrane protein, cytoplasmic protein, DNA, RNA) is also required. Information on the detailed chemical structure of peptides and their antimicrobial/hemolytic/cytotoxic activities is scattered among different databases and is incomplete. Prediction tools in the databases are mainly based on machine learning algorithms. Prediction by machine learning requires both positive and negative samples in the training and testing datasets. No existing database makes it possible to assemble an experimentally validated negative (non-AMP) set for machine learning. The main goal of this work was to create a database that addresses these shortcomings. We implemented the “Database of Antimicrobial Activity and Structure of Peptides” (DBAASP), which contains information on antimicrobial peptides of different origins (ribosomal, nonribosomal and synthetic) and complexity (monomers, dimers and two-peptides). DBAASP is manually curated and is a repository of information on peptides whose antimicrobial activity against particular targets has been evaluated experimentally. The database encompasses:


1. Full information on the chemical structure of peptides, including all posttranslational modifications and N/C-terminal modifications of amino acids.
2. Information about peptide antimicrobial activities and the experimental conditions under which the activities were measured.
3. Information about peptide hemolytic/cytotoxic activities.
4. Information about the target object of the cell.

DBAASP allows the development of machine-learning-based antimicrobial peptide prediction tools for particular target species that operate on experimentally validated positive (AMP) and negative (non-AMP) sets of peptides.

DATABASE CONTENT

Structure of Database

DBAASP is hosted on a Linux server running the JBoss 7 application server (http://www.jboss.org). All entries are stored in a MySQL 5.5 (http://www.mysql.com/) database. The application is written in Java 7 (http://www.oracle.com) using PrimeFaces, an open source component library for JavaServer Faces.

Data Collection

Information about antimicrobial peptides was collected from PubMed (http://www.ncbi.nlm.nih.gov/pubmed) using the keywords: antimicrobial, antibacterial, antifungal, antiviral, antitumor, anticancer and antiparasitic peptides.

Web Interface

The pages that give access to the database sections are:
i) “Home” is the introductory page.
ii) “Search” is a link to the page for general search of the peptides.
iii) “Ranking Search” gives access to the search page that allows users to find peptides by target species and activity measure and returns the result as a ranking list.
iv) “Prediction” provides a link to the prediction tool.
v) “About” gives access to the page describing the goal of the database.
vi) “Help” describes the search capabilities and the abbreviations used.

Information stored in DBAASP

1. Characterization of antimicrobial peptides in DBAASP

The knowledge on AMPs is presented according to the unique ontology for immunogenetics and immunoinformatics, IMGT-ONTOLOGY, using its concepts of Identification, Classification, and Description (Giudicelli & Lefranc, 2012).

Identification: Peptide and Target Summary. Identification holds: i) Name of peptide; ii) Synthesis Type (Ribosomal, Nonribosomal, Synthetic); iii) Complexity (Monomer, Dimer, TwoPeptide); iv) Target Group (Gram+, Gram-, Virus, Parasite, Insect, Cancer, Fungus, Mammalian Cell and Mycoplasma) and v) Target Object of Cell (Membrane Protein, Cytoplasmic Protein, Lipid Bilayer and DNA/RNA).

Classification: Peptide Source. Classification involves: i) Kingdom of the source species (we use Cavalier-Smith's six-kingdom system (Cavalier-Smith, 2004)) and ii) Source species.

Description: Information about Sequence. Description includes: i) Sequence (L-amino acids and amino acids without stereoisomers are denoted by uppercase letters; D stereoisomers are represented by lowercase letters); ii) Length of peptide and iii) Unusual and posttranslationally modified amino acids (unusual or posttranslationally modified amino acids are denoted by X or x).

2. Antimicrobial Activity against Target Species

Antimicrobial activity against target species comprises information on the target species, activity measure, activity value and unit. A brief description of the experimental conditions is also provided.

3. Hemolytic and Cytotoxic Activity

For medical purposes, an antimicrobial peptide should have high antimicrobial activity and low or no hemolytic/cytotoxic activity. Erythrocytes from different species are used most frequently for activity tests. The “Hemolytic and Cytotoxic Activity” information comprises the target cell, activity measure for lysis, peptide concentration and unit.

4. Organization of information in DBAASP Peptide Card

Full information on each peptide is presented in the peptide card. An example of the peptide card is given in Supplemental File S1. The “Peptide and Target Summary” information appears at the top, followed by the “Peptide Source” information with a link to NCBI. The “Information about Sequence” is presented in several tables: i) the table “Monomer” describes the sequence, termini modifications and length of the peptide; ii) the table “Intrachain Bond” holds the positions of the amino acids involved in an intrachain bond and the bond type; iii) the table “Unusual or Modified Amino Acid” gives the position and type of each unusual amino acid and the amino acid before modification. Peptide activity information is presented in two tables: “Activity Against Target Species” and “Hemolytic and Cytotoxic Activity”. Additional information about the experiment or target is given in the “Note”. References to the articles are listed at the bottom of the card.

5. Database Statistics

Statistical data for the database as of the 10th of April, 2014 are presented in the following tables: i) Number of peptides by complexity (Table 1). ii) Number of monomers by synthesis type, bond type and amino acid modification (Table 2). Additional information on the monomer length distribution is given in Fig. 1.
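The sequence notation described under “Information about Sequence” (uppercase for L-amino acids and amino acids without stereoisomers, lowercase for D stereoisomers, X or x for unusual or modified residues) can be illustrated with a minimal Python sketch. The `Residue` class and `parse_sequence` function are hypothetical names introduced here for illustration and are not part of DBAASP; reading a lowercase `x` as a D-configured unusual residue is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Residue:
    position: int   # 1-based position in the sequence
    code: str       # one-letter code, normalised to uppercase
    d_isomer: bool  # True when the source letter was lowercase
    unusual: bool   # True for X/x (unusual or modified residue)


def parse_sequence(sequence: str) -> List[Residue]:
    """Split a DBAASP-style sequence string into annotated residues."""
    residues = []
    for position, symbol in enumerate(sequence, start=1):
        if not (symbol.isascii() and symbol.isalpha()):
            raise ValueError(f"unexpected symbol {symbol!r} at position {position}")
        residues.append(
            Residue(
                position=position,
                code=symbol.upper(),
                d_isomer=symbol.islower(),  # assumption: lowercase x is also D
                unusual=symbol.upper() == "X",
            )
        )
    return residues
```

For a made-up string such as `"GkX"`, the second residue is reported as a D-lysine and the third as an unusual residue; the position stored for each X or x can then be matched against the “Unusual or Modified Amino Acid” table of the peptide card.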

UTILITIES

Search

Peptides can be searched using the “Search” page. The page is divided into four sections for easier query construction (Supplemental File S2). The result table lists the peptides that match the search criteria of the “Search” page. For each peptide, the ID, Name, N Terminus, Sequence and C Terminus are displayed (Supplemental File S3). “View” is a link to the individual “DBAASP peptide card” (see “Organization of information in DBAASP Peptide Card” in the previous section).

Ranking Search

The “Ranking Search” page allows users to find peptides by target species and activity measure in combination with other search options. The result table is a list of peptides ranked by activity value for the given target species and activity measure (Supplemental File S4). A ranking search for healthy cells and lysis measures is planned.
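The idea behind the ranking search (filter by target species and activity measure, then order by activity value) can be sketched in a few lines of Python. This is only an illustration, not the DBAASP implementation: the `Activity` record, the `ranking_search` function and the unit string are hypothetical, and ascending order is an assumption that suits measures such as MIC, where lower values mean higher activity.

```python
from typing import List, NamedTuple


class Activity(NamedTuple):
    peptide_id: int
    target_species: str
    measure: str
    value: float
    unit: str


def ranking_search(records: List[Activity], target_species: str,
                   measure: str, unit: str = "µg/ml") -> List[Activity]:
    """Return matching activity records, most active (lowest value) first."""
    hits = [
        r for r in records
        if r.target_species == target_species
        and r.measure == measure
        and r.unit == unit  # compare only values expressed in the same unit
    ]
    return sorted(hits, key=lambda r: r.value)
```

Restricting the comparison to a single unit avoids ranking, for example, µM values against µg/ml values; a real system would either convert units or, as here, keep them apart.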


Prediction Tool

Most of the available AMP prediction methods apply a common approach to different classes of AMPs. In contrast, we propose that a prediction strategy should take into account the fact that there are several kinds of AMPs. In the current version of DBAASP we focus on the prediction of one particular class of AMPs, which is also the largest: the linear cationic antimicrobial peptides (LCAP). The method is based on an analysis of the physicochemical characteristics responsible for the ability of a peptide to interact with an anionic membrane. The characteristics studied were hydrophobicity, amphipathicity, location of the peptide in relation to the membrane, charge, and propensity to disordered structure. A detailed description of the algorithm can be found in (Vishnepolsky & Pirtskhalava, 2013).
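To make three of the characteristics listed above more concrete, the following Python sketch computes a net charge, a mean hydrophobicity and a helical hydrophobic moment for a peptide written with the 20 standard one-letter codes. It is not the algorithm of Vishnepolsky & Pirtskhalava (2013): the Kyte-Doolittle scale, the 100° per-residue angle (an ideal α-helix) and the simple charge count are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; the source does not name one)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Rough net charge: +1 per K/R, -1 per D/E (His and termini ignored)."""
    seq = seq.upper()
    return sum(seq.count(a) for a in "KR") - sum(seq.count(a) for a in "DE")


def mean_hydrophobicity(seq: str) -> float:
    """Average hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[a] for a in seq) / len(seq)


def hydrophobic_moment(seq: str, angle_deg: float = 100.0) -> float:
    """Eisenberg-style hydrophobic moment per residue for a periodic structure."""
    seq = seq.upper()
    sin_sum = sum(KYTE_DOOLITTLE[a] * math.sin(math.radians(angle_deg * i))
                  for i, a in enumerate(seq))
    cos_sum = sum(KYTE_DOOLITTLE[a] * math.cos(math.radians(angle_deg * i))
                  for i, a in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)
```

A cationic, amphipathic candidate would be expected to combine a positive `net_charge` with a comparatively large `hydrophobic_moment`; residues denoted X or x are outside the scale and raise a `KeyError` in this sketch.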

COMPARISON WITH OTHER DATABASES

The available AMP databases can be divided into specialized and general ones. Among the specialized databases are: PhytAMP, dedicated to plant AMPs (Hammami, 2009); Bactibase, a database of bacteriocin peptides (Hammami, 2007); PenBase, which provides information about penaeidins (Gueguen, 2006); RAPD, a database of recombinantly produced peptides (Li, 2008); Defensins knowledgebase, which contains information about defensins (Seebah, 2007); Peptaibols, a database dedicated to peptaibols (Whitmore, 2004); Norine, a database of nonribosomal peptides (Caboche, 2008); DADP, a database of anuran peptides (Novkovic, 2012); and DAMPD, which contains ribosomal peptides (Sundararajan, 2012).

The antimicrobial peptide databases of general type are characterized by a high number of entries and a broad range of peptide origins. For instance, YADAMP contains 2525 sequences (Piotto, 2012); APD has 2338 sequences (Wang, 2009); CAMP holds 6756 sequences (Waghu, 2014); and LAMP contains 5547 sequences (http://biotechlab.fudan.edu.cn/database/lamp/index.php). DBAASP contains more than 4000 experimentally validated ribosomal, nonribosomal and synthetic peptides. Thus, DBAASP can be included in the group of AMP databases of general type.

The databases of general type have been compared, and the results of the comparison are presented in Tables 3 and 4. The comparison shows that detailed chemical structure is presented in DBAASP only. Apart from DBAASP, information on the target object of the cell is presented only in APD. In DBAASP all entries are provided with: i. the detailed chemical structure of the peptide, ii. antimicrobial activity against target species and iii. hemolytic/cytotoxic activity, whereas this information is either incomplete or absent in the other databases.

There are also differences in the search systems. Apart from DBAASP, search by sequence “Length” is possible in the YADAMP search system. Apart from DBAASP, no search system allows peptides to be found by “N Terminus” modification or “Synthesis Type”. Only APD and DBAASP make it possible to find peptides by “C terminus” modification, “Unusual Amino Acid”, “Intrachain Bond”, “Complexity” and “Target Object of Cell”. DBAASP makes it possible to perform a ranking search by particular target species and measure of activity. The result of the ranking search is a list of the selected peptides ranked by activity value. DBAASP and CAMP have a tool for sequence-based prediction of antimicrobial activity. The accuracy of DBAASP prediction on the testing set (91%) is slightly better than that of CAMP (90%) (unpublished data).

CONCLUSION

Improving the antimicrobial properties of existing antimicrobial peptides, or designing new active ones, requires data about peptide chemical structure and antimicrobial activities. DBAASP provides users with detailed information concerning the peptide sequence; N- and C-terminal modifications; source; bonds; and posttranslational modifications of amino acids. All records contain information about the antimicrobial activity of the peptide. The database is updated regularly, and the system is provided with prediction algorithms for antimicrobial peptides. (The development of the prediction algorithm is in progress.) The prediction algorithms and knowledge of detailed peptide structure and activity are necessary to make the rational design of new therapeutic peptides cheaper.

AVAILABILITY

DBAASP is accessible at http://www.biomedicine.org.ge/dbaasp/

ACKNOWLEDGEMENT

The project is supported by the Shota Rustaveli National Science Foundation (Georgia) and the Centre National de la Recherche Scientifique (France), Grant No 09/10. The authors also acknowledge support from ISTC-BTEP Grant No G-2102 and the collaborators on this grant, members of the Office of Cyber Infrastructure and Computational Biology (OCICB), National Institute of Allergy and Infectious Diseases (NIAID): Mike Tartakovsky, Alex Rosenthal, Andrei Gabrielian and Darrell Hurt.

REFERENCES

Brogden KA (2005) Antimicrobial peptides: pore formers or metabolic inhibitors in bacteria? Nat Rev Microbiol 3: 238-250.

Caboche S, Pupin M, Leclere V, Fontaine A, Jacques P & Kucherov G (2008) NORINE: a database of nonribosomal peptides. Nucleic Acids Res 36: (Database issue) D326-D331.

Caboche S, Leclere V, Pupin M, Kucherov G & Jacques P (2010) Diversity of monomers in nonribosomal peptides: towards the prediction of origin and biological activity. J Bacteriol 192: 5143-5150.

Cavalier-Smith T (2004) Only six kingdoms of life. Proc Biol Sci 271: 1251-1262.

Giudicelli V & Lefranc MP (2012) IMGT-ONTOLOGY. Front Genet 3: article79.

Gueguen Y, Garnier J, Robert L, Lefranc MP, Mougenot I, de Lorgeril J, Janech M, Gross PS, Warr GW, Cuthbertson B, Barracco MA, Bulet P, Aumelas A, Yang Y, Bo D, Xiang J, Tassanakajon A, Piquemal D & Bachère E (2006) PenBase, the shrimp antimicrobial peptide penaeidin database: sequence-based classification and recommended nomenclature. Dev Comp Immunol 30: 283–288.

Hammami R, Zouhir A, Ben Hamida J & Fliss I (2007) BACTIBASE: a new web-accessible database for bacteriocin characterization. BMC Microbiol 17: 89.

Hammami R, Ben Hamida J, Vergoten G & Fliss I (2009) PhytAMP: a database dedicated to antimicrobial plant peptides. Nucleic Acids Res 37: (Database issue) D963-D968.

Junkes C, Wessolowski A, Farnaud S, Evans RW, Good L, Bienert M & Dathe M (2008) The interaction of arginine- and tryptophan-rich cyclic hexapeptides with Escherichia coli membranes. J Pept Sci 14: 535-543.

Li Y & Chen Z (2008) RAPD: a database of recombinantly-produced antimicrobial peptides. FEMS Microbiol Lett 289: 126–129.

McAuliffe RO, Ross P & Hill C (2001) Lantibiotics: structure, biosynthesis and mode of action. FEMS Microbiol Rev 25: 285–308.

Novkovic M, Simunic J, Bojovic V, Tossi A & Juretic D (2012) DADP: the database of anuran defense peptides. Bioinformatics 28: 1406-1407.

Piotto SP, Sessa L, Concilio S & Iannelli P (2012) YADAMP: yet another database of antimicrobial peptides. Int J Antimicrob Agents 39: 346-351.

Reddy KV, Yedery RD & Aranha C (2004) Antimicrobial peptides: premises and promises. Int J Antimicrob Agents 24: 536-47.

Salomon RA & Farias RN (1992) Microcin 25, a novel antimicrobial peptide produced by Escherichia coli. J Bacteriol 174: 7428-7435.

Seebah S, Suresh A, Zhuo S, Choong YH, Chua H, Chuon D, Beuerman R & Verma C (2007) Defensins knowledgebase: a manually curated database and information source focused on the defensins family of antimicrobial peptides. Nucleic Acids Res 35: (Database issue) D265–D268.

Sundararajan VS, Gabere MN, Pretorius A, Adam S, Christoffels A, Lehväslaiho M, Archer JAC & Bajic VB (2012) DAMPD: a manually curated antimicrobial peptide database. Nucleic Acids Res 40: (Database issue) D1108–D1112.

Waghu FH, Gopi L, Barai RS, Ramteke P, Nizami B & Idicula-Thomas S (2014) CAMP: Collection of sequences and structures of antimicrobial peptides. Nucleic Acids Res 42: (Database issue) D1154-D1158.

Vishnepolsky B & Pirtskhalava M (2013) Prediction of linear cationic antimicrobial peptides based on characteristics responsible for their interaction with the membranes. arXiv:1307.4656 [q-bio.BM]

Wang G (2008) Structures of human host defense cathelicidin LL-37 and its smallest antimicrobial peptide KR-12 in lipid micelles. J Biol Chem 283: 32637-32643.

Wang G, Li X & Wang Z (2009) APD2: the updated antimicrobial peptide database and its application in peptide design. Nucleic Acids Res 37: (Database issue) D933-D937.

Whitmore L & Wallace BA (2004) The peptaibol database: a database for sequences and structures of naturally occurring peptaibols. Nucleic Acids Res 32: (Database issue) D593–D594.

Figure legend

Fig. 1. Distribution of the number of monomers according to the length of monomers

Table 1. Number of peptides by complexity.

Peptide Complexity Type      Number of Peptides
Monomers                     3917
Dimers                       27
Two-peptides                 110
Total Number of Peptides     4054

Table 2. Number of monomers by synthesis type, bond type and amino acid modification.

Monomer Type                                        Number of Monomers
Ribosomal monomers                                  1382
Nonribosomal monomers                               3
Synthetic monomers                                  2532
Monomers with disulfide bond (DSB)                  746
Monomers with N-C termini peptide bond (NCB)        194
Monomers with thioether bond (TIE)                  5
Monomers with sidechain-mainchain bond (SMB)        2
Monomers without intrachain bond                    3021
Monomers with modified N terminus                   253
Monomers with modified C terminus                   1722
Monomers with modified N and C termini              226
Monomers without N and C termini modification       2168
Peptides with Modified side chain                   271
Monomers with D amino acid                          246
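The counts in Tables 1 and 2 can be cross-checked with simple arithmetic, sketched below in Python using only the numbers from the tables. The termini check uses inclusion-exclusion and assumes that the “modified N terminus” and “modified C terminus” counts each include the 226 monomers modified at both termini; the source does not state this, but under that reading the total matches the number of monomers exactly. The variable names are illustrative.

```python
# Counts copied from Table 1
monomers, dimers, two_peptides, total_peptides = 3917, 27, 110, 4054

# Counts copied from Table 2
ribosomal, nonribosomal, synthetic = 1382, 3, 2532
mod_n, mod_c, mod_both, mod_none = 253, 1722, 226, 2168
bond_counts = {"DSB": 746, "NCB": 194, "TIE": 5, "SMB": 2, "none": 3021}

# Table 1: the complexity classes add up to the total number of peptides
complexity_sum = monomers + dimers + two_peptides        # 4054

# Table 2: the synthesis types partition the monomers
synthesis_sum = ribosomal + nonribosomal + synthetic     # 3917

# Inclusion-exclusion over termini modifications (assumption: the N and C
# counts both include the monomers modified at both termini)
termini_sum = mod_n + mod_c - mod_both + mod_none        # 3917

# The bond-type rows sum to more than the number of monomers
bond_surplus = sum(bond_counts.values()) - monomers      # 51

print(complexity_sum == total_peptides, synthesis_sum == monomers,
      termini_sum == monomers, bond_surplus)
```

The surplus of 51 in the bond-type rows would be consistent with some monomers carrying more than one type of intrachain bond, although this is an inference from the numbers and is not stated in the source.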


Table 3. Comparison of information available in databases of general type

Databases (in column order): DBAASP, YADAMP, APD, CAMP, LAMP

Available Information
Number of Entries: >4000; 2525; 2353; 6756; 5547
Origin of Peptides: N, S; N, S; N, S; N, S, P; N, S, P
Detailed Chemical Structure: + NC -
Target Object of Cell: + + -
Activity Against Target Species: + NC NC NC NC
Hemolytic / Cytotoxic Activity: + NC NC NC

N – Natural peptide, S – Synthetic peptide, P – Predicted peptide, NC – Not complete

Table 4. Comparison of general type databases’ search system

Databases: DBAASP, YADAMP, APD, CAMP, LAMP

Search System Options: Sequence Length; N Terminus; C Terminus; Unusual Amino Acid; Intrachain Bond; Complexity; Synthesis Type; Target Group; Target Object of Cell; Prediction Tool

+ + + + + + + + + + +
- - - - - - - - - - -
+ -   + -   + -   + -   -   + +   + -   + -
