Vol. 1, No. 3 2004

Drug Discovery Today: Technologies

Editors-in-Chief: Kelvin Lam (Pfizer, Inc., USA); Henk Timmerman (Vrije Universiteit, The Netherlands)

Lead optimization

Pharmacophore definition and 3D searches

T. Langer1,*, G. Wolber2

1 Computer Aided Molecular Design Group, Institute of Pharmacy, University of Innsbruck, Innrain 52, A-6020 Innsbruck, Austria
2 Inte:Ligand GmbH, Clemens Maria Hofbauer-G. 6, A-2344 Maria Enzersdorf, Austria

The most common pharmacophore building concepts, based on either the 3D structure of the target or ligand information, are discussed together with the application of such models as queries for 3D database search. An overview of the key techniques available on the market is given, and differences in the algorithms used and the performance obtained are highlighted. Pharmacophore modelling and 3D database search are shown to be successful tools for enriching screening experiments aimed at the discovery of novel bio-active compounds.

Section Editor: Hugo Kubinyi – University of Heidelberg, Germany

Pharmacophore models are hypotheses on the 3D arrangement of structural properties, such as hydrogen bond donor and acceptor properties, hydrophobic groups and aromatic rings, of compounds that bind to a biological target. In the presence of the 3D structure of this target, or by comparison with inactive analogs, further geometric and/or steric constraints can be defined. The article describes and evaluates strategies and commercial software for pharmacophore definition, starting from the 3D structures of ligand-protein complexes or from ligands alone. Once a pharmacophore model is established, 3D searches in large databases can be performed, leading to a significant enrichment of active analogs.

Introduction

The key goal of computer-aided molecular design methods in modern medicinal chemistry is to reduce the overall cost associated with the discovery and development of a new drug by identifying the most promising candidates on which to focus the experimental efforts. Often, drug discovery projects have already reached a well-advanced stage before detailed structural data on the target become available. Experimental screening for lead structure determination is limited both by the number of compounds that can be submitted to a high-throughput bio-assay and by the low hit rate obtained, which is in the range of 0.1% [1]. Within this context, the pharmacophore approach has proven to be successful, allowing (i) the perception and understanding of key interactions between a target and a ligand and (ii) the enrichment of hit rates obtained in experimental screening of subsets that have been obtained from in silico screening experiments (Fig. 1) [2].

*Corresponding author: (T. Langer) [email protected] URL: http://pharmazie.uibk.ac.at/CAMD
1740-6749/$ © 2004 Elsevier Ltd. All rights reserved.

DOI: 10.1016/j.ddtec.2004.11.015

Key technologies: structure-based pharmacophores

A pharmacophore (pharmacophore model, pharmacophoric pattern) can be considered as the ensemble of steric and electrostatic features of different compounds that are necessary to ensure optimal supramolecular interactions with a specific biological target structure and to trigger or to block its biological response [3]. Feature-based pharmacophores have turned out to be the most effective type of pharmacophore model, and the utility of such models as queries for 3D database search has been reviewed recently [4,5]. The strength of this type of pharmacophore model is the general definition of the pharmacophoric points. The chemical function character allows searching for very diverse structural scaffolds because multiple structural elements can express the same chemical function. Pharmacophore key elements might be a group of atoms, a part of the volume of the molecule, or ‘classical’ pharmacophoric features like H-bond acceptors (HBA) and donors (HBD), charged or ionizable groups, hydrophobic groups (HY) and/or aromatic rings (RA), together with geometrical constraints like distances, angles, and dihedral angles. The set of these features is termed a pharmacophoric ‘model’ or ‘hypothesis’.

Figure 1. Typical pharmacophore-based virtual screening workflow.

There are different possibilities to derive pharmacophore models. The way to determine a 3D pharmacophore depends mainly on the availability of the three-dimensional structure of the binding site of the target. When the 3D structure of the target has been characterized, and when a certain number of ligands (with or without associated binding affinity) are available, pharmacophore models can be generated directly from the structure of the ligand–target complex. The LigandScout program [6], available from Inte:Ligand GmbH (http://www.inteligand.com/), offers one possibility to derive a feature-based pharmacophore model automatically from a ligand–target complex structure. In this program, the first step is the assignment of ligand information on hybridization status and bond characteristics, which is not present in the input data files from the Protein Data Bank [7], by using an extended heuristic approach together with template-based numeric analysis. Feature-based pharmacophores are then generated by determining interactions between ligand and target atoms on the basis of H-bond formation, charge and hydrophobic contact. These models can then be refined according to binding data, or several models can be combined into one common feature pharmacophore. The capability of searching 3D databases will be implemented shortly.

If only 3D information on the binding site is available, without an interacting ligand, another approach to derive a pharmacophore model can be taken: the structure-based focusing (SBF) technique within the Cerius2 software package [8], available from Accelrys Inc (http://www.accelrys.com/), allows the construction of binding-site pharmacophore hypotheses. The procedure is mainly based on (i) calculation of interaction sites using the algorithms defined in the LUDI program [9], (ii) clustering of the vectors for H-bond donating and accepting groups and of the hydrophobic regions, and (iii) transformation of the obtained clusters into a feature-based pharmacophore hypothesis representing the HBA, HBD, and HY functions.

The Unity program [10], available from Tripos Inc (http://www.tripos.com/), also allows the construction of structural pharmacophore queries based on molecules, molecular fragments, or receptor sites. In addition to atoms and bonds, 3D queries can include features such as lines, planes, centroids, extension points, hydrogen bond sites, and hydrophobic sites. Distance, angle, excluded volume, surface volume, and spatial constraints define the geometric relationships between features.

In the Molecular Operating Environment MOE (Chemical Computing Group, http://www.chemcomp.com/) [11], 3D pharmacophore queries can contain locations of features or chemical groups as well as restrictions on shape. Restrictions on shape can be imposed by specifying the included and/or excluded volume areas. In MOE, the position and the shape of the volume are defined by a single sphere or by the union of several spheres. Additionally, a consensus query derived from a set of aligned molecules rather than from a single molecule can be used for the 3D pharmacophore database search, which provides a high degree of control, offering both partial and systematic matching as well as flexible matching rules.
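As an illustration of the shape restrictions described above, the following minimal Python sketch models an included or excluded volume as a union of spheres and tests a rigid conformer against it. It is not MOE code: the names `Sphere`, `in_union` and `passes_volume_constraints`, the rule that every atom must lie inside the included volume, and all coordinates are assumptions made for this example.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Sphere:
    center: tuple  # (x, y, z) in angstroms
    radius: float


def in_union(point, spheres):
    """True if the point lies inside at least one sphere of the union."""
    return any(dist(point, s.center) <= s.radius for s in spheres)


def passes_volume_constraints(atoms, included=(), excluded=()):
    """Check a rigid conformer (list of atom coordinates) against shape restrictions.

    Assumed rule: every atom must lie inside the included volume (if one is
    given), and no atom may lie inside the excluded volume.
    """
    if included and not all(in_union(a, included) for a in atoms):
        return False
    return not any(in_union(a, excluded) for a in atoms)


# Hypothetical two-atom ligand, a pocket built from two spheres, and one "wall" sphere.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pocket = [Sphere((0.0, 0.0, 0.0), 2.0), Sphere((2.0, 0.0, 0.0), 2.0)]
wall = [Sphere((0.0, 3.0, 0.0), 1.0)]
print(passes_volume_constraints(ligand, pocket, wall))
```

Representing the volume as a list of spheres keeps the membership test to one distance comparison per sphere, which is why a union of spheres is a convenient way to approximate an irregular binding-site shape.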

Key technologies: ligand-based pharmacophores

If only ligand information is available, the identification of a pharmacophore involves, in principle, two steps: (i) the analysis of the training set molecules themselves to identify pharmacophoric features, and (ii) the alignment of the assumed bio-active conformations of the molecules to determine the best overlay of corresponding features. Conformational flexibility represents one of the main difficulties in pharmacophore generation, because the bio-active conformations of the molecules are usually not known. Several programs are available for building pharmacophores from ligand information: Catalyst [12], available from Accelrys Inc, is by far the most widely used, because it offers great flexibility during pharmacophore generation together with integrated high-speed 3D database searching capability. Other successful programs are DiscoTech [13] and Gasp [14], both from Tripos Inc. The main differences between the programs lie in the algorithms used for the alignment, in the way in which conformational flexibility is handled, and in how the 3D database search is performed.

In Catalyst, conformational flexibility is handled by computing a series of low-energy conformers for each molecule using a randomized search algorithm together with a poling function, which allows extensive coverage of the conformational space. Two major automatic modes for pharmacophore model generation are implemented: HypoGen, the algorithm for quantitative models, and HipHop, the builder for purely qualitative (that is, common feature) models. In the first step, Catalyst checks the surface accessibility of the molecules for receptor interaction and then defines the position of the different features by comparing the absolute coordinates of all conformations stored for the training set molecules rather than interfeature distances.
Model building starts with an examination of the two most active molecules in the training set, and all possible pharmacophore hypotheses based on the features available in both of these molecules are enumerated. Subsequent steps reduce the number of hypotheses to be considered by omitting those models that cannot explain the actual bioactivity data by geometric fitting of the molecular structures to the chemical features. In quantitative models, each chemical function includes a weight descriptor that is related to its relative importance in conferring the activity. Catalyst constructs multiple hypotheses that can explain and validate the structure/activity data in a chemically reasonable fashion.


The program provides the ability to cluster and merge hypotheses to develop more comprehensive models and can process up to 255 conformations per compound.

In Disco [15], which is the basis for the commercial product DiscoTech [13], each molecule is characterized by ligand points and site points. The ligand points include atoms with hydrogen bond donor, hydrogen bond acceptor, or hydrophobic character, or with negative or positive charge. Site points represent the hypothetical position of complementary atoms in the binding site and are determined from the position of heavy atoms in the ligand structure. Conformational flexibility in this case is handled by precomputing a series of low-energy conformers for each molecule, with each conformer being treated as a rigid body during the alignment step. A conformer is represented by the interpoint distances calculated for the ligand and site points, and a clique detection algorithm is used to align structures based on these distances. In Disco, the molecule with the fewest conformations, following the active analogue approach paradigm [16], is used as the reference molecule. The output from a Disco run is a ranked list of all possible pharmacophore mappings in which each feature of a pharmacophore must be present in all the molecules. This requirement might result in good pharmacophores being missed; hence, Disco has the option of finding solutions where some molecules are excluded from the model.

Gasp [14] is based on a genetic algorithm (GA) and differs from both Catalyst and Disco in its handling of the conformational problem: each molecule is input as a single conformation, and conformational analysis together with random rotations and a random translation is applied on the fly before any superimposition is made.
The pharmacophoric features (hydrogen bond donor protons, acceptor lone pairs, and ring centers, including projected site points but no charges) are determined in all compounds, and the molecule with the smallest number of features is chosen as the base molecule to which the other molecules are fitted. Within the GA, the chromosomes encode the angles of rotation of the rotatable bonds in all of the molecules and the mapping of the pharmacophoric features in the base molecule to corresponding features in each of the other molecules. The fitness function first generates conformations for each molecule and then uses a least-squares procedure to overlay each molecule onto the base molecule using the mappings. Fitness is calculated as a combination of the similarity and the number of the overlaid features, together with the volume integral of the overlay. Genetic operators attempt to generate solutions that maximise the fitness function and thus correspond to the best possible structural overlay. The biggest strength of Gasp over Disco and Catalyst is that it considers steric overlap of the ligands during pharmacophore model generation, whereas the latter two only attempt to match pharmacophore features without taking shape into account.
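The chromosome layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Gasp implementation: the dictionary layout, the function names `pick_base`, `random_chromosome` and `mutate`, the uniform re-draw used as mutation, and the 50/50 choice between gene types are all hypothetical. Feature-type compatibility and the fitness function (conformation generation, least-squares overlay, volume integral) are deliberately left out.

```python
import random


def pick_base(feature_counts):
    """Index of the molecule with the fewest features (the base molecule)."""
    return min(range(len(feature_counts)), key=lambda m: feature_counts[m])


def random_chromosome(bond_counts, feature_counts, rng):
    """One torsion angle per rotatable bond in every molecule plus, for each
    non-base molecule, a mapping from every base feature to one of its features."""
    base = pick_base(feature_counts)
    torsions = [[rng.uniform(-180.0, 180.0) for _ in range(n)] for n in bond_counts]
    mappings = {
        m: [rng.randrange(feature_counts[m]) for _ in range(feature_counts[base])]
        for m in range(len(feature_counts)) if m != base
    }
    return {"base": base, "torsions": torsions, "mappings": mappings}


def mutate(chrom, feature_counts, rng):
    """Return a copy with one gene re-drawn: either a torsion or a mapping entry."""
    child = {
        "base": chrom["base"],
        "torsions": [list(t) for t in chrom["torsions"]],
        "mappings": {m: list(v) for m, v in chrom["mappings"].items()},
    }
    rotatable = [(m, b) for m, t in enumerate(child["torsions"]) for b in range(len(t))]
    if rotatable and rng.random() < 0.5:
        m, b = rng.choice(rotatable)
        child["torsions"][m][b] = rng.uniform(-180.0, 180.0)
    else:
        m = rng.choice(sorted(child["mappings"]))
        f = rng.randrange(len(child["mappings"][m]))
        child["mappings"][m][f] = rng.randrange(feature_counts[m])
    return child


# Hypothetical training set: three molecules with 3, 5 and 2 rotatable bonds
# and 4, 6 and 3 pharmacophoric features.
rng = random.Random(0)
bond_counts = [3, 5, 2]
feature_counts = [4, 6, 3]
parent = random_chromosome(bond_counts, feature_counts, rng)
child = mutate(parent, feature_counts, rng)
```

Because conformation and feature mapping live in the same chromosome, a single mutation or crossover can change either the shape of a molecule or the way it is overlaid, which is the property that lets the search treat flexibility and alignment together.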


In a recent paper, results obtained with Catalyst/HipHop, Gasp, and Disco have been compared and discussed in detail [17], indicating that Catalyst and Gasp clearly outperform Disco at reproducing the five target pharmacophores described in this study. Catalyst and Gasp were found to provide almost equivalent performance, even though the results were not consistent for all the data sets. A very notable result is that, for both programs, the target pharmacophores were found within the first 10 solutions in four out of five data sets. Gasp was found to be inherently simpler than Catalyst, although the latter provides much more flexibility in setting and tuning parameters. The biggest advantage of Catalyst over Gasp is that the pharmacophoric features can be customized according to the requirements of the training set under investigation.
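The clique-detection idea that Disco uses to align rigid conformers by their interpoint distances can be illustrated with a small Python sketch. It is a simplified illustration rather than the Disco algorithm itself: ligand points and site points are treated alike as typed 3D points, the 0.5 Å tolerance is an arbitrary assumption, and the clique search is a plain Bron-Kerbosch recursion without pivoting. The function names and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def correspondence_graph(a, b, tol=0.5):
    """Nodes pair same-type points of conformers a and b; an edge joins two
    pairs when the corresponding interpoint distances agree within tol."""
    nodes = [(i, j) for i, (ta, _) in enumerate(a)
             for j, (tb, _) in enumerate(b) if ta == tb]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i != k and j != l and abs(dist(a[i][1], a[k][1]) - dist(b[j][1], b[l][1])) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Bron-Kerbosch without pivoting; returns one largest clique, sorted."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)


# Hypothetical conformers: B is A translated by (1, 1, 1) plus one stray acceptor.
conf_a = [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.0, 0.0, 0.0)), ("HY", (0.0, 4.0, 0.0))]
conf_b = [("HBD", (1.0, 1.0, 1.0)), ("HBA", (4.0, 1.0, 1.0)),
          ("HY", (1.0, 5.0, 1.0)), ("HBA", (10.0, 10.0, 10.0))]
print(max_clique(correspondence_graph(conf_a, conf_b)))
# [(0, 0), (1, 1), (2, 2)]
```

Each pair in the returned clique maps a point of the first conformer onto a point of the second, so the clique is a candidate pharmacophore mapping. Since only distances are compared, this sketch cannot tell a point arrangement from its mirror image.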

3D database searching

After a pharmacophore model has been generated, there are two ways to identify new molecules that share its features and can thus exhibit a desired biological response. The first is de novo design. This approach seeks to link the parts of the pharmacophore together with fragments to generate molecular structures that are chemically reasonable and novel. The second is 3D database pharmacophore searching, whose main advantage over de novo design is that it identifies molecules that can be obtained from corporate compound libraries or can be synthesized using a well-established protocol. In the ideal case, 3D database search is able to identify compounds exhibiting properties outside those of the set of compounds used for building the pharmacophore, allowing the identification of novel chemical structures and molecular features (termed scaffold hopping or lead hopping, respectively). Technically, there are two possibilities to search 3D molecular databases with pharmacophore models (Fig. 2): first, using a database file format containing a set of pre-computed conformations, thus speeding up the search procedure; second, calculating conformers on the fly and performing the fitting analysis subsequently. The latter approach has the advantage that mass storage capacity is not relevant, which has long been an issue when using multiconformer databases. By contrast, using pre-computed conformations for pharmacophore fitting has been demonstrated to outperform the on-the-fly calculation approach. In Catalyst, both methods are possible: Catalyst databases are normally stored in a multiconformational data format, but additional on-the-fly conformational tuning is permitted while fitting molecules to a pharmacophore model. This allows the searching of large databases, containing up to several million compounds, within a time frame of a few minutes.
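The first strategy, searching a database of pre-computed conformers, can be sketched as follows. This is a minimal illustration and not the search engine of any of the packages discussed: the query is reduced to typed feature points compared by interfeature distances, the 1.0 Å tolerance is arbitrary, the matching is brute force, and the names `conformer_matches` and `search` as well as the two compounds are hypothetical.

```python
from itertools import combinations, product
from math import dist


def conformer_matches(query, conformer, tol=1.0):
    """query and conformer are lists of (feature_type, (x, y, z)).

    True if distinct conformer features of matching type reproduce all
    interfeature distances of the query within tol.
    """
    candidates = [[k for k, (t, _) in enumerate(conformer) if t == qt]
                  for qt, _ in query]
    for assign in product(*candidates):
        if len(set(assign)) < len(assign):
            continue  # one conformer feature used for two query features
        if all(abs(dist(query[i][1], query[j][1])
                   - dist(conformer[assign[i]][1], conformer[assign[j]][1])) <= tol
               for i, j in combinations(range(len(query)), 2)):
            return True
    return False


def search(query, database, tol=1.0):
    """database maps a compound name to its list of pre-computed conformers.
    A compound is a hit if at least one stored conformer matches the query."""
    return [name for name, confs in database.items()
            if any(conformer_matches(query, c, tol) for c in confs)]


query = [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.0, 0.0, 0.0)), ("HY", (0.0, 4.0, 0.0))]
database = {
    "cmpd_A": [
        [("HBD", (0.0, 0.0, 0.0)), ("HBA", (6.0, 0.0, 0.0)), ("HY", (0.0, 4.0, 0.0))],
        [("HBD", (1.0, 1.0, 0.0)), ("HBA", (4.0, 1.0, 0.0)), ("HY", (1.0, 5.0, 0.0))],
    ],
    "cmpd_B": [
        [("HBD", (0.0, 0.0, 0.0)), ("HBA", (3.0, 0.0, 0.0))],
    ],
}
print(search(query, database))
# ['cmpd_A']
```

In this toy database, cmpd_A is retrieved only because its second stored conformer fits the query, which illustrates why the coverage of the stored conformers matters. An on-the-fly strategy would instead generate the conformers inside `search` rather than read them from storage.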


Figure 2. 3D Database search strategies.

Strategy comparison

The key players on the market for pharmacophore-based 3D database search are Accelrys, Tripos, and the Chemical Computing Group, and their software solutions have been discussed in the previous sections. Additional programs are available on the market, including C@rol [18], available from Molecular Networks GmbH (http://www.molnet.de/), Feature Trees [19], available from BioSolveIT GmbH (http://www.biosolveit.de/), and several academic prototypes described in a recent literature review [20]. All commercial packages allow more or less efficient pharmacophore construction and 3D database search. The MOE system (Chemical Computing Group) is a highly integrated yet easily customizable molecular modelling environment, in which the pharmacophore approach is well embedded. The corresponding Tripos product, Sybyl, contains the modules Gasp and Disco for pharmacophore building; for 3D database search, the integrated system Unity is used. Both environments are also well integrated, and the Sybyl Programming Language (SPL) enables users to automate many procedures, including the analysis of pharmacophores, hit lists, etc. The highest performance concerning 3D database search speed and pharmacophore model customization is offered by the Accelrys products Catalyst and Cerius2; however, the cumbersome graphical interface of the former has often been criticized [17]. Also, the integration between the Accelrys products is much lower than that offered by the products of their competitors. The choice of software for a pharmacophore generation and 3D database search job might depend more on the taste of the user than on hard facts about the possibilities offered by the different packages. In certain well-defined areas, the programs of the small software companies will offer better solutions than those of the key players. The success of such spin-off programs will probably depend strongly on their capability of being integrated into an existing workflow within the drug discovery and development process.

Conclusions

The pharmacophore concept has proven to be extremely successful, not only in rationalizing structure-activity relationships, but also through its large impact on the development of appropriate 3D tools for efficient virtual screening. Profiling of combinatorial libraries and compound classification are other frequently used applications of this concept. The use of pharmacophore models prior to the biological screening of compounds is an efficient procedure, because it quickly eliminates molecules that do not possess the required features, thus leading to a dramatic increase in enrichment when compared to a purely random screening experiment. One should not forget, however, that additional molecular characteristics not reflected by pharmacophore models (physico-chemical, ADME and toxicological properties) must be taken into account when deciding which compounds should be developed further.
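The enrichment mentioned above is commonly quantified as an enrichment factor, the hit rate in the pre-selected subset divided by the hit rate of the whole library. The short Python sketch below shows this calculation; the function name and all numbers are hypothetical and chosen only to match the 0.1% random hit rate quoted in the Introduction, not results reported in this article.

```python
def enrichment_factor(actives_selected, n_selected, actives_total, n_total):
    """Hit rate in the selected subset divided by the hit rate of the whole library."""
    if n_selected <= 0 or n_total <= 0 or actives_total <= 0:
        raise ValueError("subset size, library size and number of actives must be positive")
    return (actives_selected / n_selected) / (actives_total / n_total)


# Hypothetical example: a library of 100,000 compounds containing 100 actives
# (0.1% random hit rate); the pharmacophore filter keeps 1,000 compounds,
# 20 of which turn out to be active (2% hit rate).
ef = enrichment_factor(20, 1_000, 100, 100_000)
print(round(ef, 1))
```

An enrichment factor of 1 corresponds to random selection; in the hypothetical case above, the filtered subset is about 20 times richer in actives than the full library.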


References

1 Oprea, T.I. (2002) Current trends in lead discovery: are we looking for the appropriate properties? J. Comput. Aided Mol. Des. 16, 325–334
2 Hoffmann, R.D. et al. (2004) Use of 3D pharmacophore searching. In Computational Medicinal Chemistry and Drug Discovery (Tollenaere, J., De Winter, H., Langenaeker, W., Bultinck, P. eds), pp. 461–482, Dekker Inc
3 Wermuth, C.-G. and Langer, T. (1993) Pharmacophore identification. In 3D-QSAR in Drug Design. Theory, Methods, and Applications (Kubinyi, H., ed.), pp. 117–136, ESCOM Science Publishers
4 Kurogi, Y. and Güner, O.F. (2001) Pharmacophore modeling and three-dimensional database searching for drug design using Catalyst. Curr. Med. Chem. 8, 1035–1055
5 Langer, T. and Krovat, E-M. (2003) Chemical feature-based pharmacophores and virtual library screening for discovery of new leads. Curr. Opin. Drug Discov. Dev. 6, 370–376
6 Wolber, G. and Langer, T. (2004) LigandScout: 3D Pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J. Chem. Inf. Comput. Sci. Webrelease 24 Nov. 2004, doi:10.1021/ci049885e
7 Berman, H. et al. (2000) The protein data bank. Nucleic Acids Res. 28, 235–242
8 Cerius2 available from Accelrys Inc, San Diego, CA, USA
9 Böhm, H-J. (1992) The computer program LUDI: a new method for the de novo design of enzyme inhibitors. J. Comput. Aided Mol. Des. 6, 61–78
10 Unity/Sybyl available from Tripos Inc., St. Louis, MO, USA
11 MOE available from Chemical Computing Group Inc., Quebec, Canada
12 Catalyst available from Accelrys Inc, San Diego, CA, USA
13 DiscoTech available from Tripos Inc., St. Louis, MO, USA
14 Gasp available from Tripos Inc., St. Louis, MO, USA
15 Martin, Y.C. et al. (1993) A fast new approach to pharmacophore mapping and its application to dopaminergic and benzodiazepine agonists. J. Comput. Aided Mol. Des. 7, 83–102
16 Marshall, G.R. et al. (1979) The conformational parameter in drug design: the active analogue approach. In Computer-Assisted Drug Design (Vol. 112) (Olson, E.C., Christoffersen, R.E. eds), pp. 205–226, American Chemical Society
17 Patel, Y. et al. (2002) A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP. J. Comput. Aided Mol. Des. 16, 653–681
18 C@rol available from Molecular Networks GmbH, Erlangen, Germany
19 Feature Trees available from BioSolveIT GmbH, Sankt Augustin, Germany
20 Van Drie, J.H. (2003) Pharmacophore discovery – lessons learned. Curr. Pharm. Des. 9, 1649–1664
