DOI: 10.1002/cmdc.201500161

Viewpoints

A Bright Future for Evolutionary Methods in Drug Design

Tu C. Le[a] and David A. Winkler*[a, b, c]

Most medicinal chemists understand that chemical space is extremely large, essentially infinite. Although high-throughput experimental methods allow exploration of drug-like space more rapidly, they are still insufficient to fully exploit the opportunities that such large chemical space offers. Evolutionary methods can synergistically blend automated synthesis and characterization methods with computational design to identify promising regions of chemical space more efficiently. We describe how evolutionary methods are implemented, and provide examples of published drug development research in which these methods have generated molecules with increased efficacy. We anticipate that evolutionary methods will play an important role in future drug discovery.

[a] Dr. T. C. Le, Prof. D. A. Winkler
Cell Biology Group, Biomedical Materials Program, CSIRO Manufacturing Flagship, Bag 10, Clayton South MDC 3169 (Australia)
E-mail: [email protected]
[b] Prof. D. A. Winkler
Monash Institute of Pharmaceutical Sciences, 381 Royal Parade, Parkville 3052 (Australia)
[c] Prof. D. A. Winkler
Latrobe Institute for Molecular Science, Latrobe University, Bundoora 3046 (Australia)

ChemMedChem 2015, 10, 1296–1300
www.chemmedchem.org

Introduction

Chemical space is vast, estimated at 10¹⁰⁰ molecules, roughly 10⁸⁰ of which may be ‘drug-like’.[1] This estimate presents both a threat and an opportunity. On the negative side, it is impossible to explore more than a minute fraction of such enormous spaces experimentally, even using the most optimistic projections of the capabilities of future high-throughput synthesis and assessment technologies. The number of possible small drug-like compounds exceeds the number of atoms in the universe. If we further assume, arbitrarily, that there are 10⁸ drugs (10 000 times more than currently exist), then the probability of finding a drug by random synthesis of one million compounds is 10⁻⁶⁶ (i.e., 10⁶ × 10⁸/10⁸⁰), essentially zero.[2] However, on the positive side, the nearly infinite number of small molecules that could be synthesized suggests that there are many new and productive areas of chemical space to be exploited. This provides an almost inexhaustible supply of novel drugs if we can find more efficient ways of exploring drug-like chemical space.

Evolutionary algorithms are powerful approaches for solving search and optimization problems of this type, particularly those that involve multiple, conflicting objectives. They are effective in searching intractably large spaces for diverse sets of optimal solutions, and are therefore suitable for finding new drugs in large chemical spaces.[3] They are also well placed to work synergistically with high-throughput experimental synthesis and characterization technologies. These algorithms mimic some of the major characteristics of Darwinian evolution. The algorithm takes an initial population of molecules and modifies them using crossover and mutation operators akin to those involved in biological evolution. The fitness of each molecule in the population is assessed against an objective (fitness) function: some useful property to be optimized, such as biological activity, lack of toxicity, or dissimilarity to known patent space. This allows the molecules that are the most ‘fit’ to be selected and passed on to the next generation. Objective functions can also be more complex, simultaneously optimizing several desirable properties or minimizing undesirable ones.

What Are Evolutionary Algorithms?

Evolutionary algorithms are a family of similar optimization techniques that differ largely in the way they are implemented. The genetic algorithm, originally described by Holland[4] and Rechenberg,[5] is the most widely used evolutionary method. It takes an initial set (population) of molecules with known activities (often randomly assembled, sometimes chosen using prior knowledge) that the algorithm must optimize. The structures (and optionally, other attributes) are encoded numerically to generate a molecular ‘genome’ that can be modified by genetic operations to produce successive generations. Common genetic operations include replication (or elitism), point mutation, and crossover. An example of molecular genome coding and genetic algorithm operations is illustrated in Figure 1.

Replication or elitism selects the fittest molecules from a population and carries them forward unchanged into the next generation. While there is a range of mutation operators, the most common is point mutation. This modifies a single gene in the genome in some way and adds the mutated individual to the population of the next generation. A crossover operation selects two members of a population, randomly chooses a point at which to split their genomes, and exchanges the genome fragment of one molecule for the equivalent genome fragment of the other. Crossover can also involve more complex mixing rules. The properties (fitness) of this new population of molecules are then evaluated experimentally or computationally.
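The three genetic operations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in any of the cited studies. It assumes a genome in which each gene is an index into a library of substituents for one site on a scaffold (in the spirit of Figure 1). The substituent lists in `LIBRARIES`, the placeholder `toy_fitness`, the truncation selection of parents, and the values of `n_elite` and `p_mutation` are all hypothetical choices made for the example.

```python
import random

# Hypothetical substituent libraries: gene i is an index into LIBRARIES[i].
LIBRARIES = [
    ["H", "CH3", "OCH3", "Cl"],   # substituents allowed at site R1
    ["H", "F", "NH2", "OH"],      # substituents allowed at site R2
    ["H", "CH3", "CF3", "CN"],    # substituents allowed at site R3
]


def random_genome(rng):
    """Pick one substituent index per site at random."""
    return [rng.randrange(len(lib)) for lib in LIBRARIES]


def decode(genome):
    """Translate a genome back into substituent labels."""
    return [lib[g] for lib, g in zip(LIBRARIES, genome)]


def point_mutation(genome, rng):
    """Return a copy with exactly one gene changed to a different allowed value."""
    child = list(genome)
    site = rng.randrange(len(child))
    choices = [v for v in range(len(LIBRARIES[site])) if v != child[site]]
    child[site] = rng.choice(choices)
    return child


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: split both genomes at one point and swap the tails."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])


def next_generation(population, fitness, rng, n_elite=2, p_mutation=0.3):
    """Build a new population of the same size from the current one."""
    ranked = sorted(population, key=fitness, reverse=True)
    new_pop = [list(g) for g in ranked[:n_elite]]      # replication (elitism)
    parents = ranked[:max(2, len(ranked) // 2)]        # truncation selection (assumption)
    while len(new_pop) < len(population):
        a, b = rng.sample(parents, 2)
        for child in crossover(a, b, rng):
            if rng.random() < p_mutation:
                child = point_mutation(child, rng)
            if len(new_pop) < len(population):
                new_pop.append(child)
    return new_pop


def toy_fitness(genome):
    """Placeholder score; a real study would use an assay or a computed property."""
    return sum(genome)


if __name__ == "__main__":
    rng = random.Random(0)
    population = [random_genome(rng) for _ in range(8)]
    for _ in range(10):
        population = next_generation(population, toy_fitness, rng)
    best = max(population, key=toy_fitness)
    print(decode(best), toy_fitness(best))
```

Because the elite genomes are copied unchanged, the best fitness in the population cannot decrease from one generation to the next under this scheme.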


© 2015 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim


Figure 1. Fragmentation of a molecule into substituents, encoding possible libraries of substituents into genes, and common mutation operations on the genes. (Adapted from Liang et al.[7] with permission. Copyright © 2012 The Royal Society of Chemistry; dx.doi.org/10.1039/C1OB06186K.)

The cycle repeats until no more improvement occurs, or until a desired level of fitness is achieved. Some studies have augmented experimental assessment of molecule fitness with structure–activity models. Initial experimental data are used to train a model that becomes the objective function used to generate subsequent populations of molecules. This can decrease the requirement for experimental measurement of molecule fitness, but care must be taken that the molecules in the subsequent populations remain in, or close to, the domain of applicability of the model. The docking score of populations of molecules binding to a target site of a protein has also been used as a computational objective function.[6]

Although genetic algorithms clearly cannot provide an exhaustive search of all drug-like space, the genetic operations allow distant areas of chemical space to be explored (crossover) or local areas of space to be optimized (point mutation). Solutions are likely to be locally rather than globally optimal, but nevertheless interesting and useful, as in biological evolution. Genetic algorithms therefore hold substantial promise for discovering and optimizing novel drug leads by exploring vast chemical spaces efficiently. Molecules evolved under this process are likely to be substantially better than those in the original pool.
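The caution about the domain of applicability of a structure–activity model can be made concrete with a small sketch. The code below is an assumption-laden illustration rather than a method from the cited work: it uses a k-nearest-neighbour average as a stand-in for a trained model, and treats a candidate as inside the domain only if it lies within a chosen distance of at least one training molecule in descriptor space. The class name `NearestNeighbourSurrogate`, the Euclidean metric, the `max_distance` threshold, and the penalty value are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class NearestNeighbourSurrogate:
    """k-NN activity model with a distance-based applicability domain (illustrative)."""

    def __init__(self, descriptors, activities, k=3, max_distance=1.0):
        self.data = list(zip(descriptors, activities))
        self.k = k
        self.max_distance = max_distance

    def predict(self, x):
        """Average the measured activity of the k closest training molecules."""
        nearest = sorted(self.data, key=lambda item: euclidean(x, item[0]))[:self.k]
        return sum(activity for _, activity in nearest) / len(nearest)

    def in_domain(self, x):
        """True if x is within max_distance of at least one training molecule."""
        return min(euclidean(x, d) for d, _ in self.data) <= self.max_distance

    def fitness(self, x, penalty=float("-inf")):
        """Model prediction inside the domain; a penalty score outside it."""
        return self.predict(x) if self.in_domain(x) else penalty


if __name__ == "__main__":
    # Made-up two-dimensional descriptors and activities, for illustration only.
    model = NearestNeighbourSurrogate(
        descriptors=[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
        activities=[1.0, 2.0, 3.0],
        k=1,
        max_distance=0.5,
    )
    print(model.fitness((0.1, 0.0)))   # close to a training point: prediction used
    print(model.fitness((5.0, 5.0)))   # far from all training data: penalized
```

Penalizing out-of-domain candidates is only one possible design. An alternative is to route such candidates to experimental measurement and retrain the model on the new data.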

Evolutionary Methods in Drug Design

Genetic algorithms have been widely applied to optimization problems in many of the computational tools used in drug design. Since the early 1990s there have been successful applications of evolutionary methods to docking of small molecules to protein targets, conformational analysis, drug dosing strategies,[8] pharmacophore identification, similarity searching, derivation of QSAR models and feature selection, and combinatorial library design.[9] However, the direct structural mutation of populations of molecules to drive them toward some desirable activity or fitness function, often while avoiding off-target, toxic, or other undesirable effects, is still in its relative infancy despite its potential for the discovery of new drug chemotypes.

The first papers on the application of evolutionary methods to drug discovery were published mainly in computational journals rather than traditional medicinal chemistry journals and books. This may have contributed to the relative paucity of research into direct application of evolutionary methods to drug lead discovery and optimization in the medicinal chemistry community. Parrill was one of the first to describe how evolutionary and genetic methods could be applied to the discovery of new drug leads.[10] Parrill showed that most rational drug design problems involve finding solutions to large combinatorial problems, and that exhaustive searches or synthesis of combinatorial libraries covering these very large chemical spaces are intractable. By applying evolutionary pressure akin to Darwinian natural selection via computational algorithms, Parrill claimed that it should be possible to search large chemical spaces more efficiently and find very useful, although clearly not globally optimal, solutions to such problems.
In the specialized area of single- or double-stranded DNA or RNA oligonucleotides, the SELEX (systematic evolution of ligands by exponential enrichment) approach was an early application of evolutionary methods.[11, 12] Random, very large libraries of oligonucleotides were exposed to a target ligand (protein or small molecule), and those that did not bind (less fit) were removed by affinity chromatography. Subsequent rounds of amplification and more stringent elution conditions yielded very tightly binding sequences. The evolutionary component in SELEX was more of a byproduct: the DNA polymerase used in the PCR amplification was not 100 % accurate and introduced mutations into the oligonucleotides. The technique has been used to evolve aptamers that bind to a variety of target ligands with extremely high affinity.

In 1999, in an article in Nature Biotechnology, Archer identified technical components that would drive the practical adoption of evolutionary methods by the pharmaceutical research community.[13] Archer described the “drug discovery factory”, the focal point of a futuristic, industrialized era in pharmaceutical discovery. This was a stand-alone facility that implemented state-of-the-art automation in the form of robotics, high-throughput instrumentation to monitor processes and products, and an extensive data system. In the intervening years this vision has been realized to a significant extent through the automation of synthesis by combinatorial and flow chemistry and through high-throughput screening. Another very significant milestone toward this objective was reached in 2015 with the development of the first functional, fully automated small-molecule “synthesis machine” reported by Li et al.[14, 15] This mimics the functionality of automated peptide, oligonucleotide, and more recently oligosaccharide synthesizers in the small-molecule chemistry domain. For the first time, it opens the door to fully automated methods for the synthesis of many different types of organic small molecules. Such automated synthesis methods should remove a significant bottleneck in the application of evolutionary methods in drug discovery.

The number of published studies that describe the use of evolutionary methods to discover or optimize drug leads is quite small. A selection of some of the more interesting or important studies is provided here.

Sheridan et al. were among the first to describe the potential of evolutionary methods to accelerate drug discovery.[16, 17] In their first study, they used genetic algorithms to select a set of amines for the construction of a tripeptoid library. They provided examples of the effectiveness of genetic optimization to discover peptoids similar to an important tripeptoid target and to two tetrapeptide cholecystokinin (CCK) antagonists, and to discover angiotensin converting enzyme (ACE) inhibitors. In all cases they showed how the genetic algorithm identified, in a modest amount of computer time, high-scoring peptoids that resembled the targets. In a subsequent paper they constructed molecules containing a carboxylate, an amino acid, and a primary or secondary amine that could be combined in almost 16 billion ways. This large chemical space was explored by using genetic algorithms to identify small peptoids that were structurally similar to a lead molecule.

Hall used directed molecular evolution to develop light-activated drugs that could be targeted to diseased areas, thereby largely avoiding healthy tissue,[18] and reported a selection mechanism that could be used to drive the discovery of new photosensitive molecules activated by a designated light wavelength.

Subsequently, Douguet, Thoreau, and Grassy[1] reported the use of a computational QSAR model as a fitness function for genetic algorithm optimized de novo design of novel retinoids.
SMILES string representations of molecules were used as molecular genomes on which genetic operations acted to create new populations of molecules. The population members were ranked according to their physicochemical properties computed by a QSAR model. They successfully improved the diversity of a library of salicylic acid analogues, and discovered two novel retinoids that docked and scored well at the RARα receptor. Although this fragment evolutionary approach was useful, the results strongly depended on the quality of the QSAR models.

Karan and Miller described the potential of a then-emerging field of research called dynamic diversity to identify ligands with favorable properties.[19] The essence of this technique was the generation of dynamic combinatorial libraries constructed from molecular fragments via reversible chemistry. Various kinds of selection systems, such as protein or synthetic receptors, enriched the populations of molecules in the dynamic library that bound to the selection component. The authors suggested that ‘dynamic diversity’ should be the first step toward a selection and amplification process applicable to solution-phase small-molecule mixture libraries. The advantage of the approach was the ability of the selection process to alter the composition of the compound mixtures in the dynamic combinatorial libraries. The main element of the dynamic equilibrium strategy was a receptor template that allowed the synthesis of a compound under reversible conditions.

Kluczyk et al. identified an important component in the drug evolution process and described its application to para-aminobenzoic acid (PABA) building blocks.[2] Building blocks are important components in drug development, and PABA frequently appears in drugs for a wide range of therapeutic uses. By shuffling a relatively small number of side chains in the 184 PABA-containing drugs, 4.5 million compounds could be generated. Building blocks such as PABA, with good chemical synthesis properties and a wide range of potential therapeutic properties, make very useful starting points for chemical libraries that can be subjected to evolutionary methods of optimization. This research group subsequently extended the work on building blocks to develop the chimera method,[20] in which building blocks identified in a drug library are grafted onto the same substitution sites in a rationally chosen scaffold. This generated libraries that were hybrids between known and prospective drugs. They exemplified the chimera approach by generating two libraries from PABA and salicylic acid. These drug-enriched chimera libraries should provide ready access to new drug hits and leads in high-throughput, multi-target screens.

Peptides are an increasingly important source of drug leads.[21] Belda et al. described the application of genetic methods to computer-aided peptide evolution for the discovery of peptides active against prolyl oligopeptidase, p53, and DNA gyrase.[22] Peptides were defined by a six-gene chromosome encoding sequences selected from seven different amino acids chosen from the 20 natural amino acids. Binding energies between peptides and target proteins calculated by the AutoDock program[23] were used as the fitness function.
The quality of the outcome was clearly dependent on the accuracy of the docking calculation. However, this virtual screening approach could help narrow the focus to small promising regions in the search space.

Up to this point, evolutionary design was primarily focused on finding molecules that satisfy a single objective (e.g., high-affinity binding or potent inhibition). More recently, methods that exploit concepts such as Pareto optimality have tackled the design of molecules that satisfy multiple objectives. Gillet reviewed the application of evolutionary computational methods to drug design in 2004.[24] Her focus, however, was the application of evolutionary algorithms to the design of combinatorial libraries and the selection of descriptors for QSAR models. The use of multi-objective evolutionary approaches was particularly emphasized because, with these methods, different objectives (fitness functions) are handled independently rather than being combined into a single fitness value.

Nicolaou et al.[25] recognized that computational de novo drug design involves searching an immense space of feasible, drug-like molecules to select those with the highest chances of becoming drugs. Their work described the use of multi-objective optimization algorithms that directly mutated chemical graphs to increase the average fitness of candidates with respect to binding to selected drug targets, similarity or dissimilarity to known drugs, and Lipinski (favorable pharmacokinetics) criteria.
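Pareto optimality, mentioned above as the basis of multi-objective methods, can be illustrated with a short sketch. It is not taken from any of the cited algorithms. It assumes that every objective is to be maximized and that each candidate carries a tuple of scores. The candidate names and the two score columns (labelled here as binding and drug-likeness) are hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is no worse than b on every objective and better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Hypothetical (binding, drug-likeness) scores; higher is better for both.
    candidates = {
        "A": (0.9, 0.2),
        "B": (0.6, 0.7),
        "C": (0.5, 0.5),
        "D": (0.2, 0.9),
    }
    print(pareto_front(candidates))  # ['A', 'B', 'D']
```

Candidate C is dropped because B scores higher on both objectives. A, B, and D remain because each is best on some trade-off, which is why the objectives can be handled independently instead of being collapsed into a single weighted fitness.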


Lameijer et al.[26] reviewed the use of genetic algorithms in drug design and reported an interactive evolutionary algorithm, the Molecule Evoluator, for the design of drug-like molecules.[27] They mutated members of populations of candidate drug molecules atom by atom and incorporated crossover operations as well. Their algorithm incorporated a novel interactive fitness function that included a human expert in the loop. In a similar vein, Ecemis et al. reported, in an engineering journal, the potential for evolutionary methods to increase the efficiency of drug discovery.[28] Their method, Mobius, identified a diverse set of preclinical drug candidates by merging information from computational models with that from human domain experts. The aim in both these studies was to ensure that molecules were synthetically feasible, and to allow the prior knowledge and experience of medicinal chemists to be part of the fitness function.

Kawai, Nagata, and Takahashi recently published a simple, similarity-driven evolutionary approach to producing new drug candidates.[29] Their method explored structures similar to reference molecules, in which the core scaffolds and substituents could vary in a limited manner. Such methods can only explore the local environment near the chosen leads, not the rest of chemical space more globally. The main strength of the approach was in lead optimization and limited lead hopping rather than in finding completely novel structures in distant areas of drug-like space that hit the same target.

Finally, Devi et al. recently published a useful survey of evolutionary algorithms for de novo drug design in a computing journal[30] that summarized the main approaches to de novo design of drug leads using evolutionary methods. It is recommended reading for those who wish to explore the potential of these methods to find drug leads more efficiently, and to find molecules that interact with drug targets that may have radically different structures.
Devi et al.’s review also summarizes the main computational methods that exploit evolutionary algorithms for ligand-based (e.g., pharmacophore searching) and structure-based (e.g., docking) design.
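Several of the studies in this section score candidates by their structural similarity to a reference or lead molecule. The sketch below shows one common way such a score could be computed. It uses the Tanimoto coefficient on sets of structural features, which is our choice for illustration and is not stated in the text to be the measure used by any of the cited authors. The feature labels and candidate names are hypothetical, and a real application would derive features from a cheminformatics fingerprint.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 1.0  # assumption: two empty feature sets are treated as identical
    return len(a & b) / len(union)


def rank_by_similarity(candidates, reference):
    """Return candidate names ordered from most to least similar to the reference."""
    return sorted(
        candidates,
        key=lambda name: tanimoto(candidates[name], reference),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical feature sets standing in for molecular fingerprints.
    reference = {"amide", "phenyl", "carboxylate"}
    candidates = {
        "cand1": {"amide", "phenyl", "carboxylate", "methyl"},
        "cand2": {"amide", "pyridyl"},
        "cand3": {"phenyl", "carboxylate"},
    }
    for name in rank_by_similarity(candidates, reference):
        print(name, round(tanimoto(candidates[name], reference), 3))
```

Used as a fitness function, a similarity score of this kind keeps the search close to the reference molecule, which is consistent with the observation above that similarity-driven methods explore the local environment near the chosen leads.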

Conclusions

Most medicinal chemists are now quite aware of the vastness of drug-like chemical space and the impossibility of exhaustively searching it for new chemical structures that may be the potent and selective medicines of the future. Evolutionary algorithms are among the most effective computational methods for exploring extremely large chemical spaces. There is clear potential for evolutionary methods to make strong contributions to the quest for safer and more effective drugs, and for new drugs for unmet clinical needs. Given their efficiency in searching vast spaces, it is surprising that they have not been widely exploited in the drug discovery arena. There have been relatively few published reports of direct evolution of chemical libraries to discover new drugs. One reason may be that evolutionary methods are well matched to automated methods of synthesis and testing, and these have only appeared relatively recently. We postulate that another important reason for the paucity of applications of evolutionary methods by medicinal chemists is that much of the seminal work has been published in engineering, computing, and related journals, not medicinal chemistry journals.

We feel that the time is ripe for evolutionary methods to make a much larger contribution to drug discovery. The application of evolutionary algorithms to pharmaceutical research appears to be at the bottom of the technology S-curve, poised to make strong and significant contributions to the discovery of new or more efficacious drugs in the near future. We anticipate that such expansion will be catalyzed by the major advances in synergistic technologies for high-throughput screening and automated synthesis of small organic molecules discussed above. All of the components required to fully exploit the potential of evolutionary drug discovery are now in place. It will be very interesting to see how this exciting field matures in the next decade.

Acknowledgements

D.A.W. acknowledges financial support from the CSIRO Advanced Materials Transformational Capability Platform, a Newton Turner Award for Exceptional Senior Scientists, and travel funding from a Royal Academy of Engineering Distinguished Visiting Fellowship. We are grateful to Dr. Randal S. Olson (University of Pennsylvania) for kindly granting permission to reproduce the graphical abstract image.

Keywords: chemical space · drug design · drug-like space · evolutionary algorithms · genetic algorithms

[1] D. Douguet, E. Thoreau, G. Grassy, J. Comput.-Aided Mol. Des. 2000, 14, 449–466.
[2] A. Kluczyk, T. Popek, T. Kiyota, P. de Macedo, P. Stefanowicz, C. Lazar, Y. Konishi, Curr. Med. Chem. 2002, 9, 1871–1892.
[3] J. A. Vrugt, B. A. Robinson, Proc. Natl. Acad. Sci. USA 2007, 104, 708–711.
[4] J. H. Holland, Adaptation in Natural and Artificial Systems, The University of Michigan Press, Ann Arbor, 1975.
[5] I. Rechenberg, Evolutionsstrategie: Optimierung Technischer Systeme nach Prinzipien der Biologischen Evolution, Frommann-Holzboog, 1973.
[6] Y. Yagi, K. Terada, T. Noma, K. Ikebukuro, K. Sode, BMC Bioinf. 2007, 8, 11.
[7] Z. J. Liang, X. Ding, J. Ai, X. Q. Kong, L. M. Chen, L. Chen, C. Luo, M. Y. Geng, H. Liu, K. X. Chen, H. L. Jiang, Org. Biomol. Chem. 2012, 10, 421–430.
[8] S. Bewick, R. Yang, M. Zhang, Proc. Annu. Int. IEEE EMBS 2009, 6026–6029.
[9] D. E. Clark, Evolutionary Algorithms in Molecular Design, Wiley-VCH, Weinheim, 2000.
[10] A. L. Parrill, Drug Discovery Today 1996, 1, 514–521.
[11] A. D. Ellington, J. W. Szostak, Nature 1990, 346, 818–822.
[12] C. Tuerk, L. Gold, Science 1990, 249, 505–510.
[13] R. Archer, Nat. Biotechnol. 1999, 17, 834.
[14] J. Li, S. G. Ballmer, E. P. Gillis, S. Fujii, M. J. Schmidt, A. M. Palazzolo, J. W. Lehmann, G. F. Morehouse, M. D. Burke, Science 2015, 347, 1221–1226.
[15] R. F. Service, Science 2015, 347, 1190–1193.
[16] R. P. Sheridan, S. K. Kearsley, J. Chem. Inf. Comput. Sci. 1995, 35, 310–320.
[17] R. P. Sheridan, S. G. SanFeliciano, S. K. Kearsley, J. Mol. Graphics Modell. 2000, 18, 320–334.
[18] K. Hall, Med. Hypotheses 1999, 53, 504–506.
[19] I. I. Karan, B. L. Miller, Drug Discovery Today 2000, 5, 67–75.
[20] A. Kluczyk, T. Kiyota, C. Lazar, T. Popek, G. Roman, Y. Konishi, Med. Chem. 2006, 2, 175–189.
[21] P. Ung, D. A. Winkler, J. Med. Chem. 2011, 54, 1111–1125.
[22] I. Belda, X. Llora, M. Martinell, T. Tarrago, E. Giralt in Genetic and Evolutionary Computation – GECCO 2004, Pt. 1, Proceedings, Vol. 3102 (Eds.: K. Deb, R. Poli, W. Banzhaf, H. G. Beyer, E. Burke, P. Darwen, D. Dasgupta, D. Floreano, O. Foster, M. Harman, O. Holland, P. L. Lanzi, L. Spector, A. Tettamanzi, D. Thierens, A. Tyrrell), 2004, pp. 321–332.
[23] G. M. Morris, D. S. Goodsell, R. S. Halliday, R. Huey, W. E. Hart, R. K. Belew, A. J. Olson, J. Comput. Chem. 1998, 19, 1639–1662.
[24] V. J. Gillet, Applications of Evolutionary Computation in Chemistry, Vol. 110 (Ed.: R. L. Johnson), 2004, pp. 133–152.
[25] C. A. Nicolaou, J. Apostolakis, C. S. Pattichis, J. Chem. Inf. Model. 2009, 49, 295–307.
[26] E.-W. Lameijer, T. Baeck, J. N. Kok, A. P. IJzerman, Nat. Comput. 2005, 4, 177–243.
[27] E. W. Lameijer, J. N. Kok, T. Back, A. P. IJzerman, J. Chem. Inf. Comput. Sci. 2006, 46, 545–552.
[28] M. I. Ecemis, J. Wikel, C. Bingham, E. Bonabeau, IEEE Trans. Evol. Comput. 2008, 12, 591–603.
[29] K. Kawai, N. Nagata, Y. Takahashi, J. Chem. Inf. Model. 2014, 54, 49–56.
[30] R. V. Devi, S. S. Sathya, M. S. Coumar, Appl. Soft Comput. 2015, 27, 543–552.

Received: April 12, 2015
Revised: May 1, 2015
Published online on June 9, 2015
