DOI: 10.1002/cmdc.201500091

Essays

How Many Molecules Does It Take to Tell a Story? Case Studies, Language, and an Epistemic View of Medicinal Chemistry

Martin Stahl*[a] and Sabine Baier[b]

[a] Dr. M. Stahl, Roche Pharma Research and Early Development, Roche Innovation Center Basel, F. Hoffmann–La Roche AG, 4070 Basel (Switzerland). E-mail: [email protected]
[b] Dr. S. Baier, Collegium Helveticum, STW C 12.2, Schmelzbergstrasse 25, 8092 Zürich (Switzerland)

Medicinal chemistry has always been closer to the arts than other disciplines in the natural sciences. Instead of searching for natural laws, medicinal chemistry creates new molecular entities with desired pharmaceutical characteristics. While the productive output of medicinal chemistry is comprehensively documented, the epistemic paths of the creative process are less well described. Here we show how such paths could be visualized and how these visualizations relate to images developed in the history and philosophy of science. Based on the discussion of these visualizations, we argue that there is a need for a new language of creativity that can be employed during the very course of research, as opposed to its retrospective analysis. This language should be able to reflect both the status and directions in highly complex research processes that may have a clear goal, yet must remain open to unexpected moments of serendipity.

Not a Library

In 2001, Roald Hoffmann published a commentary scolding the chemistry community for using the term “library” for collections of compounds generated by combinatorial chemistry.[1] A library, Hoffmann argued, is a place of wisdom, of historically grown, organized and interconnected knowledge, whereas enumerations of chemical structures according to predefined rules are the sterile product of an automated procedure. Long after the heyday of combinatorial chemistry, it is worthwhile going back to Roald Hoffmann’s commentary. What at first sight might appear to be a semantic debate disconnected from practical application turns out to reveal something fundamental about how we deal with medicinal chemistry legacy data.

The collective wisdom of generations of medicinal chemists is indeed akin to a library. Projects unfold, small-molecule ligands for targets and target classes are developed over time and across institutions, and chemical structures continuously reference and build on each other. As a community, we just don’t do very much to allow this historical growth and connectedness to become visible. Open any full paper on a medicinal chemistry program and you will see compound structures associated with data. The table, or its larger cousin the database, has become our means of communicating medicinal chemistry. We treat compounds as independent entities. As a result, the considerations that have led from one molecule to another frequently go unmentioned and are even harder to query. Herein we argue that we would do well to focus more not only on the assumptions and thought processes leading from one molecule and one insight to the next, but also on the language in use to describe and visualize this creative process.

To this day, medicinal chemistry is a discipline between science and art. It differs from other scientific disciplines in that it is based on creation and storytelling instead of explanation and prediction. The common epistemological focus of the natural sciences is the discovery of natural laws, either by experiments or by mathematical proofs. Once a natural law is discovered and established, it is possible not only to describe and understand nature in an appropriate, generalized way but also to make use of these laws by predicting future events. In contrast, as Hoffmann also pointed out,[2] a philosophy of science from a chemist’s point of view would necessarily emphasize creation and design more than discovery. This places chemistry closer to art than other scientific disciplines.

Medicinal chemistry employs this creative power for a specific aim, the creation and discovery of new drugs. In medicinal chemistry, there are few explicit rules and even fewer truly predictive tools, let alone natural laws of creation. The number of choices for the next compounds to synthesize stays enormously large, and the best choices are hardly ever obvious. So to a large extent, medicinal chemistry relies on rational creative thinking, negotiation with similarities and tacit knowledge,[3] much of which is carried by individuals and then lost again. The idea of converting this tacit knowledge into explicit knowledge is not new;[4] at Roche, for example, a knowledge repository called ROCK[5] now celebrates its 10th anniversary and continues to be actively fed and used. The rough guidelines or rules that emerge are the direct result of a multitude of project experiences.

ChemMedChem 0000, 00, 0 – 0


 0000 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim


Opposing Forces

If the creation of safe, novel compounds with therapeutic efficacy is the forward-looking business of medicinal chemistry, then its complement, looking into the past, is the case study. Few such case studies get published, and those that do often appear long after completion of the discovery work. Why is that so? We suggest that there are two opposing forces at work here: the need to protect intellectual property and the desire to turn medicinal chemistry into a more rational process.

The chemical structure of a compound that has the potential to become a clinical candidate encodes the entire drug design process. Its novelty and superiority need to be protected. A patent that covers a class of novel organic molecules does not concern itself with the small individual steps it took to develop them. It presents them as a whole and it emphasizes the non-obvious aspects of the new compound class. Patent literature does (and must) quote prior art and background art, but because it is an instrument of protection, it does not foster candid history writing; it rather attempts to sever its links to the past. The practice of intellectual property protection also leads to delays in publishing; most pharmaceutical companies do not encourage publication of full papers directly after a patent application has been published.

At the same time, there is a strong incentive in medicinal chemistry to simplify and rationalize. The process of optimizing hits to leads and ultimately to viable candidates is long and cumbersome, so any opportunity to make this process faster and more projectable is of high interest. The resulting focus on analytical approaches to drug discovery diverts attention away from storytelling. Significant efforts have gone into analyzing large-scale correlations and trends in molecular properties across chemical space.[6] This approach naturally ignores the historical perspective, flattens hierarchies of origin in compound collections and goes hand in hand with the “database approach” to documentation alluded to above. Extrapolation and true prediction are harder than we tend to believe. All of our data sets are man-made and therefore biased and incomplete, and any “general rules” often just reflect history. We emphasize that this is not a pessimistic statement and not in contrast to the very real progress in rational drug discovery. It just seems that the desire to act rationally has led the field to gloss over important details. Practically useful guidelines for chemists do exist, but they require a more sophisticated granularity. There is no easy way to bypass the tedious collection and distillation of a multitude of individual case studies.[7]

A Forest of Knowledge

The elements of surprise and rational planning may be antipodes, but scientists have long learned to cherish them both. Pasteur’s famous statement that “chance favors only the prepared mind” resolves the apparent paradox. Imaginative, creative steps and logical reasoning are intimately connected in medicinal chemistry; combined they enable progress. How can the interplay between these forces be traced? Fortunately, as opposed to historical accounts in the arts and humanities, medicinal chemistry history has fixed and generalized cornerstones to build on: the chemical structures themselves. A medicinal chemistry story can move from one molecule to the next.

The dependence of inventions in medicinal chemistry could be depicted in the form of graphs, in which nodes are compounds and edges are logical links between compounds (Figure 1). These links are directional, that is, if compound A has influenced the creation of compound B, the reverse cannot be true. Time progresses along the y axis; chemical diversity spans along both the x and y axes. A single node may have multiple links into the past and into the future. The world of medicinal chemistry could thus be depicted as a forest of “knowledge trees” by which compounds successively reference each other.

Figure 1. Network view of medicinal chemistry progress. In this highly schematic view, nodes are compounds and edges depict the evolution of compound series. The result is a tree-like structure. Trees may interconnect when lessons learned or structural elements can be transferred between series. Medicinal chemistry articles typically cover small areas of such graphs: 1) Typical full article covering SAR around a lead; 2) Review article covering the latest developments across compound classes; 3) Case study outlining the history of a single molecule reaching clinical stages of development. The terms “screening hit” and “candidate” are placeholders for chemical entry points of any kind (including compounds from other projects) and desired end points that may be defined in many different ways.

Trees originate where new chemical entry points are created, for example, from screening, from a known substrate, or by chance observations on a natural product. Some trees grow together; they get connected in their upper branches, some more intimately, some very loosely. Some branches are dead ends, where research did not continue, because structure–activity relationship (SAR) development was not fruitful, because clinical candidates failed, or because targets were invalidated. Non-scientific reasons (institutional, regulatory, commercial, or portfolio related) play a role as well. At any given point in time, trees may grow at different rates; some grow quickly because of high hopes and significant competition, some grow slowly because targets have already been exploited by research efforts or because they pose significant technical challenges.

Libraries of books share this tree topology. Here, a link is a (more or less explicit) form of citation, and there is a time dimension in which later works cannot be referenced by earlier ones. If a book is the smallest unit of knowledge in a library, a compound could be called the smallest unit of knowledge in chemistry.

Now consider the way scientific literature covers parts of the medicinal chemistry knowledge forest (areas enclosed by dashed lines in Figure 1). A typical full paper concerns itself with a particular branch of the tree, rarely reaching back to early sources of information that enabled present studies. Review articles and in particular patent reviews (such as Informa Healthcare’s “Expert Opinion on Therapeutic Patents”[8]) typically cover the latest research, a horizontal cross-section of the forest. And then there are the rare cases of historical accounts that follow a tree or branch from its origin to the present. The latter are often personal accounts of researchers who describe the genesis of a particular drug that has made it to the market (examples are publications on the discovery of sertraline[9] or maraviroc[10]), and the vertical cross-section covered by such case studies is correspondingly small. Very few articles or book chapters attempt to be broader and show a more complete historical perspective across institutions. Surprisingly few case studies are contained in medicinal chemistry textbooks; Walter Sneader’s “Drug Prototypes and Their Exploitation”[11] is a notable exception in tracing key historical developments. A number of excellent textbooks[12] and series should be highlighted, in particular “Progress in Medicinal Chemistry”[13] as well as the “Annual Reports in Medicinal Chemistry”.[14] Articles that depict the actual medicinal chemistry progress in the form of a graph are particularly rare; an example is an account of the Glaxo Angiotensin II research program from 1995.[15]

A Case Study: Raltegravir

Let us look at the development of HIV integrase inhibitors at Merck as a concrete example. Figure 2 is an attempt to graphically visualize essential learning steps toward raltegravir as described in a review article by two Merck authors.[16] The article describes the evolution of diketo acid hits into a series of naphthyridines, which were later terminated for toxicity reasons. From another hit class, a second series was developed in parallel (dihydroxypyrimidines, later converted into N-methylpyrimidinones), which delivered the successful clinical candidate. This is a situation in which two separate trees grow together due to transferability of SAR.

Figure 2. Development of the HIV integrase inhibitor raltegravir according to Egbertson and Anthony.[16]

The raltegravir example allows a number of general observations. First, in constructing the tree view, choices had to be made regarding the number of compounds to show in the graph view. Out of more than 110 structures depicted in the review article, only 21 are shown in Figure 2. The path toward the successful final candidate is fairly quickly described, and with relatively few molecules. It turns out to be harder to choose among structures that outline the full range of chemical space explored (keeping in mind that the authors must have gone through an even more stringent filtering process before!) than to choose key structures that were more extensively profiled and thus were clear cornerstones of the project as it progressed. A single representative was chosen at each level of the tree, namely the molecules branching off to the left of the main “trunk” of the naphthyridine series. A more comprehensive depiction of the SAR covered would either have to display far more molecules or use language like Markush structures as a more compressed format.

Second, for an outsider it is not clear where and how external information has influenced the progress of the project at Merck. An early Shionogi clinical lead is depicted, but without connections, as the review covers neither the story of its genesis nor its specific influence on the Merck project. The role of this molecule and the partially parallel and subsequent strands of the overall HIV integrase story (such as those leading to dolutegravir and elvitegravir) would need to be added to create a holistic view of the medicinal chemistry on this target.

A final observation on the raltegravir story is that certain substituents appear early in the SAR and are then reused at various later points in time. In Figure 2 this is the case for cyclic sulfonamide substituents. This indicates that as more SAR data are collected, the number of references to prior compounds increases significantly and that perhaps at some point, a tree view is no longer sufficient to depict the growing gain of knowledge. We discuss this aspect in greater detail further below.

Telling and Connecting Stories

Figure 2 is certainly only a rudimentary way of depicting the history of raltegravir, but it brings across the point that chemical structures are a powerful means of telling a medicinal chemistry case study. Chemists are visual thinkers; they are trained to observe similarities and differences between molecules. “Seeing” a case study is a more holistic mental process than reading one.

It has been claimed that medicinal chemistry research is entering an era of “data overload” due to the rapid growth of databases of chemical structures and associated bioactivity data, of both public and commercial nature; this has various consequences regarding data consistency and quality.[17] It is therefore time to identify the threads of discovery that have led to a demonstrable gain in knowledge. A simple start could be the abstraction of existing review articles in a graphical form, as we have attempted to do in Figure 2; such graphics could become a standard way of representing SAR progress in various types of articles. What we are suggesting here is quite the opposite of the “big data” approach that is advocated by some.[18] We need more explicit case studies of known drugs, we need to begin linking them to each other, and in particular we need case studies of failed projects. Along the way, data mining and text mining approaches will be useful tools, but cannot replace the manual curation and compilation both of individual case studies and of the resulting network.

Biology has shown us how to approach complex problems of this type as a community. Public wiki-style systems for biological pathways[19] could serve as a template for a similar effort to create the type of tree-shaped network envisioned above. It would be a public repository, a library, of medicinal chemistry case studies, a place where these case studies can be annotated, iteratively refined and connected with each other. It could be a place where students of medicinal chemistry and veterans would contribute alike, the former to learn about the field they are about to enter, the latter because they can share much of their experience. Such a collection could become a valuable resource for learning and teaching across academia and industry.[20]

The Molecular Historian’s Challenges

So how many molecules does it take to tell a story? The question may sound like the coastline paradox: the closer you look, the more detail can be added, and the more molecules will play a role. Of course, the total number of compounds synthesized is an upper bound, limiting the fractal nature of the answer. Pragmatically, the question can be rephrased: How many molecules are required to understand the essential learning steps from start to end? The answer may be: surprisingly few. A set of 20–40 well-chosen structures is probably sufficient to delineate key learning steps of most projects. But the fewer structures are chosen, the more the focus shifts to compounds that led toward the final optimized drug. Karin Knorr-Cetina referred to this situation as the fallacy of an opportunistic research logic that focuses on the successful outcomes while neglecting the dead ends.[21]

It is much harder to visualize all regions of chemical space that did not contain informative or improved molecules or that were abandoned for different reasons. The number of dead ends is always greater than the number of promising avenues. Information on failed projects or on unproductive strands of SAR development is the dark matter of medicinal chemistry: there is a lot of it, and we believe it would be valuable material to study, but unfortunately it is rarely reported. Failed projects are harder to depict graphically. The tree view suggests that there has to be an identifiable “trunk”, a path toward the treetop of success. The assumption of a directional trunk suggests the idea of superiority. Within a specific project context, one compound series might indeed be superior to another; in a broader context this might not be true. Not all dead ends are genuine dead ends; some may be another tree’s roots. The hierarchy suggested by the tree view is not a global one; what persists is merely the fact that new insights rely and build on previous ones.

Medicinal chemistry storytelling becomes harder not only when conveying negative information, but also when the amount of knowledge gained in an area has become large. Summarizing the world’s combined knowledge on how to design ATP-site kinase inhibitors would have been a manageable task 15 years ago; today, the density of investigated molecules is almost impossible to depict. Kinase drug discovery today can build on so much prior art that it no longer needs to rely on the structures of discrete molecules; rather, it becomes an exercise in pulling together core fragments, building blocks, and SAR elements from various sources. Activity prediction is actually feasible in this space,[22] and the amount of biostructure data alone is overwhelming and requires new classification methods.[23] This independence from specific paths of structural genesis leads to a serious limitation of the tree view, which implies a limited number of traceable links. In an information-rich situation, this view is not only quantitatively unmanageable but also becomes qualitatively inadequate, because the way information is accessed and reused is changing.

The evolutionary growth of insight has been a longstanding topic in the philosophy of science, and some inspiration may be gained from the imagery used in this field. The idea of a “tree of life” originally emerged in the first biological treatises around the beginning of the 19th century. Jean-Baptiste Lamarck depicted an upside-down tree in his “Philosophie Zoologique” of 1809, but he did not feel confident about a general genealogy of all living entities. It was not until Charles Darwin’s publication of “On the Origin of Species” in 1859 that a chronological view on the concept of a tree of life became recognizable. Darwin frequently used the concept of a tree of life but never committed himself and his theory to the idea of a trunk, that is, a teleological interpretation of the genealogy of species. In one of his private letters, Darwin displays an alternative concept that resembles a coral more than a tree, implying neither a direction of growth nor a visible trunk, and that therefore visualizes his theory in a more appropriate way. Nonetheless, the intuitively simple image of a tree of life became more successful than its counter-draft, the coral.[24]

In 1972, Stephen Toulmin suggested the use of tree-like structures to reflect the evolution of concepts in the progress of science.[25] However, and for the reasons discussed above, the model of a tree of scientific progress has to be considered highly questionable in a research context. In hindsight, the trunk of a tree and a direction of growth may be detectable. But, as Hans-Jörg Rheinberger as well as Karin Knorr-Cetina pointed out, in any concrete moment of ongoing research and development, a clear directionality is hardly ever evident. The French philosopher Gilles Deleuze and Félix Guattari, a French psychoanalyst and psychiatrist, developed the alternative concept of a rhizome. Similar to Darwin’s coral, a rhizome is a network-like structure without a hierarchy. It does not build or require a teleological narrative, and it enables switching perspectives and directions at any moment. The rhizome image is epistemologically more appropriate to the actual process of research and development.

In the context of the present discussion, the image of the rhizome lends itself to the description of areas of medicinal chemistry that have been thoroughly explored. This is the case for the world of kinase inhibitors sketched above, but also for more narrowly defined areas such as amidine-based BACE1 inhibitors[26] or inhibitors of serine proteases in the coagulation pathway. Rhizome-like features appear at some point during almost any drug discovery project even when viewed in isolation; the recurrence of the cyclic sulfonamide motif in the raltegravir story (Figure 2) is just one such example.

Ultimately, the molecular historian is confronted with the same issues as any historian. A complete story can only be told in hindsight, but in hindsight, we tend to rationalize, beautify, adapt, personalize and simplify. There is no such thing as complete accuracy in historical accounts, and this is also true for medicinal chemistry. Case studies cannot capture the full detail of what a project team has experienced, and each individual team member would tell a slightly different story. Case studies are also stories about the organizations where research takes place, of human psychology and fallibility.[27] Good case studies are those that attempt to extract what is general and re-usable, and they keep an eye on all the multidimensional experimental and contextual parameters relevant for a project. The question is whether we have the language to do so.
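The network view used throughout this essay (compounds as nodes, directed links of influence as edges, dead ends, and trees that grow together) can be made concrete with a small data structure. The following is only a minimal sketch under our own assumptions: the class `KnowledgeGraph`, its method names, and the compound names and years are hypothetical and do not come from any existing tool or from the raltegravir review. It illustrates that entry points, dead ends, and merge points (nodes with more than one link into the past, the rhizome-like feature) can be read directly off such a graph.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Compounds as nodes, directed 'influenced' links as edges (hypothetical sketch)."""

    def __init__(self):
        self.year = {}                    # compound name -> year it was first made
        self.children = defaultdict(set)  # compound -> compounds it influenced
        self.parents = defaultdict(set)   # compound -> compounds that influenced it

    def add_compound(self, name, year):
        self.year[name] = year

    def add_influence(self, source, target):
        # Links are directional in time: a later compound cannot
        # have influenced an earlier one.
        if self.year[source] > self.year[target]:
            raise ValueError(f"{source} is newer than {target}")
        self.children[source].add(target)
        self.parents[target].add(source)

    def entry_points(self):
        """Compounds with no link into the past (roots of the trees)."""
        return sorted(n for n in self.year if not self.parents[n])

    def dead_ends(self, endpoints):
        """Compounds with no link into the future that are not desired end points."""
        return sorted(n for n in self.year
                      if not self.children[n] and n not in endpoints)

    def merge_points(self):
        """Compounds with several links into the past (trees growing together)."""
        return sorted(n for n in self.year if len(self.parents[n]) > 1)

    def lineage(self, name):
        """All compounds that directly or indirectly influenced `name`."""
        seen, stack = set(), [name]
        while stack:
            for parent in self.parents[stack.pop()]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Purely illustrative data: two hit series that converge on one candidate.
g = KnowledgeGraph()
for name, year in [("hit_A", 2000), ("hit_B", 2001), ("lead_A1", 2002),
                   ("lead_A2", 2002), ("lead_B1", 2003), ("candidate", 2005)]:
    g.add_compound(name, year)
for source, target in [("hit_A", "lead_A1"), ("hit_A", "lead_A2"),
                       ("hit_B", "lead_B1"), ("lead_A1", "candidate"),
                       ("lead_B1", "candidate")]:
    g.add_influence(source, target)

print(g.entry_points())               # roots of the two trees
print(g.dead_ends({"candidate"}))     # branches that were not continued
print(g.merge_points())               # where the two trees grow together
print(sorted(g.lineage("candidate")))
```

A structure of this kind says nothing about why a link exists; the reasoning behind each edge would still have to be captured in words, which is exactly the curation effort argued for above.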

Rudimentary Molecular Language

Because much of medicinal chemistry relies on making small changes to existing molecular structures, medicinal chemists have developed a rudimentary language for similarities and differences between molecules. Similarity, of course, lacks any physical basis. Because it is a subjective concept, the terms used to describe it vary with context. Similarity can be defined, formalized and expressed numerically (such software belongs to the standard arsenal of tools in cheminformatics and medicinal chemistry), but such metrics lose their meaning when applied to the individual comparisons of molecules that we are dealing with here. Terms such as “scaffold”, “series”, “class”, or “chemotype” are used interchangeably to group compounds that share common features, but only project context defines their usage. Project teams may even define their own coordinate systems and language for chemical structures and, for example, speak about modifications in the “southern” part of a scaffold when oriented in a particular way. Such reference systems lose their meaning outside the project context.

Differences between molecules are harder to describe than similarities. Over the years, a number of terms have emerged that describe molecular transformations. When a chemistry program focuses on testing different substituents in the periphery of a molecule, we may call this “SAR exploration”. But where are the limits of this exercise and where is new territory entered? We speak of a “bioisosteric replacement” when one part of a molecule is replaced by another one of similar shape and properties, and where we know from experience that the substitution of one for the other often retains biological activity.[28] We speak of “scaffold hopping”[29] when significant structural changes in a central position of a molecule, such as replacement of one ring system by another, lead to a compound with similar biological activity.
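To illustrate what it means to express similarity numerically, the sketch below computes the Tanimoto (Jaccard) coefficient, a widely used similarity measure in cheminformatics, on sets of structural features. It is a minimal sketch under stated assumptions: the feature labels are hypothetical hand-picked strings rather than real fingerprints, and returning 1.0 for two empty sets is our own convention.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all features present."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 1.0  # assumption: two empty feature sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical feature sets for two close analogues that differ in one halogen.
compound_1 = {"amide", "phenyl", "sulfonamide", "fluoro"}
compound_2 = {"amide", "phenyl", "sulfonamide", "chloro"}

similarity = tanimoto(compound_1, compound_2)
print(similarity)  # 3 shared features out of 5 distinct ones
```

The number depends entirely on which features are encoded, which is one way to see why such a metric carries little meaning for a single pair of molecules outside its context.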
The concept of matched molecular pairs, coupled with careful statistical analysis, has enhanced our ability to assess the relative influence of small structural changes on molecular properties of interest and to become better at predicting such changes.[30] Medicinal chemists are acutely aware that the size of structural change does not correlate with the size of change in molecular properties or knowledge gained. The concept of “activity cliffs” says just that: Sometimes we learn the most when the structural changes are smallest. Figure 3 shows a few examples of molecular transformations that are small in structure but large in terms of insight. These insights are gained in completely different domains of knowledge: at the molecular level (cathepsin S inhibition and conformational strain), at the target level (repurposing of a diazepine scaffold), or at the clinical level (beta blockers with an unprecedented tissue selectivity profile). Experimental context defines what and how we learn, independent of any structural similarities or differences. The nature of this context defines how we can speak about relationships between molecules.

Figure 3. Examples of molecular transformations in which small structural changes lead to significant insights at different levels. a) Examples from a Roche series of cathepsin S inhibitors,[41] where the switch from the pyrrolidine to the cyclopentane central ring alleviates a significant amount of conformational strain. b) Etizolam was the result of SAR studies on benzodiazepine derivatives at Yoshitomi Pharmaceuticals in the early 1970s (now Mitsubishi). The chemistry around this class was extended at Mitsubishi, followed by the observation that derivatives are BRD4 bromodomain inhibitors.[42] c) Derivatives of propranolol synthesized at ICI resulted in practolol, a compound that unexpectedly did not antagonize the peripheral vasodilation caused by isoprenaline, but did have the desired effects on the heart. Practolol was withdrawn from the market due to toxicity, but opened the door for further cardioselective drugs, atenolol being one of them.[43]
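The statistical side of matched molecular pair analysis mentioned above can be sketched in a few lines: collect pairs of compounds that differ by one defined transformation, and summarize the change in a property across all pairs sharing that transformation. This is a minimal sketch, not a published method; the function names, the transformation labels, the potency values, and the cutoff of 1.5 log units for calling a pair an activity cliff are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

CLIFF_THRESHOLD = 1.5  # log units; arbitrary illustrative cutoff (assumption)


def summarize_pairs(pairs):
    """Group property changes by transformation: (count, mean change, spread)."""
    deltas = defaultdict(list)
    for transformation, value_before, value_after in pairs:
        deltas[transformation].append(value_after - value_before)
    summary = {}
    for transformation, values in deltas.items():
        # A standard deviation needs at least two observations.
        spread = stdev(values) if len(values) > 1 else None
        summary[transformation] = (len(values), mean(values), spread)
    return summary


def activity_cliffs(pairs, threshold=CLIFF_THRESHOLD):
    """Pairs in which one small change shifts the property by at least `threshold`."""
    return [p for p in pairs if abs(p[2] - p[1]) >= threshold]


# Invented potency values (for example pIC50) before and after each change.
example_pairs = [
    ("H>>F", 6.0, 6.5),
    ("H>>F", 7.0, 7.5),
    ("H>>F", 5.0, 7.0),
    ("CH3>>CF3", 6.5, 6.0),
]

summary = summarize_pairs(example_pairs)
cliffs = activity_cliffs(example_pairs)
print(summary)
print(cliffs)
```

In this invented data set the average effect of one transformation is modest while a single pair stands out, which is the situation the term “activity cliff” describes: the average hides the pair from which the most is learned.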

Toward a Language of Drug Discovery New compounds play a dual role in a project. On the one hand, they are the desired end results of synthetic processes. On the other hand, they play a far more important and interesting role by constituting new starting points that are potentially able to provide more information on how to move along the knowledge tree. Hans-Jçrg Rheinberger has reflected on this ambiguous role of chemical compounds in his account of evolutionary epistemology in synthetic sciences.[31] In the process of scientific creation, as Rheinberger depicts it, compounds can be treated both as epistemic objects as well as technical objects. Epistemic objects are still to be produced, 6

 0000 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

that is, in medicinal chemistry each new node in the knowledge tree on the path toward the optimized clinical candidate. The technical objects, on the other hand, are those in use to produce the epistemic object, that is, the assays and tools, the tacit knowledge, but also those compounds that have already been made. Together, epistemic and technical objects constitute an experimental system. New molecules are created by the experimental system, and simultaneously the experimental system that has produced them is changed through them. The understanding of biology and chemistry tends to evolve together, assays are improved over time, new models created, new key questions asked. The frame of reference thus shifts as new parameters become relevant. We should not speak of “optimization” as long as this happens; knowledge evolves with the molecules. The system is open for new results, because the experimental system changes over time. Molecules are epistemic objects turning into technical objects, and then the cycle begins again. The frequently depicted medicinal chemistry “optimization cycle” is not an adequate image while the experimental system evolves; the image of a spiral may be more appropriate. Also, sudden step changes may occur, for instance when new data shed new light on how compounds differentiate or on fundamental biology. But there are also phases where the optimization goals and appropriate tools are clearly laid out and the experimental system is less mobile; this may be the case in advanced lead optimization projects. What matters is to be able to articulate, at any given moment in the research process, where we stand, where we think we are heading, and whether trial and error or clear hypotheses are the guiding principles. Transparency is key. We should begin to develop a language that helps us navigate the complex process of medicinal chemistry while it is ongoing.

Adequate language for small-molecule drug discovery should be able to: 1) articulate the degree of dynamic change in experimental context, 2) describe various types of molecular structure change in a manner independent from project context, and 3) indicate the level at which insights are expected to be gained, that is, the “output” of research. The “input” side is just as complex and multidimensional. Chemists are accustomed to working with many different input parameters, just as they are used to working with multiple partially incompatible theories and model systems. As Roald Hoffmann puts it, “incommensurability is taken without a blink, and actually serves”.[2] The incongruous mix of concepts and input parameters that chemists work with has indeed been a recipe for success. The ability to create something of value in the face of uncertainty, vagueness, and complexity is one of the most intriguing aspects of medicinal chemistry. One of our colleagues put it into words as a rough guide for medicinal chemistry excellence: “Every week, make a compound that gives you more information”. Just what that information is can only be partially defined upfront. How much of it stays open depends on where a project stands, and the more explicitly we can describe this position, the more we can learn. We might then choose to display molecular networks in an inverted manner: Instead of moving from one molecule to the next, we move from one insight to the next. Molecules become edges of a knowledge network, and hypotheses and insights turn into nodes.

Language cannot reduce the complexity of the research process to a plannable engineering process. But increased transparency across projects and institutions does offer the chance to question some habits. It may shed new light on some historically grown decision preferences of individuals, organizations, or the entire discipline.[32] Perhaps the invention of metrics such as “ligand efficiency” is the beginning of a more universally applicable molecular language.[33] Increased transparency should enable better dialogue between the disciplines involved in drug discovery, and along the way it could deliver valuable pointers for how efficiency might be improved. Are medicinal chemists over-productive, as has been claimed;[34] do they make more compounds than required? Increased awareness and transparency of the decision-making patterns in medicinal chemistry might indeed lead to higher efficiency.[35] The often-cited tension between “exploration” and “exploitation”[36] is omnipresent in medicinal chemistry. How many series should be pursued, how conservative or daring should planned structural changes be? When should chemists be more economical with their resources, when have dead ends been reached? The literature abounds with general advice on being “right” in research,[37] but research is much less about being right than about taking a risk and trying to get there, about asking important questions and addressing them with adequate tools. A truth-seeking culture is aware of risks and can articulate them, avoiding the specter of what Feynman once called “cargo cult science”.[38] It might also be a path to better software tools. Scientists are all too often left to their own devices regarding the use of computers to analyze their data,[39] often simply because those who master the discipline are not experts in programming tools.

Earlier generations of medicinal chemists worked with pen and paper to jot down ideas as well as to document their work. Are today’s more rigid electronic systems a hindrance to creativity? Cheminformatics experts and visual analytics[40] specialists alike should be interested in understanding what problems medicinal chemists are really facing on a day-to-day basis, how they access, memorize and reuse data, and how, as a consequence, new tools and interfaces could truly help them to connect all the dots in a complex, multidimensional research program and, above all, to stay creative. Chemical structures of bioactive molecules are fascinating entities: They literally encode invention; they are the DNA of small-molecule drug discovery. The community is well aware of this fact: chemical structures are usually the last pieces of information to be disclosed when one institution evaluates the assets of another; structures of clinical candidates are revealed in strategically planned presentations at conferences, again often as a highlight at the very end. It should be worth our while to investigate how we got there. As we become better at describing the process of optimization, we may well be able to develop a more holistic, and therefore more realistic, view of invention in small-molecule drug discovery.
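The inverted network proposed in this essay, with insights and hypotheses as nodes and molecules as edges, can be made concrete with a small data-structure sketch. The following Python fragment is only an illustration under stated assumptions: the class names `MoleculeEdge` and `InsightNetwork`, the method names, and all insight and compound labels are hypothetical and do not come from any project or software described here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MoleculeEdge:
    """A compound that led from one insight to the next (hypothetical record)."""
    source: str    # insight or hypothesis held before the compound was made
    target: str    # insight gained once the compound was tested
    compound: str  # placeholder compound label, not a real structure


@dataclass
class InsightNetwork:
    """Directed graph in which insights are nodes and molecules are edges."""
    edges: list = field(default_factory=list)

    def add(self, source, target, compound):
        """Record that `compound` moved the project from `source` to `target`."""
        self.edges.append(MoleculeEdge(source, target, compound))

    def insights(self):
        """Return all nodes in order of first appearance."""
        seen = {}
        for edge in self.edges:
            seen.setdefault(edge.source, None)
            seen.setdefault(edge.target, None)
        return list(seen)

    def story(self, start):
        """Walk depth-first from `start`; return (compound, insight) steps."""
        outgoing = defaultdict(list)
        for edge in self.edges:
            outgoing[edge.source].append(edge)
        steps, stack, visited = [], [start], {start}
        while stack:
            node = stack.pop()
            for edge in outgoing[node]:
                steps.append((edge.compound, edge.target))
                if edge.target not in visited:
                    visited.add(edge.target)
                    stack.append(edge.target)
        return steps


if __name__ == "__main__":
    # Purely illustrative labels (assumptions, not project data).
    net = InsightNetwork()
    net.add("hypothesis A", "insight B", "compound-1")
    net.add("insight B", "insight C", "compound-2")
    for compound, insight in net.story("hypothesis A"):
        print(f"{compound} -> {insight}")
```

Storing the compound on the edge rather than on the node is the design choice that mirrors the inversion: the question asked of the network is which insight followed from which, and the molecule is the evidence that connects the two.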


Acknowledgements

M.S. is grateful to Anthony Nicholls for the opportunity to present an early version of this essay at the 2013 EuroCUP Meeting in Amsterdam. M.S. also thanks his medicinal chemistry colleagues at Roche for many years of friendship and collaboration, and in particular Alexander Flohr for valuable comments on the manuscript. S.B. thanks her colleagues at Collegium Helveticum, Zürich, especially Gerd Folkers, for constant support and many fruitful discussions beyond the borders of disciplines.

Keywords: case studies · drug discovery · evolutionary epistemology · medicinal chemistry · tacit knowledge

[1] R. Hoffmann, Angew. Chem. Int. Ed. 2001, 40, 3337–3340; Angew. Chem. 2001, 113, 3439–3443.
[2] R. Hoffmann, Synthese 2007, 155, 321–336.
[3] M. Polanyi, The Tacit Dimension, University of Chicago Press, Chicago, 1966.
[4] A. L. Hopkins, A. Polinsky in Annual Reports in Medicinal Chemistry, Vol. 41 (Ed.: A. Wood), Elsevier, Amsterdam, 2006, pp. 425–435.
[5] A. Mayweg, U. Hofer, P. Schnider, F. Agnetti, G. Galley, P. Mattei, M. Lucas, H.-J. Boehm, Drug Discovery Today 2011, 16, 691–696.
[6] Two representative examples: a) P. D. Leeson, B. Springthorpe, Nat. Rev. Drug Discovery 2007, 6, 881–890; b) G. M. Keserű, G. M. Makara, Nat. Rev. Drug Discovery 2009, 8, 203–212.
[7] Good examples of such rules of thumb are listed in Table 1 in: P. A. Charifson, W. P. Walters, J. Med. Chem. 2014, 57, 9701–9717.
[8] Expert Opinion on Therapeutic Patents, informahealthcare.com/loi/etp; accessed March 20, 2015.
[9] W. M. Welch in Advances in Medicinal Chemistry, Vol. 41 (Ed.: A. Wood), JAI Press, Greenwich, 1995, pp. 113–148.
[10] C. Barber, D. Pryde in Accounts in Drug Discovery: Case Studies in Medicinal Chemistry, Vol. 4 (Eds.: J. C. Barrish, P. H. Carter, P. T. W. Cheng, R. Zahler), RSC, Cambridge, 2011, pp. 183–214.
[11] W. Sneader, Drug Prototypes and Their Exploitation, Wiley, Chichester, 1996.
[12] a) The Handbook of Medicinal Chemistry (Eds.: A. Davis, S. E. Ward), RSC Press, Cambridge, 2014; b) J. Fischer, C. R. Ganellin, D. P. Rotella, Analogue-based Drug Discovery III, Wiley-VCH, Weinheim, 2012.
[13] Progress in Medicinal Chemistry, Vol. 42 (Eds.: F. D. King, G. Lawton, A. W. Oxford), Elsevier, Amsterdam, 2004.
[14] Latest volume: Annual Reports in Medicinal Chemistry, Vol. 49 (Ed.: M. C. Desai), Elsevier, Amsterdam, 2014.
[15] D. Middlemiss, S. P. Watson, Tetrahedron 1994, 50, 13049–13080.
[16] M. S. Egbertson, N. J. Anthony in HIV-1 Integrase: Mechanism and Inhibitor Design (Ed.: N. Neamati), John Wiley & Sons, New York, 2011, pp. 197–228.
[17] C. A. Lipinski, N. K. Litterman, C. Southan, A. J. Williams, A. M. Clark, S. Ekins, J. Med. Chem. 2015, 58, 2068–2076.
[18] S. J. Lusher, R. McGuire, R. C. van Schaik, C. D. Nicholson, J. de Vlieg, Drug Discovery Today 2014, 19, 859–868.
[19] WikiPathways, www.wikipathways.org; accessed March 20, 2015.
[20] D. M. Andrews, M. E. Swarbrick, A. T. Merritt, Drug Discovery Today 2014, 19, 496–501.
[21] K. Knorr-Cetina, Die Fabrikation von Erkenntnis. Zur Anthropologie der Naturwissenschaft, Suhrkamp, Frankfurt, 2012.
[22] E. Martin, P. Mukherjee, J. Chem. Inf. Model. 2012, 52, 156–170.
[23] a) O. P. J. van Linden, A. J. Kooistra, R. Leurs, I. J. P. de Esch, C. de Graaf, J. Med. Chem. 2014, 57, 249–277; b) N. Furtmann, Y. Hu, J. Bajorath, J. Med. Chem. 2015, 58, 252–264.
[24] Ernst Haeckel drew one of the most well-known trees of life in 1879 in his work “The Evolution of Man”, a welcome foundation for racist ideology with all its devastating consequences in the first half of the 20th century.
[25] S. Toulmin, Human Understanding. The Collective Use and Evolution of Concepts, Princeton University Press, Princeton, 1972.
[26] D. Oehlrich, H. Prokopcova, H. J. M. Gijsen, Bioorg. Med. Chem. Lett. 2014, 24, 2033–2045.
[27] A. T. Chadwick, M. D. Segall, Drug Discovery Today 2010, 15, 561–569.
[28] a) G. A. Patani, E. J. LaVoie, Chem. Rev. 1996, 96, 3147–3176; b) R. P. Sheridan, J. Chem. Inf. Comput. Sci. 2002, 42, 103–108; c) N. A. Meanwell, J. Med. Chem. 2011, 54, 2529–2591.
[29] H.-J. Böhm, A. Flohr, M. Stahl, Drug Discovery Today Technol. 2004, 1, 217–224.
[30] A. G. Dossetter, E. J. Griffen, A. G. Leach, Drug Discovery Today 2013, 18, 724–731.
[31] H.-J. Rheinberger, Experimentalsysteme und Epistemische Dinge, Wallstein Verlag, Göttingen, 2002.
[32] An interesting recent example: D. G. Brown, M. M. Gagnon, J. Boström, J. Med. Chem. 2015, 58, 2390–2405.
[33] a) A. L. Hopkins, C. R. Groom, A. Alex, Drug Discovery Today 2004, 9, 430–431; b) M. D. Shultz, Bioorg. Med. Chem. Lett. 2013, 23, 5980–5991.
[34] D. R. Cheshire, Drug Discovery Today 2011, 16, 817–821.
[35] G. R. Robb, D. McKerrecher, N. J. Newcombe, M. J. Waring, Drug Discovery Today 2013, 18, 141–147.
[36] J. G. March, Organ. Sci. 1991, 2, 71–87.
[37] D. Cook, D. Brown, R. Alexander, R. March, P. J. Morgan, G. Satterthwaite, M. N. Pangalos, Nat. Rev. Drug Discovery 2014, 13, 419–431.
[38] R. P. Feynman, Cargo Cult Science, calteches.library.caltech.edu/51/2/CargoCult.pdf, accessed March 20, 2015.
[39] G. Wilson, Am. Sci. 2009, 97, 360.
[40] a) P. C. Wong, J. Thomas, IEEE Comp. Graph. 2004, 24, 20–21; b) J. J. Thomas, K. A. Cook, Illuminating the Path: The Research and Development Agenda for Visual Analytics, IEEE Press, Piscataway, NJ, 2005.
[41] H. Hilpert, H. Mauser, R. Humm, L. Anselm, H. Kuehne, G. Hartmann, S. Gruener, D. W. Banner, J. Benz, B. Gsell, A. Kuglstatter, M. Stihle, R. Thoma, R. Alvarez Sanchez, H. Iding, B. Wirz, W. Haap, J. Med. Chem. 2013, 56, 9789–9801.
[42] P. Filippakopoulos, J. Qi, S. Picaud, Y. Shen, W. B. Smith, O. Fedorov, E. M. Morse, T. Keates, T. T. Hickman, I. Felletar, M. Philpott, S. Munro, M. R. McKeown, Y. Wang, A. L. Christie, N. West, M. J. Cameron, B. Schwartz, T. D. Heightman, N. La Thangue, C. A. French, O. Wiest, A. L. Kung, S. Knapp, J. E. Bradner, Nature 2010, 468, 1067–1073.
[43] W. Sneader, Drug Prototypes and Their Exploitation, Wiley, Chichester, 1996, pp. 276–277.


Received: February 27, 2015


ESSAY If we were to visualize the evolutionary growth of medicinal chemistry over time and across target families, molecule by molecule, what picture would emerge? Do we have adequate language to describe the research process?


M. Stahl,* S. Baier: How Many Molecules Does It Take to Tell a Story? Case Studies, Language, and an Epistemic View of Medicinal Chemistry
