Accepted Manuscript

Many Drugs Contain Unique Scaffolds with Varying Structural Relationships to Scaffolds of Currently Available Bioactive Compounds

Ye Hu, Jürgen Bajorath

PII: S0223-5234(14)00160-3
DOI: 10.1016/j.ejmech.2014.02.040
Reference: EJMECH 6757
To appear in: European Journal of Medicinal Chemistry
Received Date: 11 January 2014
Revised Date: 10 February 2014
Accepted Date: 14 February 2014

Please cite this article as: Y. Hu, J. Bajorath, Many Drugs Contain Unique Scaffolds with Varying Structural Relationships to Scaffolds of Currently Available Bioactive Compounds, European Journal of Medicinal Chemistry (2014), doi: 10.1016/j.ejmech.2014.02.040. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Original Paper

Many Drugs Contain Unique Scaffolds with Varying Structural Relationships to Scaffolds of Currently Available Bioactive Compounds

Ye Hu and Jürgen Bajorath*

Department of Life Science Informatics, B-IT, LIMES, Program Unit Medicinal Chemistry and Chemical Biology, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn (Germany)

Graphical abstract

Shown is a representative drug-unique scaffold (left) that is involved in three different types of structural relationships with scaffolds derived from bioactive compounds (right). Structural differences are highlighted.


Abstract

Molecular scaffolds were systematically extracted from approved drugs and analyzed. The majority of drug scaffolds, 552 of 700, were found to represent only a single drug. Moreover, 221 drug scaffolds were not detected in currently available bioactive compounds, i.e., the pool from which drug candidates usually originate. These “drug-unique” scaffolds displayed a variety of structural relationships to currently known bioactive scaffolds, reflecting rather different degrees of relatedness. Many drug-unique scaffolds formed only very limited structural relationships to bioactive scaffolds. These drug scaffolds should represent promising candidates for further chemical exploration and drug repositioning efforts and are made freely available.

Key words: Approved drugs, bioactive compounds, scaffolds, drug-unique scaffolds, substructure relationships, matched molecular pairs, topological equivalence

*Corresponding author, Tel: +49-228-2699-306, Fax: +49-228-2699-341, E-mail: [email protected]

1. Introduction

The scaffold concept is popular in medicinal chemistry because it enables the organization of compounds according to core structures, the association of cores with biological compound activities, the search for privileged substructures, and the design of new compound series [1,2]. In addition, the scaffold concept has been applied to analyze and compare different drugs and to better understand key structural features [3-5]. Scaffolds can be defined in different ways [2]. The scaffold definition most widely applied in medicinal chemistry was originally introduced by Bemis and Murcko [3]. This definition follows a molecular hierarchy by dividing compounds into R-groups, linkers, and rings. Accordingly, Bemis-Murcko (BM) scaffolds are obtained from compounds by removing R-groups but retaining aliphatic linker fragments between rings [3]. BM scaffolds therefore generally represent cores consisting of single or multiple ring systems that can be connected in different ways, and they account for molecular topology. From these scaffolds, one can further abstract by converting all heteroatoms to carbon and setting all bond orders to one [3,6]. These modifications generate so-called cyclic skeletons (CSKs) [6]. Hence, a given CSK covers a set of topologically equivalent scaffolds that are only distinguished by heteroatom substitutions and/or bond orders. It follows that different CSKs represent topologically distinct scaffolds.
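The two abstraction steps described above can be illustrated on a toy molecular graph. The following is a minimal sketch and not the implementation used in this study: molecules are assumed to be given as plain dictionaries of atoms and bonds, the function names `bm_scaffold`, `cyclic_skeleton`, and `ring_with_ethyl` are hypothetical, and R-groups are stripped by repeatedly removing terminal atoms (which would also remove exocyclic double-bonded atoms that some scaffold variants retain). Comparing two CSKs by dictionary equality only works here because both toy molecules share the same atom numbering; a real workflow would rely on a cheminformatics toolkit and canonical structure representations.

```python
from collections import defaultdict


def bm_scaffold(atoms, bonds):
    """Strip R-groups from a toy molecular graph.

    atoms: dict mapping atom index -> element symbol
    bonds: dict mapping frozenset({i, j}) -> bond order
    Atoms with at most one neighbor are removed repeatedly, so only
    rings and the linkers between rings remain.
    """
    atoms, bonds = dict(atoms), dict(bonds)
    while True:
        degree = defaultdict(int)
        for pair in bonds:
            for index in pair:
                degree[index] += 1
        terminal = {index for index in atoms if degree[index] <= 1}
        if not terminal:
            return atoms, bonds
        atoms = {i: s for i, s in atoms.items() if i not in terminal}
        bonds = {p: o for p, o in bonds.items() if not (p & terminal)}


def cyclic_skeleton(atoms, bonds):
    """Convert every atom to carbon and every bond order to 1."""
    return {i: "C" for i in atoms}, {p: 1 for p in bonds}


def ring_with_ethyl(ring_atom_symbol):
    """Six-membered aromatic ring (atom 0 variable) carrying an ethyl group."""
    atoms = {i: "C" for i in range(8)}
    atoms[0] = ring_atom_symbol
    ring_orders = [2, 1, 2, 1, 2, 1]
    bonds = {frozenset({i, (i + 1) % 6}): ring_orders[i] for i in range(6)}
    bonds[frozenset({1, 6})] = 1  # ring atom to CH2
    bonds[frozenset({6, 7})] = 1  # CH2 to CH3
    return atoms, bonds


# Two scaffolds that differ only by one heteroatom ...
pyridine_like = bm_scaffold(*ring_with_ethyl("N"))
benzene_like = bm_scaffold(*ring_with_ethyl("C"))
# ... collapse onto the same cyclic skeleton.
same_topology = cyclic_skeleton(*pyridine_like) == cyclic_skeleton(*benzene_like)
```

In this toy case the ethyl substituent is removed in two pruning rounds, the two six-membered ring scaffolds remain distinct, and both map to the same all-carbon, single-bonded ring.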

A primary focal point of scaffold analysis in medicinal chemistry has been, and continues to be, the association of scaffolds with biological activities of the compounds they represent [7-11]. Different approaches have been introduced to systematically derive and organize scaffolds on the basis of retrosynthetic information [12], structural similarity criteria [13], structural rule-based scaffold decomposition [14], or compound-scaffold-CSK hierarchies [15]. Such structural organization schemes can substantially aid in the association of scaffolds with biological activities and the analysis of structure-activity relationships (SARs). For example, the Layered Skeleton-Scaffold Organization (LASSO) graph has been used to systematically explore SARs in compound data sets along molecular hierarchies [15]. Moreover, the Scaffold Tree, which is based on structural rule-based decomposition [14], has not only been utilized in SAR analysis but also to generate virtual scaffolds within experimental scaffold hierarchies for the prediction of novel active compounds [11,16,17].

While scaffolds in active compounds and drugs have been analyzed in a variety of ways, as discussed above, only a few investigations have thus far systematically compared drug scaffolds with scaffolds originating from other bioactive non-drug compounds. Such comparisons might help to better understand whether there are specific differences between core structures from drugs and bioactive compounds. In one study, it was shown that there was only limited overlap between BM scaffolds isolated from sets of compounds at different pharmaceutical development stages including active compounds, compounds in clinical trials, and drugs [18]. In another analysis, ~13,000 BM scaffolds were extracted from compounds active against ~450 targets belonging to 19 different families and their activity profiles were determined [19]. More than 400 scaffolds were identified that were active against targets from at least two target families and a subset of 83 scaffolds was active against targets from three to 13 families. These 83 scaffolds yielded 33 distinct CSKs, 17 of which were detected in more than 200 approved drugs. Hence, this analysis demonstrated that scaffolds with multi-target activities were well represented in current drugs [19], consistent with the notion of drug polypharmacology [20,21].

The collection of scaffolds extracted from available approved drugs can be rationalized as a basic structural representation of known drug space [22], which is distinct from drug-like chemical space. Current known drug space has many origins reflecting the history of drug discovery and development (including classical pharmacological testing, molecular approaches, rational design, etc.). To further extend the exploration of drug scaffolds and analysis of known drug space, we have carried out a systematic structural comparison of scaffolds from approved drugs and from the large pool of currently available bioactive compounds that addresses three previously unexplored questions. First, how are scaffolds distributed across approved drugs? Second, to what extent are drug scaffolds represented in bioactive compounds? Third, which structural relationships exist between drug scaffolds and scaffolds from bioactive compounds? Our analysis has yielded a number of rather unexpected findings that are reported herein.

2. Experimental

2.1. Scaffolds from bioactive compounds and drugs

From the latest version of ChEMBL (release 17) [23], compounds with direct interactions (i.e., target relationship type “D”) against human targets at the highest confidence level (i.e., target confidence score 9) and available equilibrium constants (Ki values) as activity measurements were extracted. From DrugBank 3.0 [24], approved small molecule drugs with available structures and activity information were collected. From all bioactive compounds and approved drugs, BM scaffolds [3] and CSKs [6] were isolated. As a control, scaffolds were also extracted from bioactive compounds for which IC50 measurements were available. In the following, BM scaffolds are simply referred to as scaffolds and scaffolds extracted from bioactive compounds and drugs are designated bioactive scaffolds and drug scaffolds, respectively. Bioactive scaffolds and drug scaffolds were systematically compared. Initially, the overlap between these two scaffold sets was determined. Then, different types of structural relationships between a subset of drug scaffolds and bioactive scaffolds were systematically explored.
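The selection criteria for high-confidence bioactive compounds stated above amount to a conjunction of four conditions. The following minimal sketch expresses them as a record filter. The dictionary keys (`relationship_type`, `confidence_score`, `organism`, `measurement`) and the compound labels are assumptions chosen for illustration only; they are not meant to reproduce the actual ChEMBL schema or the queries used in this study.

```python
def is_selected(record):
    """Apply the four selection criteria to one activity record."""
    return (
        record["relationship_type"] == "D"        # direct interaction
        and record["confidence_score"] == 9       # highest confidence level
        and record["organism"] == "Homo sapiens"  # human target
        and record["measurement"] == "Ki"         # equilibrium constant
    )


# Hypothetical activity records (not taken from ChEMBL).
records = [
    {"compound": "cpd-1", "relationship_type": "D", "confidence_score": 9,
     "organism": "Homo sapiens", "measurement": "Ki"},
    {"compound": "cpd-2", "relationship_type": "D", "confidence_score": 9,
     "organism": "Homo sapiens", "measurement": "IC50"},
    {"compound": "cpd-3", "relationship_type": "D", "confidence_score": 8,
     "organism": "Homo sapiens", "measurement": "Ki"},
]

selected = [r["compound"] for r in records if is_selected(r)]
```

Under these assumptions, only the first record qualifies; the IC50 record would instead belong to the control set with assay-dependent measurements.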

2.2. Structural relationships

Three types of structural relationships between drug scaffolds and bioactive scaffolds were analyzed:

(1) Substructure relationship: a scaffold is contained as a substructure in another. Benzene, the most generic scaffold, was excluded from the assessment of substructure relationships (to avoid an inflation of substructure matches for this scaffold). The size of scaffolds with substructure relationships was compared by determining the difference in the number of rings they contained.

(2) Topology relationship: if two scaffolds share the same topology, they yield the same CSK. Cyclohexane, the CSK of benzene, was excluded from the assessment of CSK equivalence.

(3) Matched molecular pair (MMP) relationship: an MMP is formed by compounds that differ only at a single site by the exchange of a pair of substructures [25], termed a chemical transformation [26]. Transformation size-restricted MMPs [25,27] were calculated for drug scaffolds vs. bioactive scaffolds using our implementation of the algorithm by Hussain and Rea [26] that utilizes the OpenEye toolkit [28]. Size-restricted MMPs limit chemical transformations to exchanges of typical R-groups that are not larger than a substituted or condensed ring [27]. Scaffolds forming an MMP were permitted to differ in size by at most eight heavy atoms. Hence, formation of an MMP between a drug and a bioactive scaffold indicated that these scaffolds were structurally analogous and only distinguished by a structural modification of limited size. Scaffolds forming an MMP thus shared a maximum common substructure, but one scaffold was not necessarily a substructure of the other. This distinguished substructure and MMP-based relationships.

These three different types of structural relationships are illustrated in Figure 1.
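Of the three relationship types, CSK equivalence is the simplest to express computationally because it reduces to grouping scaffolds by a shared key. The following minimal sketch assumes that a CSK has already been computed for every scaffold by a cheminformatics toolkit and is available as a SMILES-like string; the function name `csk_equivalences` and the four example scaffolds are purely illustrative and are not taken from the data of this study.

```python
from collections import defaultdict

CYCLOHEXANE = "C1CCCCC1"  # CSK of benzene, excluded from the assessment


def csk_equivalences(drug_csk, bioactive_csk, excluded=(CYCLOHEXANE,)):
    """Pair each drug scaffold with all bioactive scaffolds sharing its CSK.

    Both arguments map a scaffold identifier to its precomputed CSK string.
    """
    by_csk = defaultdict(list)
    for scaffold, csk in bioactive_csk.items():
        by_csk[csk].append(scaffold)
    pairs = []
    for scaffold, csk in drug_csk.items():
        if csk in excluded:
            continue
        pairs.extend((scaffold, partner) for partner in by_csk.get(csk, []))
    return pairs


# Illustrative input: benzimidazole and pyridine vs. indole and benzene.
drug_csk = {
    "c1ccc2[nH]cnc2c1": "C1CCC2CCCC2C1",
    "c1ccncc1": "C1CCCCC1",
}
bioactive_csk = {
    "c1ccc2[nH]ccc2c1": "C1CCC2CCCC2C1",
    "c1ccccc1": "C1CCCCC1",
}

pairs = csk_equivalences(drug_csk, bioactive_csk)
```

In this example, the benzimidazole and indole scaffolds form one CSK equivalence, whereas the pyridine and benzene scaffolds are not paired because their shared CSK, cyclohexane, is excluded.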

3. Results and discussion

3.1. Drug scaffolds vs. bioactive scaffolds

A total of 45,353 compounds from ChEMBL with available assay-independent equilibrium constants met our selection criteria. From these compounds, 16,250 unique scaffolds were obtained. On average, each bioactive scaffold represented 2.8 compounds. Approx. 66% of the bioactive scaffolds were only found in a single compound. In addition, from 95,685 compounds for which assay-dependent activity measurements were available, a total of 36,257 unique scaffolds were extracted.

From DrugBank, 1241 approved drugs with structure and activity information were obtained that yielded 700 unique scaffolds. On average, each drug scaffold was contained in only ~1.8 approved drugs. Surprisingly, 552 of the 700 drug scaffolds (~79%) represented only a single approved drug.
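The averages and percentages quoted in this section follow directly from the reported counts. As a brief worked check (the variable names are chosen here for illustration):

```python
# Counts reported in the text.
bioactive_compounds, bioactive_scaffolds = 45_353, 16_250
approved_drugs, drug_scaffolds = 1241, 700
single_drug_scaffolds = 552

compounds_per_bioactive_scaffold = bioactive_compounds / bioactive_scaffolds
drugs_per_drug_scaffold = approved_drugs / drug_scaffolds
single_drug_share = 100 * single_drug_scaffolds / drug_scaffolds

print(f"{compounds_per_bioactive_scaffold:.1f} compounds per bioactive scaffold")
print(f"{drugs_per_drug_scaffold:.1f} drugs per drug scaffold")
print(f"{single_drug_share:.0f}% of drug scaffolds represent a single drug")
```

The three values round to 2.8, 1.8, and 79%, in agreement with the figures given above.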

In the next step, we organized the scaffolds by the number of rings they contained, as reported in Figure 2. Fused ring systems were separated and counted as individual rings using the Molecular Operating Environment [29] (e.g., a benzofuran moiety yielded a ring count of two). Approximately 61% of drug scaffolds but only ~28% of bioactive scaffolds associated with high-confidence activity data contained one to three rings, and the distribution of bioactive scaffolds was shifted towards larger ring numbers compared to drug scaffolds. Thus, bioactive scaffolds were on average larger than drug scaffolds, consistent with higher molecular weight (on average, 358.4 for bioactive scaffolds compared to 276.5 for drug scaffolds). The differences in ring counts played a role in the assessment of substructure relationships, as discussed below.

3.2. Drug-unique scaffolds

We then systematically compared the structures of 700 drug scaffolds and 16,250 bioactive scaffolds from compounds with available equilibrium constants. Surprisingly, 339 drug scaffolds (~48%) were not detected in bioactive compounds with high-confidence activity data, although one might expect that many late-stage structural analogs are available for compounds that ultimately become drug candidates (and at least a few such analogs in the public domain). However, nearly half of the drug scaffolds were not matched by bioactive scaffolds with associated high-confidence activity data. When these 339 drug scaffolds were compared to the 36,257 bioactive scaffolds associated with approximate activity data, only 118 were detected. Thus, 221 drug scaffolds remained that were not found in any bioactive compound, and these scaffolds were designated “drug-unique scaffolds” for our further analysis. Figure 3 shows exemplary drug-unique scaffolds that contained only one or two rings and represented one or two drugs. These small scaffolds included substituted aliphatic rings and combinations of heteroatom-containing aliphatic and aromatic rings, but one would very likely not have predicted that they exclusively occurred in drugs and were not found in other bioactive compounds. Of course, the absence of identical scaffolds did not preclude the presence of bioactive scaffolds that were structurally (closely) related to drug-unique scaffolds. Therefore, structural relationships between drug-unique and bioactive scaffolds with high-confidence activity data were systematically explored.
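The identification of drug-unique scaffolds described in this section corresponds to two successive set differences. The following minimal sketch assumes that each scaffold is represented by a canonical string so that identical scaffolds compare equal; the function name `find_drug_unique` and the toy scaffold labels are hypothetical.

```python
def find_drug_unique(drug, high_confidence, approximate):
    """Return (unmatched, drug_unique) scaffold sets.

    unmatched:   drug scaffolds absent from the high-confidence set
    drug_unique: unmatched scaffolds also absent from the control set
    """
    unmatched = drug - high_confidence
    drug_unique = unmatched - approximate
    return unmatched, drug_unique


# Toy example with placeholder scaffold labels.
drug = {"s1", "s2", "s3", "s4"}
high_confidence = {"s1", "s9"}
approximate = {"s2", "s8"}
unmatched, drug_unique = find_drug_unique(drug, high_confidence, approximate)

# The counts reported in the text follow the same two steps.
reported_unmatched = 339
reported_detected_in_control = 118
reported_drug_unique = reported_unmatched - reported_detected_in_control
unmatched_share = 100 * reported_unmatched / 700
```

For the reported counts, the second step leaves 221 drug-unique scaffolds, and the 339 unmatched scaffolds correspond to ~48% of all 700 drug scaffolds.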

3.3. Structural relationships

Different types of structural relationships were analyzed according to Figure 1. Substructure relationships accounted for the structural extension or reduction of scaffolds, CSK equivalences established conserved topology, and MMP formation captured structurally analogous scaffolds with chemical modifications. Table 1 reports the total numbers of individual structural relationships formed between drug-unique and bioactive scaffolds and the numbers of drug-unique scaffolds involved in these relationships. We found that 203 drug-unique scaffolds formed a total of 788 substructure relationships. In addition, 123 drug-unique scaffolds formed 2274 CSK-based topological equivalences and 64 scaffolds formed 743 MMP relationships. The distribution of each type of structural relationship is reported in Figure 4. The majority of drug-unique scaffolds were involved in two to five substructure relationships. For CSK equivalences and MMP relationships, comparable distributions were observed.

Substructure relationships, in which most drug-unique scaffolds were involved, principally differed from CSK and MMP relationships in that related scaffolds might have very different sizes, i.e., a small scaffold can be a substructure of a much larger one. Hence, scaffolds of significantly different size that form substructure relationships only have a limited degree of structural similarity and should be distinguished from substructure-related scaffolds having comparable size. Therefore, the size difference of scaffolds with substructure relationships was analyzed, as reported in Table 2. The differences in the number of rings between drug-unique and bioactive scaffolds with substructure relationships ranged from one to 15. The majority of scaffolds forming substructure relationships differed by one to five rings. In order to limit the consideration of substructure relationships to scaffolds of comparable size, we excluded pairs of scaffolds that differed by more than two rings. This reduced the total number of substructure relationships from 788 to 359, which involved 159 drug-unique scaffolds. The corresponding adjusted distribution of substructure relationships is reported in Figure 5a. On the basis of the adjusted number of substructure relationships, the distribution of the total number of all structural relationships including CSK equivalences and MMP relationships was calculated, as reported in Figure 5b.

There were 35 drug-unique scaffolds that did not form any structural relationships with bioactive scaffolds and 24 scaffolds that were only involved in one structural relationship. Examples are shown in Figure 6. These scaffolds were often chemically unusual and rather “non-drug-like”. By contrast, there were 23 drug-unique scaffolds that were involved in more than 50 structural relationships with bioactive scaffolds. Table 3 shows eight representative drug-unique scaffolds and reports the structural relationships they formed. These scaffolds were chemically more conventional and drug-like than the scaffolds in Figure 6 and displayed varying degrees of relatedness to bioactive scaffolds. A drug-unique scaffold involved in 80 structural relationships and examples of related bioactive scaffolds are shown in Figure 7.
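The size adjustment of substructure relationships can be retraced from the ring-difference counts in Table 2. The following short sketch uses the reported counts for ring differences of one to five together with the reported total; the function name `retained` and its `max_ring_difference` parameter are chosen here for illustration.

```python
# Substructure relationships per ring difference (Table 2, one to five rings).
ring_difference_counts = {1: 151, 2: 208, 3: 190, 4: 106, 5: 56}
total_relationships = 788


def retained(counts, max_ring_difference=2):
    """Number of relationships between scaffolds of comparable size."""
    return sum(n for delta, n in counts.items() if delta <= max_ring_difference)


kept = retained(ring_difference_counts)
removed = total_relationships - kept
share_up_to_five = sum(ring_difference_counts.values()) / total_relationships
```

Keeping only pairs that differ by at most two rings leaves 151 + 208 = 359 relationships and removes the remaining 429. Pairs differing by one to five rings account for 711 of 788 relationships (~90%), consistent with the statement that the majority of substructure-related scaffolds fall into this range.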

3.4. Further characterization of drug-unique scaffolds

In addition to assessing structural relationships to bioactive compounds, we also further characterized drug-unique scaffolds from different viewpoints. For example, we determined which of these scaffolds originated from drugs that were available prior to 1997 when drug-likeness concepts and investigations became popular [30]. From ChEMBL’s DrugStore, the year of approval of each drug yielding a drug-unique scaffold was extracted or, if not available, the year of the United States Adopted Name (USAN) of a marketed drug. In cases where neither approval date nor USAN was available, the year of the first publication of the drug was determined. For 185 of 240 approved drugs representing our set of 221 drug-unique scaffolds, approval, USAN, or publication dates were obtained. Among these, 135 drugs were found to be available by 1997, yielding 122 drug-unique scaffolds. The remaining 50 drugs, which became available after 1997, represented 49 drug-unique scaffolds. Hence, many drug-unique scaffolds have been available for 17 or more years, prior to systematic investigations of drug-likeness, and have apparently been only little further explored since then; a rather surprising finding.
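The date assignment described above follows a simple order of precedence: approval year, then USAN year, then year of first publication. The following minimal sketch assumes that each date is available as an integer year or is missing (`None`), and that “available by 1997” includes the year 1997 itself; the function names and the example years are hypothetical.

```python
def availability_year(approval=None, usan=None, first_publication=None):
    """Return the first available year in order of precedence, else None."""
    for year in (approval, usan, first_publication):
        if year is not None:
            return year
    return None


def available_by(year, cutoff=1997):
    """True if a year is known and not later than the cutoff."""
    return year is not None and year <= cutoff


# Hypothetical drugs with partially missing date records.
with_approval = availability_year(approval=1985, usan=1983)
usan_only = availability_year(usan=2001, first_publication=1998)
publication_only = availability_year(first_publication=1996)
undated = availability_year()

# Consistency of the reported drug counts.
dated_drugs = 135 + 50
undated_drugs = 240 - dated_drugs
```

With these inputs, the approval year takes precedence over an earlier USAN year, and a drug without any date record is not assigned to either period. The reported counts are internally consistent: 135 + 50 = 185 dated drugs, leaving 55 of the 240 drugs without date information.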

In addition, we considered oral bioavailability of drugs from which drug-unique scaffolds originated. Because oral availability records cannot be systematically assembled from DrugBank or ChEMBL, we carried out rule-of-five (RO5) [31] calculations as a proxy for oral availability. For all 240 approved drugs, RO5 descriptors were calculated and their rules assessed using MOE [29] (i.e., no more than five hydrogen bond donors; no more than 10 hydrogen bond acceptors; logP value not larger than 5; molecular weight less than 500 Da). A total of 180 drugs representing 167 drug-unique scaffolds passed RO5 with no more than one rule violation. Hence, the majority of drug-unique scaffolds originated from drugs with oral availability potential.

Furthermore, we also determined to what extent drug scaffolds contained undesired chemical moieties (see, for example, the potentially unstable N-S bond in the scaffold shown in Figure 7). Therefore, all 240 approved drugs were screened against a set of 35 undesired moieties [32]. A total of 48 drugs yielding 45 drug-unique scaffolds were found to contain one or more undesirable moieties. However, many of these undesired moieties occurred as R-groups and were not a part of the scaffolds. Hence, only a small subset of drug-unique scaffolds was associated with likely or potential chemical liabilities.
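The rule-of-five assessment used in this section can be written as a count of rule violations with a tolerance of one. The following minimal sketch applies the four thresholds stated above to precomputed descriptor values; it does not calculate descriptors itself (MOE was used for this purpose in the study), and the function names and example descriptor values are hypothetical.

```python
def ro5_violations(h_donors, h_acceptors, logp, mol_weight):
    """Count violated rules for one compound."""
    return sum([
        h_donors > 5,       # no more than five hydrogen bond donors
        h_acceptors > 10,   # no more than 10 hydrogen bond acceptors
        logp > 5,           # logP value not larger than 5
        mol_weight >= 500,  # molecular weight less than 500 Da
    ])


def passes_ro5(h_donors, h_acceptors, logp, mol_weight, max_violations=1):
    """True if at most max_violations rules are violated."""
    return ro5_violations(h_donors, h_acceptors, logp, mol_weight) <= max_violations


# Hypothetical descriptor values for three compounds.
no_violation = passes_ro5(2, 6, 3.1, 420.0)
one_violation = passes_ro5(1, 4, 5.6, 350.0)
two_violations = passes_ro5(6, 12, 4.0, 480.0)

passing_share = 100 * 180 / 240
```

A compound with a single violation (here, logP above 5) still passes, whereas two violations lead to rejection. The reported 180 passing drugs correspond to 75% of the 240 drugs analyzed.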

3.5. Application of drug-unique scaffolds

The drug-unique scaffolds represented a diverse structural spectrum, and the majority of these scaffolds were only involved in one to five structural relationships with bioactive scaffolds. These findings indicated that these drug-unique scaffolds were overall only little chemically explored in other bioactive compounds, which should make them attractive candidates for further consideration, for example, in the broader context of drug repositioning [33]. Finding new applications for existing drugs either requires direct association of individual drugs with new targets or, alternatively, further exploration of chemical space narrowly confined to known drugs. For the latter application, the set of drug-unique scaffolds identified in our analysis should provide interesting candidates. Furthermore, the target profiles of bioactive scaffolds forming close structural relationships with individual drug-unique scaffolds might also be considered in the context of drug repositioning. To support these or related investigations, the drug-unique scaffolds we identified in our analysis are made freely available via the following URL: http://www.lifescienceinformatics.uni-bonn.de (‘downloads’ section).

4. Conclusions

In this study, we have systematically extracted scaffolds from approved drugs and currently available bioactive compounds, compared these scaffolds, and explored their structural relationships. Major goals of our analysis were to determine the distribution of scaffolds among approved drugs and the distribution of drug scaffolds among current bioactive compounds. The majority of 700 drug scaffolds we obtained represented only individual drugs and, unexpectedly, more than 200 of these scaffolds were not detected in bioactive compounds. For these designated drug-unique scaffolds, structural relationships to bioactive scaffolds were also systematically determined. Many drug-unique scaffolds displayed only limited structural relationships to bioactive scaffolds, hence making these drug scaffolds attractive candidates for further exploration.

References

[1] N. Brown, E. Jacoby, On scaffolds and hopping in medicinal chemistry, Mini-Rev. Med. Chem. 6 (2006) 1217-1229.

[2] Y. Hu, D. Stumpfe, J. Bajorath, Lessons learned from molecular scaffold analysis, J. Chem. Inf. Model. 51 (2011) 1742-1753.

[3] G.W. Bemis, M.A. Murcko, The properties of known drugs. 1. Molecular frameworks, J. Med. Chem. 39 (1996) 2887-2893.

[4] H.B. Broughton, I.A. Watson, Selection of heterocycles for drug design, J. Mol. Graph. Model. 23 (2004) 51-58.

[5] J. Wang, T. Hou, Drug and drug candidate building block analysis, J. Chem. Inf. Model. 50 (2010) 55-67.

[6] Y.-J. Xu, M. Johnson, Using molecular equivalence numbers to visually explore structural features that distinguish chemical libraries, J. Chem. Inf. Comput. Sci. 42 (2002) 912-926.

[7] R.P. Sheridan, Finding multiactivity substructures by mining databases of drug-like compounds, J. Chem. Inf. Comput. Sci. 43 (2003) 1037-1050.

[8] G. Müller, Medicinal chemistry of target family-directed masterkeys, Drug Discov. Today 8 (2003) 681-691.

[9] D.M. Schnur, M.A. Hermsmeier, A.J. Tebben, Are target-family-privileged substructures truly privileged? J. Med. Chem. 49 (2006) 2000-2009.

[10] J.J. Sutherland, R.E. Higgs, I. Watson, M. Vieth, Chemical fragments as foundations for understanding target space and activity prediction, J. Med. Chem. 51 (2008) 2689-2700.

[11] S. Renner, W.A.L. Van Otterlo, M.D. Seoane, S. Möcklinghoff, B. Hoffmann, S. Wetzel, A. Schuffenhauer, P. Ertl, T.I. Oprea, D. Steinhilber, L. Brunsveld, D. Rauh, H. Waldmann, Bioactivity-guided mapping and navigation of chemical space, Nat. Chem. Biol. 5 (2009) 585-592.

[12] X.Q. Lewell, D.B. Judd, S.P. Watson, M.M. Hann, RECAP - retrosynthetic combinatorial analysis procedure: a powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry, J. Chem. Inf. Comput. Sci. 38 (1998) 511-522.

[13] S.J. Wilkens, J. Janes, A.I. Su, HierS: hierarchical scaffold clustering using topological chemical graphs, J. Med. Chem. 48 (2005) 3182-3193.

[14] A. Schuffenhauer, P. Ertl, S. Roggo, S. Wetzel, M.A. Koch, H. Waldmann, The scaffold tree - visualization of the scaffold universe by hierarchical scaffold classification, J. Chem. Inf. Model. 47 (2007) 47-58.

[15] D. Gupta-Ostermann, Y. Hu, J. Bajorath, Introducing the LASSO graph for compound data set representation and structure-activity relationship analysis, J. Med. Chem. 55 (2012) 5546-5553.

[16] S. Wetzel, K. Klein, S. Renner, D. Rauh, T.I. Oprea, P. Mutzel, H. Waldmann, Interactive exploration of chemical space with scaffold hunter, Nat. Chem. Biol. 5 (2009) 581-583.

[17] S. Wetzel, W. Wilk, S. Chammaa, B. Sperl, A.G. Roth, A. Yektaoglu, S. Renner, T. Berg, A. Arenz, A. Giannis, T.I. Oprea, D. Rauh, M. Kaiser, H. Waldmann, A scaffold-tree-merging strategy for prospective bioactivity annotation of γ-pyrones, Angew. Chem. Intl. Ed. 122 (2010) 3748-3752.

[18] Y. Hu, J. Bajorath, Scaffold distributions in bioactive molecules, clinical trials compounds, and drugs, ChemMedChem 5 (2010) 187-190.

[19] Y. Hu, J. Bajorath, Polypharmacology directed data mining: identification of promiscuous chemotypes with different activity profiles and comparison to approved drugs, J. Chem. Inf. Model. 50 (2010) 2112-2118.

[20] G.V. Paolini, R.H.B. Shapland, W.P. van Hoorn, J.S. Mason, A.L. Hopkins, Global mapping of pharmacological space, Nat. Biotechnol. 24 (2006) 805-815.

[21] A.D.W. Boran, R. Iyengar, Systems approaches to polypharmacology and drug discovery, Curr. Opin. Drug Discov. Devel. 13 (2010) 297-309.

[22] R. Bade, H.-F. Chan, J. Reynisson, Characteristics of known drug space. Natural products, their derivatives, and synthetic drugs, Eur. J. Med. Chem. 45 (2010) 5646-5652.

[23] A. Gaulton, L.J. Bellis, A.P. Bento, J. Chambers, M. Davies, A. Hersey, Y. Light, S. McGlinchey, D. Michalovich, B. Al-Lazikani, J.P. Overington, ChEMBL: a large-scale bioactivity database for drug discovery, Nucleic Acids Res. 40 (2012) D1100-D1107.

[24] C. Knox, V. Law, T. Jewison, P. Liu, S. Ly, A. Frolkis, A. Pon, K. Banco, C. Mak, V. Neveu, Y. Djoumbou, R. Eisner, A.C. Guo, D.S. Wishart, DrugBank 3.0: a comprehensive resource for ‘omics’ research on drugs, Nucleic Acids Res. 39 (2011) D1035-D1041.

[25] P.W. Kenny, J. Sadowski, Structure modification in chemical databases, in: Chemoinformatics in Drug Discovery, T.I. Oprea (Ed.), Wiley-VCH: Weinheim, Germany, 2004, pp. 271-285.

[26] J. Hussain, C. Rea, Computationally efficient algorithm to identify matched molecular pairs (MMPs) in large data sets, J. Chem. Inf. Model. 50 (2010) 339-348.

[27] X. Hu, Y. Hu, M. Vogt, D. Stumpfe, J. Bajorath, MMP-cliffs: systematic identification of activity cliffs on the basis of matched molecular pairs, J. Chem. Inf. Model. 52 (2012) 1138-1145.

[28] OEChem TK, version 20111017, release 1.7.6; OpenEye Scientific Software Inc.: Santa Fe, New Mexico.

[29] Molecular Operating Environment (MOE), 2011.10; Chemical Computing Group Inc., 1010 Sherbrooke St. West, Suite#910, Montreal, QC, Canada, H3A 2R7, 2011.

[30] P.D. Leeson, B. Springthorpe, The influence of drug-like concepts on decision-making in medicinal chemistry, Nature Rev. Drug Discov. 6 (2007) 881-890.

[31] C.A. Lipinski, F. Lombardo, B.W. Dominy, P.J. Feeney, Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings, Adv. Drug. Deliv. Rev. 23 (1997) 3-25.

[32] P. Axerio-Cilies, I.P. Castaneda, A. Mirza, J. Reynisson, Investigation of the incidence of “undesirable” molecular moieties for high-throughput screening compound libraries in marketed drug compounds, Eur. J. Med. Chem. 44 (2009) 1128-1134.

[33] C.R. Chong, D.J. Sullivan, New uses for old drugs, Nature 448 (2007) 645-646.

Figure 1. Structural relationships. Three different types of structural relationships are illustrated including (a) substructure, (b) topological (CSK equivalences), and (c) MMP-based relationships. For each pair of scaffolds, the structural differences are highlighted in red.

[Panels: (a) Substructure relationships; (b) CSK equivalences; (c) Scaffold-based MMPs]

Figure 2. Scaffold sizes. The distribution of bioactive (top) and drug scaffolds (bottom) over increasing numbers of rings is reported. Bioactive scaffolds comprising four rings and drug scaffolds comprising three rings had the highest frequency of occurrence within their respective sets.

[Panels: Bioactive scaffolds; Drug scaffolds. x-axis: # Rings (1 to 10, >10); y-axis: # Scaffolds]

Figure 3. Drug-unique scaffolds. Shown are 16 representative drug-unique scaffolds. For each scaffold, the number of corresponding drugs is reported.

Figure 4. Distribution of structural relationships. For drug-unique scaffolds, the number of structural relationships they formed with bioactive scaffolds is reported including (a) substructure, (b) CSK equivalence, and (c) MMP-based relationships.

[Panels: (a) # Substructure relationships; (b) # CSK equivalences; (c) # MMPs. x-axis bins: 0, 1, 5, 10, 20, 30, 50, 100, >100; y-axis: # Drug-unique scaffolds]

Figure 5. Distribution of size-adjusted structural relationships. Shown are (a) the distribution of substructure relationships formed between drug-unique scaffolds and bioactive scaffolds that differed in size by at most two rings and (b) the combined distribution of CSK equivalence, MMP-based, and adjusted substructure relationships according to (a).

[Panels: (a) # Substructure relationships; (b) Total # structural relationships. x-axis bins: 0, 1, 5, 10, 20, 30, 50, 100, >100; y-axis: # Drug-unique scaffolds]

Figure 6. Scaffolds with no or only one structural relationship. In (a) and (b), eight exemplary drug-unique scaffolds are shown that were not involved in any structural relationship with bioactive scaffolds or involved in only one relationship, respectively. For each scaffold, the name of a drug it represents is given.

(a) Thiotepa, Ketazolam, Ertapenem, Capreomycin, Nevirapine, Sodium stibogluconate, Ifosfamide, Porfimer

(b) Oxaliplatin, Efavirenz, Imipenem, Carboplatin, Fluconazole, Tolazamide, Eszopiclone, Arsenic trioxide

Figure 7. Scaffold involved in different structural relationships. An exemplary drug-unique scaffold is shown (center) that formed a total of 80 structural relationships. In addition, six bioactive scaffolds are shown that were involved in the three types of structural relationships with this drug-unique scaffold.

[Groups of bioactive scaffolds: Substructure relationships; CSK equivalences; MMPs]

Table 1. Structural relationships between drug-unique and bioactive scaffolds.

Types                         # Structural relationships    # Drug-unique scaffolds
Substructure relationships    788                           203
CSK equivalences              2274                          123
MMP relationships             743                           64

The numbers of structural relationships that were formed between drug-unique scaffolds and bioactive scaffolds are reported for three different types of structural relationships. In addition, the numbers of drug-unique scaffolds involved in these relationships are reported. For example, 203 drug-unique scaffolds formed a total of 788 substructure relationships.

Table 2. Size difference between scaffolds with substructure relationships.

∆ Rings    # Substructure relationships
1          151
2          208
3          190
4          106
5          56
6          16
7          19
8          16
9          4
10         6
> 10       16
SUM        788

The number of substructure relationships formed between drug-unique scaffolds and bioactive scaffolds that differed by increasing numbers of rings (∆ Rings) is reported.

Table 3. Drug-unique scaffolds with most structural relationships.

Number of structural relationships (one row per drug-unique scaffold; scaffold structures not reproduced)

Substructure    CSK    MMP    SUM
3               4      135    142
4               3      108    115
34              72     0      106
2               64     28     94
5               55     30     80
1               43     33     77
1               72     0      73
1               72     0      73

Shown are the eight drug-unique scaffolds that formed the overall largest numbers (SUM) of substructure, CSK, and MMP relationships.

Scaffolds were systematically extracted from drugs.

Drug scaffolds were compared to scaffolds from bioactive compounds.

A set of 221 drug-unique scaffolds was identified.

Structural relationships between drug-unique and bioactive scaffolds were explored.

Many drug-unique scaffolds also displayed only limited relationships to bioactive scaffolds.
