QSAR, the Bilinear Model

replaced by normal equations with fewer unknowns. Kubinyi and Kehrhahn give a number of more typical examples, and in particular their Table VI (ref 10, p 1045) shows results parallel to our Table 111. References and Notes (1) These include at least one of our own, and we are grateful to Drs. H. Kubinyi and 0. T. Kehrhahn of Knoll, AG, Ludwigshafen, for pointing out that the Free-Wilson conditions are eq 41 and 42 of the present paper rather than eq 4 of ref 2. (2) L. J. Schaad, R. H. Werner, L. Dillon, L. Field, and C. E. Tate, J . Med. Chem., 18, 344 (1975). (3) F. B. Hildebrand, “Methods of Applied Mathematics”, Prentice-Hall, New York, N.Y., 1952, Chapter 1. (4) G. E. Shilov, “Linear Algebra”, Prentice-Hall, Englewood Cliffs, N.J., 1971.

Journal of Medicinal Chemistry, 1977, Vol. 20, No. 5 625 (5) C. Hansch, Acc. Chem. Res., 2, 232 (1969), and earlier references cited in this review. (6) R. F. Gould, Ed., “Biological Correlations-The Hansch Approach”, American Chemical Society, Washington D.C., 1972. (7) S. M. Free, Jr., and J. W. Wilson, J. Med. Chem., 7, 395 (1964). (8) W. P. Purcell, G. E. Bass,and J. M. Clayton, “Strategy of Drug Design”, Wiley, New York, N.Y., 1973. (9) P. N. Craig, Chapter 8 in ref 6. (10) H. Kubinyi and 0. H. Kehrhahn, J . Med. Chem., 19,578, 1040 (1976). (11) H. Kubinyi, J. Med. Chem., 19, 587 (1976). (12) R. Courant, “Differential and Integral Calculus”, Vol. 11, Interscience, New York, N.Y., 1936, p 27. (13) T. Fujita and T. Ban, J. Med. Chem., 14, 148 (1971). (14) A. Cammarata and S. J. Yau, J. Med. Chem., 13,93 (1970).

Quantitative Structure-Activity Relationships. 7.’ The Bilinear Model, a New Model for Nonlinear Dependence of Biological Activity on Hydrophobic Character Hugo Kubinyi Research Institute of Knoll AG, D 6700 LudwigshafenlRhein, Federal Republic of Germany. Receiued October 5, 1976 The bilinear model, log 1/C = a log P - b log (BP + 1) + c, a new model for nonlinear dependence of biological activity on hydrophobic character, is applied to 57 data sets of biological activity values in homologous series. From a comparison of the statistical parameters and the residuals obtained with the bilinear model and the parabolic model, the superiority of the bilinear model for a precise quantitative description of both linear and nonlinear parts of structure-activity relationships can be derived; the bilinear model explains the particular effect that in homologous series the relationship between biological activity and hydrophobic character is strictly linear for the lower members, while for higher members this relaGonship is nonlinear,

The biological response elicited by a drug is determined by its intrinsic activity and by its ability to reach a definite receptor site. While the intrinsic activity of a drug molecule depends on various physicochemical properties and the geometry of the molecule, drug permeation is considered to be-in most cases-a passive transport process which is influenced only by the lipophilicity of the molecule. Thus, in homologous series, where the intrinsic activity of all members can be assumed to be identical biological activity should be a simple function of lipophilicity. Indeed linear relationships between biological activity and hydrophobic character (eq 1)”’ are obtained for a large number of homologous series. However, this linear relog 1 / C = a l o g P + b (1) lationship cannot go on infinitely, otherwise compounds with infinite activity would exist. A cut-off pointg is reached in each homologous series: biological activity increases with increasing lipophilicity, reaches a maximum, and then decreases with further increase of hydrophobic character. In 1964 H a n ~ c h “proposed ~ a parabolic model (eq 2) for log 1/c= a(l0g P)2 + b log P + c (2 1 the dependence of biological activity on hydrophobic character on the basis of a “random walk” process; on the way from the outer phase, where the drug is applicated, to their receptor sites the drug molecules have to penetrate a number of lipophilic and hydrophilic barriers. While hydrophilic molecules tend to remain in the aqueous phases and lipophilic molecules tend to go into the lipid (membrane) phases, molecules with an optimal hydrophilic-lipophilic balance will have the best chance to

penetrate all barriers and to reach the receptor sites. Although the parabolic model has been supported by consideration of a kinetic m ~ d e l ~ ~and ’ ~ ’its ” suitability for practical purposes has been proven with some hundred examples, there remains a discrepancy between the linear model and the parabolic model; from the linear model at least the left side of the “parabola” should be strictly linear, while from the parabolic model both sides of the structure-activity relationship should be more or less curved. Besides the parabolic model several other models for nonlinear dependence of biological activity on h drophobic charader have been presented in the last years.’’’6 Among these models the most interesting seems to be the Mc Farland model:“ Mc Farland used a simple hypothetical system, made up of alternating aqueous phases and lipid (membrane) phases of equal volumes and considered the probability of a drug molecule to cross a definite number of barriers. The Mc Farland model can be represented by eq 317 (correspondsto eq 15 of ref 12). Symmetrical curves with linear ascending and descending sides and a parabolic l o g 1 / C = a l o g P - 2 a l o g ( P + 1 ) +c

(3) part within the range of log P = 0 are obtained from eq 3. Although Mc Farland recognized systematic deviations between his model and the parabolic model (the same deviations can be reco nized between the computer plot from the kinetic and the parabola fitted to this plot), he considered the parabolic model to be sufficient for all practical purposes. To adapt the Mc Farland model to biological reality, the model was reconsidered using a slightly modified system (Figure l).17 Only four relevant phases are regarded, e.g., as a model of a simple bacterial cell or an isolated tissue

626 Journal of Medicinal Chemistry, 1977, Vol. 20, No. 5

I

I

I

Kubinyi

,

Figure 1. Hypothetical biological system used for the derivation of the bilinear model:” A, = outer aqueous phase, L, = lipid (membrane) phase, A2 = inner aqueous phase, R = receptor “phase”.

cek an outer aqueous phase A,,, a lipid (membrane) phase L1, an inner aqueous phase Az, and a receptor “phase” R; the different volumes of the aqueous phases, the lipid phase, and the receptor “phase” are indicated in Figure 1. Following the probability approach of Mc Farland, eq 4 was derived” from this modified system (the assumptions and theoretical considerations leading to the model are discussed in ref 17); in eq 4 /3 is the volume ratio of the l o g 1 / C = a l o g P - b l o g ( ~ P +1 ) +c

(4)

i -1.0 -

I

3

\ L

5

6

7

9

8

10

Log P

Figure 2. LDSoand LDlw values of alkanes in mice (Table I, eq 6 and 7) fitted with the bilinear model (-1 and the parabolic model (- - -);” confidence limits are indicated for each value.

lipid phase and the aqueous phases, Vi/ V,. If the equilibrium approach of Higuchi and Davis’3is followed, a very similar equation (eq 5)17 can be derived from the system

(5) presented in Figure 1 (the Higuchi-Davis model was extended recently to a generalized model for ionizable substances by Martin and Hackbarth;’* however, no applications of the model-derived equations to practical examples were given). It should be noted that eq 3 ( b = 2a; /3 = 1) and eq 5 ( b = 1) are special forms of eq 4. Unsymmetrical curves with linear ascending and descending sides and a parabolic part within the range of optimal lipophilicity are obtained from eq 4; unlike the Mc Farland model, from which only symmetrical curves with log Po = 0 result, eq 4 is appropriate for a quantitative description of structure-activity relationships in all practical cases. Because of the characteristic curves resulting from eq 4 the name “bilinear model” was proposed. Equation 4 is a four-parameter equation, the term P being a nonlinear term, which must be determined by stepwise iteration or by a Taylor series iteration method; the application of both calculation procedures to the bilinear model has been described in detail.’7b For a practical comparison of the bilinear model and the parabolic model, 57 data sets of biological activity values in homologous series were selected (Table I); only sets having six or more data points are included in Table I. With a few exceptions, the variance of log P values for each set is five or more log P units. Structurally heterogeneous sets and homologous series with a log P variance of less than 3.5 units were excluded. All equations were calculated with a Taylor series iteration program. l o g 1 / C = a l o g P - l o g ( ~ P +l ) + c

Results The equations resulting from the bilinear model for the data sets of Table I are given in Table I1 together with the 95% confidence intervals of the linear terms a, b, and c and log Po values calculated from the bilinear model and the parabolic model. /3 is given in its logarithmic form because the values of /3 vary over several powers of ten. Not too much importance should be attributed to the confidence intervals of the linear terms a, b, and c ; in several cases the confidence intervals are extremely wide (e.g., eq 14, Table 11),although the overall regression is statistically significant; however, such wide confidence intervals are well known in nonlinear regression analysisz5

I

I -2

i

0

2

L

6

Log P

Figure 3. Hemolytic activity of (a) N-alkylpyridinium compounds (Table I, eq 57) and (b) N-alkylpiperidines (Table I, eq 35) fitted with the bilinear model (-1 and the parabolic model (- - -1.

in those cases where the data do not accurately conform to the used model. In the case of eq 9 and 10 the wide confidence intervals of the b term are caused by the fact that the value of b is determined by only one data point in each equation. No meaningful confidence intervals could be derived for the nonlinear term /3 from the Taylor series iteration method’7b. The statistical parameters obtained for each data set with the bilinear model and the parabolic model are given in Table I together with a partial F test (see below). Figures 2-5 illustrate the differences between the bilinear model and the parabolic model and demonstrate the goodness of fit obtained with each model. As compared with the parabolic model the bilinear model contains an additional parameter; therefore, it is not surprising that higher correlation coefficients r are obtained with the bilinear model in most cases. However, the standard deviations s account for this extra parameter; if the s values from both models are compared, lower standard deviations are obtained in 52 out of 57 examples with the bilinear model. To check the statistical signif-

Journal of Medicinal Chemistry, 1977, Vol. 20, No. 5

QSAR, the Bilinear Model

627

Table I. Comparison of the Bilinear Model and the Parabolic Model: Data Sets and Statistical Parameters Bilinear Parabolic model model Partial Eq

Compounds

6 Alkanes 7 Alkanes 8 Alkanes 9 Alcohols 1 0 Alcohols 11 Ethers 1 2 Ethers 1 3 Ketones 14 Ketones 1 5 RCOO1 6 RCOO1 7 RCHBrCOO18 RCHBrCOO' 1 9 RCHBrCOO20 RCHBrCOO21 RCHBrCOO' 22 RCHBrCOO23 RCHBrCOO24 RCHBrCOO' 25 RCHBrCOO26 RCHBrCOO27 RCHBrCOO28 RCHBrCOO29 RCHOHCOO30 RCHOHCOO3 1 RCHOHCOO32 ROSO;Na+ 3 3 a -Monoglycerides 34 a-Monoglycerides 35 N-Alkylpiperidines 36 Alkylguanidines 37 Alkylguanidines 38 Alkylguanidines 39 Diamidines 40 Diamidines 4 1 Diguanidines 42 Diguanidines 43 Diguanidines 44 RN+(CH,), 45 46 47 48 49 50 51 52 53 54 56 57 58 59 60 61 62

C,H,CH,N+(R)(CH,), N-Alkylpyridinium bromides N-Alkylnicethamide chlorides 5-Alkyl-8-hydroxyquinolines 5-Alkyl-8-hydroxyquinolines 4-R-Lincomycins Phorbol 12,13-diesters

Biological activity Mice, iv LD,, Mice, iv LD,,, Mice, iv AD,,, Frog ventricle, narcotic action Tadpoles, narcotic action Mice, iv LD,,, Mice, iv AD,,, Mice, iv LD,,, Mice, iv AD,,, P. omnivorum, MIC (100%) Red cell dove, MHC Red cell sheep, CH V. cholerae, MKC (;% 7.5) I V. cholerae, MKC (pH 8.5) V. cholerae, MKC (pH 6.0) B. lepisepticus, MKC (pH 6.0) B. lepisepticus, MKC (pH 7.5) D. pneumoniae, MKC (pH 6.5) S. hemolyticus, MKC (pH 6.0) D. pneumoniae, MKC (pH 7.5) D. pneumoniae, MKC (pH 8.5) S. hemolyticus, MKC (pH 7.5) S. hemolyticus, MKC (pH 8.5) B. lepisepticus, MKC (pH 7.5) B. lepisepticus, MKC (pH 8.5) D. pneumoniae, MKC (pH 7.5) Red cell sheep, CH s o Red cell dove, MHC Red cell dove, MHC Red cell dove, MHC Rat liver mitochondria, I , , M. fructicola, ED,, S. pastorianus, ED,, Vs. S. typhosa Vs. S. aureus Vs. S. typhosa Vs. S. aureus vs. Strep. uiridans S. typhosa, MKC S. aureus, MKC P. aeruginosa, MIC S. typhosa, MIC P. vulgaris, MIC P. vulgaris, MKC S. aureus, MIC Cl. welchii, MIC P. aeruginosa, MKC Cl. welchii, MKC Red cell sheep, CH ,, C. albicans, MKC Red cell sheep, CH Red cell dove. MHeo Vs. S. aureus A . niger, Inh (RBR) A . niger, Inh (RBR) S. lutea (RBR) Irritant activity, ED,,

n 6 11 12 10 10 7 7 6 7 14b 8 8 7 6 6 6 6 8 8 8 8 8 8 7 6 7 6 7 8 11

lob

6 6 8 8 8 8 8 7 6 9 10 10 10 12 12 12 12 6 11 6 9 20 11 9 8 6

r

s

0.998 0.996 0.978 0.998 1.000 0.995 0.999 O.98Oc 0.969 0.959 0.992 0.941 0.985 O.98Oc 0.986 O.98lc 0.991 0.939 0.959 0.934 0.942 0.942 0.943 0.995 0.993 0.996 0,999 0.999 0.998 0.999 0.997 0.999 0.996 0.995 0.999 0.994 0.993 0.990 0.997 0.999 0.937 0.975 0.974 0.973 0.993 0.983 0.990 0.992 0.996 0.984 0.996 0.998 0.970 0.977 0.989 0.998 1.000

0.038 0.039 0.101 0.133 0.068 0.066 0.022 0.188 0.209 0.191 0.147 0.362 0.185 0.180 0.187 0.235 0.158 0.519 0.444 0.530 0.488 0.523 0.522 0.106 0.191 0.176 0.069 0.057 0.049 0.050 0.103 0.017 0.051 0,119 0.041 0.202 0.193 0.286 0.169 0.107 0.168 0.140 0.132 0.153 0.100 0.172 0.122 0.125 0.171 0.170 0.154 0.071 0.265 0.082 0.073 0.063 0.041

r s F test' Ref (eq) 0.976 0.101 19.38* 17, 1 9 0.930 0.148 111.41* 17, 1 9 0.909 0.189 23.79* 19 0.978 0.458 76.66* 20 (25, 26) 0.995 0.215 63.30* 20 (29, 3 0 ) 0.925 0.215 39.84* 1 9 0.955 0.152 181.33* 1 9 1.85 1 9 0.962 0.213 5.48 1 9 0.908 0.305 0.844 0.346 26.13* 21 ( 4 8 ) 6.49 22 (18) 0.979 0.213 3.26 22 ( 1 9 ) 0.891 0.436 5.15 11 ( 2 3 ) 0.958 0.264 1.92 11 ( 2 9 ) 0.959 0.206 2.10 11 ( 4 6 ) 0.971 0.219 0.985 0.171 11 ( 4 7 ) 7.20 11 ( 4 9 ) 0.959 0.277 2.54 11 ( 5 3 ) 0.898 0.593 5.72 11 ( 5 4 ) 0.898 0.620 2.33 11 ( 8 2 ) 0.893 0.596 2.16 11 ( 8 4 ) 0.909 0.542 4.58 11 (87) 0.872 0.685 4.61 11 (88) 0.872 0.686 0.963 0.245 18.31* 11 (16) 0.997 0.108 11 ( 2 1 ) 0.987 0.259 5.68 11 ( 7 4 ) 0.995 0.145 11.30 22 ( 2 7 ) 0.988 0.150 24.64* 22 ( 2 6 ) 0.989 0.105 18.95* 11 ( 9 5 ) 0.992 0.142 56.88* 22 ( 2 3 ) 0.989 0.180 15.44* 11 (11) 0.994 0.030 6.61 11 ( 1 7 ) 0.990 0.065 2.84 11 ( 3 3 ) 0.989 0.152 4.20 23 ( 1 3 ) 0.997 0.073 11.75* 23 ( 3 7 ) 0.995 0.156 23 ( 1 4 ) 0.979 0.296 7.82* 2 3 ( 3 8 ) 0.984 0.314 2.01 23 ( 4 9 ) 0.986 0.305 10.06 11 ( 4 0 ) 0.996 0.161 4.72 11 ( 8 3 ) 0.880 0.208 4.19 11 ( 1 9 ) 0.949 0.183 5.91 11 ( 2 0 ) 0.974 0.123 0.12 11 ( 3 0 ) 0.960 0.172 2.87 11 ( 3 6 ) 0.979 0.158 14.62* 11 ( 4 2 ) 0.966 0.230 8.14* 11 ( 4 3 ) 0.962 0.219 21.07* 11 ( 4 5 ) 0.980 0.190 12.60* 11 ( 4 8 ) 0.992 0.213 2.67 11 ( 5 5 ) 0.961 0.243 9.31* 11 ( 5 8 ) 0.998 0.085 11 ( 7 6 ) 0.993 0.137 17.46* 22 ( 2 1 ) 0.961 0.292 4.60* 2 3 ( 4 3 j 0.955 0.106 6.55* 11 (80) 0.986 0.076 1.44 11 (103) 0.992 0.105 9.81* 11 ( 5 0 ) 0.978 0.320 182.67* 17 (44, 4 5 ) ~

An asterisk indicates a statistically significant value ( p < 0.05). other regressions are statistically significant ( p < 0.05). a

icance of these differences between both models a partial F test"b was applied (Table I): in 25 out of 53 cases, where the bilinear model gives a better fit, this difference is statistically significant ( p C 0.05). It is no argument against the bilinear model that the difference between both models is statistically significant in only 25 cases; in most examples the number of data points is too small to achieve statistical significance. If only examples with ten or more data points are considered, the bilinear model gives a statistically significant better fit in 14 out of 17 cases (more than 80%), while the four examples, where the parabolic model gives a better fit (eq 21,30, 41,54; Table I) consist

Isopropyl compound not included.

~~~

p > 0 . 0 5 ; all

only of six to eight data points. In cases where only such a few data points are present, fortuitous deviations may lead either to a better fit of the bilinear model or to a better fit of the parabolic model, due to the similarity of both models. If the residuals resulting from the bilinear model and from the parabolic model are c ~ r n p a r e d , ~ ~the ~ ' dif~'*~~ ferences between both models can be seen at a glance (Figure 6). While the residuals from the bilinear model are without any detectable regularity, the residuals resulting from the parabolic model are not only significantly greater but also systematically distributed. The curves are

628 Journal of Medicinal Chemistry, 1977, Val. 20, No. 5

Kubinyi

Table 11. Bilinear Model. Log l / C = a log P - b log (pP+ I) + c . Results and Log Po Values Log P,,calcd from a'

6 7 8 9

10 11

12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41

42 43 44 45 46 37 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62

'n

0.524 ( i 0 . 3 6 ) 0.956 (r0.35) 0.799 ( i 0 . 6 9 ) 1.114 (iO.09) 1.192 ( i 0 . 0 5 ) 1.095 ( t 0 . 7 1 ) 0.647 ( t O . 1 1 ) 0.733 ( i 1 . 3 4 ) 0.682 ( i 1 . 4 7 ) 0.398 (iO.10) 0.960 (iO.29) 0.598 ( i 0 . 4 2 ) 0.707 (i 0.33) 0.840 (t 0.95) 0.710 ( i 0 . 5 9 ) 0 . 9 0 2 (i 1.00) 0.827 ( i 0 . 5 7 ) 0.809 ( ~ 0 . 6 5 ) 0.734 (10.48) 0.794 ( t 0 . 6 2 ) 0.857 ( i 0 . 6 2 ) 0.784 ( i 0 . 5 7 ) 0.785 ( i 0 . 5 7 ) 1.100( i 0 . 4 4 ) 0.885 (i 0.59) 0.881 ( i0.25) 1 . 2 3 6 ("0.39) 0.993 (i0.35) 0 . 8 3 1 (10.14) 0.962 ( = 0.06) 0.825 ( i O . 1 1 ) 0.146 (iO.10) 0.279 ( t 0 . 2 9 ) 0.846 (i 0.24) 0.912 ( t O . 1 1 ) 0.792 (20.23) 0.806 (i 0.22) 0.803 ( t 0 . 2 8 ) 1.002 ( i 0 . 2 3 ) 1 . 1 9 1 (20.33) 0.766 (i 0.73) 0.552 (20.20) 0.675 (20.26) 0.683 ( + 0.29) 1.047 ( i 0 . 1 9 ) 0.942 (i0.26) 0.983 ( ~ 0 . 1 9 ) 1.061 (i0.20) 2.139 (i 1.85) 1.150 ( t 0 . 3 1 ) 1.006 (iO.50) 1.028 ( ~ 0 . 1 2 ) 0.561 ( i 0 . 1 2 ) 0.381 ( t O . l l ) 0.547 ( i 0 . 1 4 ) 1.104 ( - 0 . 1 8 ) 0.193 (i0.06)

ba

Ca

0.886 (t0.33) 1.306 ( i0.34) 1.152 ( t 0 . 6 6 ) 4.344 ( i 4 . 3 8 ) 6.131 (119.14) 1.384 (50.68) 0.951 ( i O . l l ) 1.382 (t 1.22) 1 . 1 2 1 (21.39) 2.596 (23.71) 1.413 ( i 0 . 5 3 ) 1.378 (i 0.84) 1.471 ("0.60) 1.389 (k0.92) 1.869 ( i 1 . 2 2 ) 1.895 ( i 1 . 1 5 ) 1.939 ( - 0 . 8 6 ) 1.996 ( i 1.12) 2.291 (i 1.16) 1.957 ( i 1.22) 1.836 (i 1 . 0 3 ) 2.194 ( i 1 . 3 6 ) 2.195 ( i 1 . 3 6 ) 1.267 ( i 0 . 4 2 ) 1 . 1 4 7 (t 1.29) 1.097 ( - 0 . 4 5 ) 1.191 (20.37) 2.212 ( i 0 . 3 0 ) 1.650 ( i 0 . 1 5 ) 1.408 ( 2 0 . 1 7 ) 0.925 ( i 0 . 4 3 ) 0.679 ( i 0 . 2 0 ) 1.206 ( + 0 . 5 8 ) 1 . 0 3 1 (k 1.74) 1.022 (i 0.19) 0.734 ( - 0 . 6 0 ) 1.150 ( i 0 . 5 9 ) 0.835 (I 1 . 2 7 ) 1.787 ( 2 1 . 1 4 ) 1.394 ( i 0 . 8 1 ) 1.033 ( i 0 . 6 6 ) 0.937 (i 0.30) 1.186 (20.31) 1.439 ( - 0 . 3 5 ) 1.507 (kO.19) 1.274 ( i 0 . 3 1 ) 1.715 (k0.22) 1.376 ( i 0 . 2 3 ) 2.118 ( i 1 . 7 1 ) 2.041 ( - 0 . 3 7 ) 1.191 ( ~ 0 . 9 5 ) 1.316 (10.36) 0.702 (kO.20) 0.929 ( i 0 . 2 4 ) 0.936 ( i O . 1 9 ) 1.772 (20.30) 1.054 (kO.10)

95% confidence intervals are given in parentheses.

0.443 ( + 1.30) -1.498 (i 1.21) -0.794 ( i 2 . 4 1 ) -0.883 ( t 0 . 2 4 ) -0.797 ( t 0 . 1 6 ) 0.859 ( t 0 . 6 0 ) 1.469 (I 0.14) 1.321 ( i 1 . 6 6 ) 1.809 ( i 1 . 5 8 ) 3.826 ( i 0 . 3 0 ) 2.511 ( ~ 0 . 2 1 ) 2.498 ( t 0 . 6 2 ) 2.426 ( t 0 . 3 5 ) 2.117 ( ~ 0 . 5 5 ) 2.970 ( i 0 . 5 0 ) 2.912 ( i 0 . 6 5 ) 2.388 ( i 0 . 4 3 ) 3.369 ( i 0 . 8 9 ) 2.996 ( i 0 . 7 6 ) 2.765 (i 0.91) 2.476 (I 0.84) 2.382 (iO.90) 2.079 ( i 0 . 8 9 ) 3.784 ( i 0 . 8 0 ) 3.078 ( i 0 . 8 9 ) 3.164 (k0.33) 2.276 ("0.21) 1.149 ( i 0 . 7 6 ) 1.516 ( t 0 . 3 5 ) 1 . 3 0 5 (iO.10) 3.498 ( i 0 . 3 4 ) 5.060 (20.12) 5.119 (20.35) -1.282 ( + 1.18) --0.896 ( i 0 . 5 3 ) -0.236 (20.94) 0.382 (20.88) 0.275 (21.20) 2.512 ( ~ 0 . 3 3 ) 2.147 ( i 0 . 3 2 ) 2.866 ( i 0 . 3 1 ) 3.998 (-0.15) 2.878 ( - 0 . 1 7 ) 2.879 (:0.19) 4.757 ( I 0 . 1 2 ) 3.774 ( i 0 . 1 8 ) 2.643 ( 2 0 . 1 3 ) 3.723 ( t 0 . 1 4 ) 3.871 ( i 1 . 8 2 ) 3.159 (10.21) 2.919 ( - 0 . 4 0 ) 2.559 ( i O . 1 0 ) 3.272 ( ~ 0 . 3 8 ) --0.604 (I 0.36) --1.230 (F 0 . 4 4 ) -0.545 (iO.11) 9.373 ( 2 0 . 3 7 )

Log P -4.956' -4.41 2 -4.577b -6.433' -6.229' -2.037 -2.623 -2.668 -2.325 -0.728 -1.743 -3.621 -1.918 -1.560 -2.455 --1.883 -2.208 -3.444 -3.828 -3.599 -3.394 -3.829 -3.829 0.133 -0.953 --1.977 -1.750 -3.454 -4.100 ~-3.555 0.594 - 2.806 --2.814 -7.059' -.6.409' -6,710' -6,756' -7,195' -2.309 -2.127 -1.099 -2.030 -1.980 -2.065 -1.438 - 1.800 --1.735 --1.656 -0.358 -2.047 -1.986 -2.499 ~-5.97 4 ' -5.128 -4.81 4 -1.451 --9,983c

Bilinear model

Parabolic model

5.12' 4.85b 4.93b 5.97' 5.61" 2.62 2.95 2.72 2.52 -0.01 2.07 3.51 1.88 1.74 2.24

5.38b 5.26' 5.23' 6.73' 8.82' 3.59 3.56 2.85 2.71 -0.30 2.40 3.21 1.76 1.97 1.93 1.79 1.89 2.91 2.82 3.05 3.15 3.02 3.02 2.03 1.86 3.38 3.75 3.36 4.16 4.25 1.44 1.93 1.99 8.73' 7.60' 9.20" 7.75' 10.21' 2.62 3.35 2.13 2.38 2.21 1.97 2.18 2.68 1.99 2.65 2.59 2.25 3.15 3.56 7.62' 4.74 5.03 1.78 7.69'

1.84

2.08 3.28 3.50 3.43 3.34 3.57 3.57 0.69 1.48 2.59 3.36 4.11 3.89 0.32 2.24 2.29 7.72' 7.33' 7.13" 8.59' 2.42 2.89 1.56 2.19 2.10 2.02 1.79 2.25 1.86 2.18 2.16 2.72 3.05 6.57' 4.97 4.96 1.67 9.33c

Log P = 3.3g2' was used for n-pentane instead of log P = 2.50.'7.'9

was used as hydrophobic parameter instead of l o g P.

strikingly similar to those presented by Mc Farland;" in all cases the calculated log 1 / C values within the range of log Po and far away from log Po are underestimated by the parabolic model (positive residuals), while the values between are overestimated (negative residuals). Discussion The bilinear model constitutes a further progress in the quantitative description of nonlinear structure-activity relationships. It is mathematically derived from a simple model system, it is generally applicable, and it represents a smooth synthesis of both linear and nonlinear parts of

structure-activity relationships; starting with a strictly linear relationship between biological activity and hydrophobic character, a point is reached, where the structure-activity relationship changes within approximately two log P units to a linear decrease of biological activity. Although this cut-off point is no sharp break in the structure-activity correlation, the change is more pronounced in the bilinear model than in the parabolic model. Identical equations are obtained if a probability approach or an equilibrium approach is used for the derivation of the bilinear model; thus the question, if there

Journal of Medicinal Chemistry, 1977, Vol. 20, No. 5 629

QSAR, the Bilinear Model 7

A

. -

Bilinear Model

Parabolic Model

I

- 0.1

0 6 m

0

5

- 0.1

4

- 0.1

J

0

3

-0.1 J

v

-01 J

v

I

\/

\ I V! .

2

1

0

6

4

2 Log P

Figure 4. Bactericidal activity of benzyldimethylalkylammonium compounds (a) vs. S. aureus (Table I, eq 50)and (b) vs. C1. welchii (Table I, eq 53) fitted with the bilinear model (-) and the parabolic model (- - -).

relationships in heterogeneous sets and to dissociate hydrophobic, electronic, and steric effects more precisely. Acknowledgment. I am indebted to 0. H. Kehrhahn for many stimulating discussions and for his help in running the calculations. References and Notes

. 0

l1

c

m -J 0

10 /

9

8

,

~~

I-

4

,

I

6

8

10

Tr

I

I

12

14

Figure 5. Irritant activity of phorbol 12,13-diesters (Table I, eq 62) fitted with the bilinear model (-) and the parabolic model (-

- -);I7

Figure 6. Analysis of residuals obtained with the bilinear model and the parabolic model for (a) eq 6, (b) eq 7, (c) eq 35,(d) eq 50,(e) eq 53,and (f) eq 57 (Table I; Figures 2-4): A = log l / C o h d - log l / C c d d . The dotted lines indicate log Po (from the bilinear model).

approximate standard deviations are indicated for each

value.

are nonequilibrium or quasiequilibrium (steady state) conditions in biological systems, can be ignored if the bilinear model is used. I t is recognized that the bilinear model is-like other models-an attempt to simulate a complex process in a rather simplistic way; there may be other factors which could cause a departure from linearity. However, the model “works”; in a large number of cases a nearly perfect fit results from the bilinear model. If only a few data points are available or if the log P values vary within a small range, the parabolic model is a good approximation of the bilinear model and seems to be the superior model. However, if enough data points are present, if the biological data are measured with high accuracy, and if the log P values vary over a wide range, the bilinear model should be preferred. Due to the lack of systematic deviations it will be possible in the future to apply this model also to nonlinear structure-activity

(1) For paper 6,see ref 17b. (2) H. Meyer, Arch. Exp. Pathol. Pharmacol., 42,109(1899). (3) E. Overton, 2. Physiol. Chem., 22, 189 (1897). (4) E. Overton, “Studien ueber die Narkose”, Fischer, Jena, 1901,p 45. (5) C.Hansch and T. Fujita, J. Am. Chem. SOC.,86,1616(1964). (6) C.Hansch, Acc. Chem. Res., 2, 232 (1969). (7) C. Hansch in “Drug Design”, Vol. I, E. J. Ariens, Ed., Academic Press, New York, N.Y., 1971,p 271. (8) C.Hansch and W. J. Dunn, 111, J. Pharm. Sci., 61,l (1972). (9) J. Ferguson, Proc. R. SOC.London, Ser. B, 127,387(1939). (10) J. T.Penniston, L. Beckett, D. L. Bentley, and C. Hansch, Mol. Pharrnacol., 5, 333 (1969). (11) C.Hansch and J. M. Clayton, J . Pharm. Sci., 62,l(1973). (12) J. W. Mc Farland, J . Med. Chem., 13,1192 (1970). (13) T.Higuchi and S. S. Davis, J . Pharm. Sci., 59,1376(1970). (14) R.Franke and W. Schmidt, Acta Biol. Med. Ger., 31,273 (1973). (15) S.H. Yalkowsky and G. L. Flynn, J. Pharm. Sci., 62,210 (1973). (16) R. M. Hyde, J . Med. Chem., 18, 231 (1975). (17) (a) H. Kubinyi, Arzneim.-Forsch., 26,1991 (1976); (b) H. Kubinyi and 0. H. Kehrhahn, ibid., in press. (18) Y. C. Martin and J. J. Hackbarth, J. Med. Chem., 19,1033 (1976). (19) R. Jeppsson, Acta Pharmacol. Toxicol., 37, 56 (1975). (20) C.Hansch, A. R. Steward, J. Isawa, and E. W. Deutsch, Mol. Pharmacol., 1, 205 (1965). (21) C. Hansch and E. J. Lien, J. Med. Chem., 14,653 (1971). (22) C.Hansch and W. R. Glave, Mol. Pharmacol., 7,337 (1971). (23) E. J. Lien, C. Hansch, and S. M. Anderson, J . Med. Chem., 11, 430 (1968). (24) A. Leo, P. Y. C. Jow, C. Silipo, and C. Hansch, J . Med. Chem., 18,865 (1975). (25) W. J. Westlake, J . Pharm. Sci., 60,882 (1971). (26) N. R.Draper and H. Smith, “Applied Regression Analysis”, Wiley, New York, N.Y., 1966. (27) C. Daniel and F. S. Wood, “Fitting Equations to Data”, Wiley-Interscience, New York, N.Y., 1971.

Quantitative structure--activity relationships. 7. The bilinear model, a new model for nonlinear dependence of biological activity on hydrophobic character.

QSAR, the Bilinear Model replaced by normal equations with fewer unknowns. Kubinyi and Kehrhahn give a number of more typical examples, and in partic...
619KB Sizes 0 Downloads 0 Views