Computer Programs in Biomedicine 10 (1979) 29-33 © Elsevier/North-Holland Biomedical Press

COMPUTER SIMULATED MODELING OF BIOMOLECULAR

SYSTEMS

K. SUNDARAM and Subhashini SRINIVASAN Department of Crystallography and Biophysics, University of Madras, Guindy Campus, Madras-600025, India

This paper describes the algorithm of a program used to simulate three dimensional models of molecules. In addition to open ended molecules the program also enables simulation of structures with constraints in the form of cyclic regions or fixed location of particular atoms. Several molecules can be handled in a single run and each molecule can have any number of constraints. Further, any number of conformations can be obtained for each constrained region. The program can be used for research in several areas of molecular biology, e.g., structure determination, conformational analysis and topographic comparisons.

1. Introduction

2. The method of simulated model building

Many fundamental processes in biology depend on the specific three dimensional architecture of biomolecular systems [1 ]. Reasonably good information on the architecture can be obtained by building bent wire models of the type of Kendrew models or Dreiding models. Simulation o f such models using a computer, rather than actual physical model building, has many advantages: economy of material and time, precision, versatility, ease of manipulation and viewing. For all these reasons much hope has rested on theoretical methods of predicting the three dimensional shape molecules would assume in a given biological milieu. Theoretical methods of structure determination have not yet developed to a level of accuracy comparable to experimental methods. But the great advantage is the speed with which one can generate structures (with present day computers) without ever even making the compounds. While methods are developed to predict the complete structure by the use of empirical potential functions for the various bonded and non-bonded interactions [2], even a simple simulation of molecular models turns out to be rather useful. In this article we describe a computer program which enables the simulation o f biomolecular systems satisfying prescribed constraints. Several laboratories have developed similar molecular simulation programs over the last decade, especially for use with interactive computer graphic facilities. For a sample (not exhaustive) list see [ 3 - 9 ] .

2.1. Simulation o f open ended molecules Simulation of a bent wire model amounts to obtaining the Cartesian coordinates of the atomic centres with reference to the central coordinate system. A neat algorithm for this has been provided by Thompson [10]. The input data for Thompson's algorithm are the internal parameters: bond length, bond angle and dihedral angle at each atomic posmon. A reasonably good estimate can be made of the bond lengths and bond angles based on the numerous solved structures available in the literature. Dihedral angles corresponding to conjugated regions can also be estimated. Recent research on the conformation of dimers is leading to the general conclusion that rotations around single bonds are also restricted and the corresponding dihedral angles belong to a small set of probable values. Starting from a reasonable set of internal parameters, the Cartesian coordinates of all the atoms in a molecule are obtained by defining a local frame at each atom. Then by using a classical procedure developed by Eyring [ 11 ], the coordinates of each atom with reference to the base coordinate system are obtained by expressing them in local frames situated on successive atoms preceding it. The base coordinate system is defined by the positions of the three atoms: 1,2 and 3. The origin is taken at atom 1. Atom 2 lies on the negative X-axis and atom 3 lies in the first or the second quadrant of the 29

K. Sundaram, S. Srinivasan, Simulated modeling o f biomolecular systems

30

X Y plane. To this base coordinate system, atoms are added one-by-one specifying the bond length, bond angle and dihedral angle of each successive atom. Let j be any atom in a molecule and j', j" ..... 2, 1 the linear sequence of atoms connecting atom j to 1. The /th bond distance R] is the distance between atoms ] and I • The lth bond angle is the non-reflex angle 11 ! • The/th dihedral angle is the angle by which the plane j]'/' is rotated clockwise (as viewed from ]'j" direction) to include atom ] " in it. The coordinates of atom j in the local frame at ]' are given by the following matrix equation: .t

qj,

=

.

. .I

Bjq

s3- n, ... n;,,. n;,. n;. q;

For further details reference may be made to [12].

2.2. Cyclic and constrained molecules The method of simulation of wire model, described above, treats a molecule as a tree-like structure with a main trunk (backbone), primary (or first order) side chains, secondary side chains and so on. Very often some constraints have to be imposed on this free simulation, to take account of the cyclic regions, prescribed end-to-end dimensions, etc. In order to do this a cyclic region of n atoms is taken as a linear chain of (n + 1) atoms where the (n + 1)th atom is the repetition of the first atom. If all the 3n internal parameters are independently specified, atom (n + 1) would not fall on atom 1. Any 6 (or more) of the internal parameters are now varied in order to swing the local frame at (n + 1) exactly in coincidence with the local frame at atom 1. This is achieved by taking 3 sets of points in both the frames to be matched. The base system coordinates of the two sets of points in both the frames to be matched are then given by: 4

q~'l(i) = ~ Al (i, j) j=c~

4

qC~'(n+l)(i) = ~ A(n+l)(i, j) j=~ where

.It

where q are the four vectors whose components consist of the three Cartesian coordinates of the point and a constant fourth component of value unity. The B's are 4 X 4 matrices defined in terms of the internal parameters of the ]th atom. The coordinates of the jth atom with reference to the base coordinate system located at atom 1 are obtained by a series of transformations as follows:

q, :

and

A / = B1B~ B3 "'" B i

and a = 1,2, 3 refer to the three points and i and j are indices referring to row and column numbers, respectively. The difference in the coordinates of the corresponding points can be written as a single array f ( k ) of 9 elements:

f ( k ) = q~'(n+l )(i) - q~'l(i) In the method of damped least squares [13], which was found valuable for the ring closure problem [14], the iterative step is described by the vector equation: ~Di+I = ~ " _ [JJ + p E ] - l ~ f i

where

p = 2 I~fill/llfill z / i s a 9 × m matrix of the derivatives o f f i ( k ) with respect to the internal parameters and E is an m × m unit matrix. The superscript i in the above equation is the iteration number and m is the number of dependent variables. For the evaluation of the derivatives o f f ( k ) with respect to ¢(m)'s the expressions derived by Thompson [10] are used:

( dqo~/( dRi) : gtx,j (dq~/(da/) = --az, i X [qko -- ~o'] (dq~/(d~j) = ax,/' × [qko -- qlo'] = ax,j' X [qko -- ¢o"] where ~x,j is the unit vector along the x-axis in the local frame at atom ] and ~x,j' and az,/' are unit vectors along the x-axis and the z-axis, respectively, of the local frame at atom j'.

2.3. Docking control In manual model building a closed loop is formed by simultaneously aligning the ends while moving them close together. Such a procedure helps avoid

31

K. Sundaram, S. Srinivasan, Simulated modeling o f biomolecular systems

N°. ,NATs, °''T°M't

i

"

Y

CULCULATECARTESIAN COORDINATES OF ALL ATOMS

1

NO

[,EAO

OEOMETR~

½

OAT~

I

IWRXTE ~Eo~ErRY I DATA IN FILE

YES YES

[~o. TO T~EA' ANOTHER MOLECULE

NO

/---~T T.E T

READDAGEOMETRYTAI FROM FILE

t 'READ ATOMS TO COINCIDE LllL2 EP$oCON$ToMAXITR PLTSToPLTF

t

I CALCULATE NONCLOSURE PARAMETER(DELF)

1

_

t I"TER"'" "A"l

+ +

YES

INTERNAL PARAMETERS (DEPENDENT VARIABLE J i

SOLVE ITERATIVE EQUATION

t GOES TO ANOTHER

ITER-O INITml

+ IREAD DATA RELEVENT ] I SET i J TO NEW CONFORMATION~--~ ITERBO I NEWLP~IoJ~VALUE I UNIT'INIT+~

YES

] YES >

ICALL O~AW~ 1 YES

~TREAT ANOTHER LEYELZCREGION

l CALLDRAW ~ I

I

~RINT COORDINATESJ OF ALL ATOMS [ ~

PRINT MAX/TR AND 1 FINAL NONCLOSURE PARAMETER

t N° Fig. 1. Flowchart.

short contacts. In analogy with the space shuttles, we call this the parallel docking. It is really disappointing to see that by choosing the three reference points as (1, I, 1), (0, 1, 1) and (0, O, 1) we get ~ther a screw type of docking, i.e., the ends have a tendency first to move close to each other and align later. Such a clo-

sure is often undesirable as many short contacts creep in during the final alignments. Interesting docking patterns can be obtained by proper selection of reference points in the two frames. In this paper we have tried to attain parallel docking by taking the reference points having coordinates (r, r, r), (0, r, r),

K. Sundaram, S. Srinivasan, Simulated modeling of biomolecular systems

32

Table 1 Amounts by which the body movement parameters should be changed for a rotation by ~0degrees around various axes Rotation around

Parameter cq

#1 ~0 0 0

c~2

#2

#3

0 0 0

0 -~ 0

0 0

X

0 0 0

Y

~0

90

0

90

o

z

¢, 0

o 0

o 0;

o 0

o 0

(0, O, r) where r is the distance between atom 1 and atom (n + I) at any given stage of closure.

2.4. Body rotation and translation According to the definition of the internal parameters the bond length of the first atom, the bond angles of the first two atoms and the dihedral angles of the first three atoms are indeterminate. The indeterminacy of the 6 parameters R 1, al,/31, ~2,132 and /33 actually represents the 6 degrees of freedom of the molecule as a whole. Specification of values other than 180 ° for 02 and nonzero values for other 5 of the above parameters enables one to position the base coordinate system in any general tanslated and rotated manner relative to the molecule. Table 1 gives a way to achieve rotation about the three coordinate axes of the base system.

dealing with cyclic or constrained regions, and Thompson's expressions for finding the elements of the J matrix. Iterative steps continue till the specified degree of closure is attained or till the iteration number exceeds the maximum number of iterations specified, whichever occurs earlier. A subroutine called MATINV is used to invert the matrix of dimension m × m (where m is the number of dependent variables prescribed) to solve the set of simultaneous equations determining the dependent variables. All input and output information refers to the atoms in terms of a numbering system of appeal to the user. This can, for instance, be the conventional chemical numbering. Internally the program has to work in terms of a numbering sequence related to the tree-like topology. The subroutine NOSER gives the internal sequence number of any atom given its external number. The subroutine DRAW 1 helps visualise the simulated model at each iterative step by plotting the X Y projection of the wire-like model.

3. Program structure

3.1. Brief description o f program components The flowchart of the main program is given in fig. I. The program enables any number of molecules, any number of cyclic regions in each molecule and any number of conformations of each constrained region to be handled in a single run. The main program uses Thompson's algorithm for the generation of Cartesian coordinates of all the atoms from the specified internal parameters, a modified version of the damped least squares method for

Fig. 2. Partial superposition in space of chlorpromazine and dopamine.

K. Sundaram, S. Srinivasan, Simulated modeling o f biomolecular systems

33

4. Sample run

References

As a typical sample run we provide an o u t p u t obtained in the process o f developing the suspected topography o f dopamine receptor in brain. The line drawing (fig. 2) obtained using DRAW 1 shows a molecule o f chlorpronazine (competitor for the dopamine receptor) partially superposed in three dimensions on the natural neurotransmitter dopamine b y requiring an atom o f the former molecule to coincide with a specified atom o f the latter. Further details o f such an application to drug design have been published in [15].

[1] A. Rich and N. Davidson, eds., Structural Chemistry and Molecular Biology (Freeman, San Francisco, 1968). [2] M. Bixon and S. Lifson, Tetrahedron 23 (1967) 769. [3] C. Levinthal, Sci. Am. 214 (1966) 42. [4] R. Diamond, in: Crystallographic Computing Techniques, p. 336, F.R. Ahmed, ed. (Munksgaard, Copenhagen, 1976). [5] K. Nagano, in: Crystallographic Computing Techniques, p. 344, F.R. Ahmed, ed. (Munksgaard, Copenhagen, 1976). [6] C.N. Morimoto and E.F. Meyer,jr, in: Crystallographic Computing Techniques, p. 488, F.R. Ahmed, ed. (Munksgaard, Copenhagen, 1976). [7] R.J. Feldmann, S.R. HeUer and C.R.T. Bacon, J. Chem. Doc. 12 (1974) 234. [8] R. Langridge, Fed. Proc. Fed. Am. Soc. Exp. Biol. 33 (1974) 2332. [9] G.R. Marshal, H.E. Boshard and R.A. Ellis, in: Computer Representation and Manipulation of Chemical Information (W.T. Wipke et al. eds.) pp. 203-237 (John Wiley, New York, 1974). [10] H.B. Thompson, J. Chem. Phys. 23 (1967) 769. [11] H. Eyring, Phys. Rev. 39 (1932) 746. [12] K. Sundaram, Int. J. Quant. Chem. 8 (1974) 565. [13] K. Levenberg, Quart. Appl. Math. 2 (1944) 164. [14] A. Vitek, Coil. Czech. Chem. Commun. 33 (1968) 1601. [15] K. Sundaram, S. Mahajan and R.K. Mishra, Physiol. Chem. Phys. 6 (1974) 469.

5. Hardware/software specifications The program was written in F O R T R A N IV and run on IBM 370/55 machine. With minor alterations the program can be adapted to any machine supporting F O R T R A N .

6. Mode of availability The complete set o f program listings, as written for IBM 370 system, can be obtained by writing to the authors.

Computer simulated modeling of biomolecular systems.

Computer Programs in Biomedicine 10 (1979) 29-33 © Elsevier/North-Holland Biomedical Press COMPUTER SIMULATED MODELING OF BIOMOLECULAR SYSTEMS K. S...
306KB Sizes 0 Downloads 0 Views