
SUPPLEMENTARY MATERIAL to Manuscript # 2952
Piero Procacci(1,‡), Tom A. Darden(2), Emanuele Paci(3,†), Massimo Marchi(3)

(1) Centre Européen de Calcul Atomique et Moléculaire (CECAM), Ecole Normale Supérieure de
Lyon, 46 Allée d'Italie, 69364 Lyon, FRANCE
(2) National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709
(3) Section de Biophysique des Protéines et des Membranes, DBCM, DSV, CEA, Centre d'Etudes,
Saclay, 91191 Gif-sur-Yvette Cedex, FRANCE

(‡) Permanent address: Dipartimento di Chimica, Università degli Studi di Firenze, 50120 Firenze,
Italy
(†) Present address: Laboratoire de Chimie Biophysique, Institut Le Bel, Université Louis Pasteur,
67000 Strasbourg, France
Author to whom all correspondence should be addressed

VI. ORAC GENERAL FEATURES
As we shall see, many of ORAC's commands allow the opening of external files. No unit
number needs to be provided, as ORAC opens the required files sequentially, assigning each
file a unit number according to its order of occurrence in the input file.

A. The ORAC Input Files: sys.mddata
Since ORAC is designed not as a modeling interface but as a molecular dynamics program that
performs computing-intensive tasks, it works only in a non-interactive batch mode: the user
must provide an input file (hereafter referred to as sys.mddata) containing commands for
execution, and a series of auxiliary files whose names are also provided in sys.mddata. At
execution time, the input file is read from standard input in free format. This means that
each input line is read as a character string and parsed into its component substrings,
i.e. series of characters separated by blanks or commas. Each substring represents an
instruction and is interpreted by specific routines. A line having the "#" character in
column 1 is always considered a comment.

ORAC's instruction set has been designed to include three different kinds of instructions:
environments, commands and subcommands. A sys.mddata file is made up of a series of
environments, whose order is unimportant, each including a series of commands which in turn
may use a few subcommands. The environment name is a string always beginning with the &
character followed by capital letters. Each environment ends with the instruction &END.
Environments are reminiscent of the Fortran namelist, but have not been programmed as such
and are portable. Command names are character strings containing only capital letters. Each
command reads a variable set of parameters which can be characters and/or numbers (real or
integer). Moreover, commands composed of more than one input line (structured commands) also
exist. Each structured command ends with the instruction END and allows a series of
subcommands inside it. Subcommands are always in lower case and can read substrings
containing characters and/or real or integer numbers.
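The free-format rules described above can be illustrated with a minimal Python sketch. It is not ORAC's parser: the function names `parse_input` and `kind` and the sample fragment are hypothetical, and the only rules encoded are those stated in the text ("#" in column 1 marks a comment, substrings are separated by blanks or commas, environments run from `&NAME` to `&END`, subcommands are in lower case).

```python
import re

def parse_input(text):
    """Group the lines of a sys.mddata-like input by environment.

    Returns {environment_name: [token_list, ...]}, one token list per line.
    """
    environments = {}
    current = None
    for line in text.splitlines():
        if line.startswith("#"):          # "#" in column 1: comment line
            continue
        # substrings are separated by blanks or commas
        tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
        if not tokens:
            continue
        head = tokens[0]
        if head == "&END":                # closes the current environment
            current = None
        elif head.startswith("&"):        # opens a new environment
            current = head
            environments[current] = []
        elif current is not None:         # command or subcommand line
            environments[current].append(tokens)
    return environments


def kind(tokens):
    """Classify a line by the case of its first token (rule of thumb only)."""
    return "subcommand" if tokens[0].islower() else "command"


# Hypothetical fragment; the argument layout is illustrative only.
SAMPLE = """\
# hypothetical fragment, for illustration only
&RUN
  CONTROL 0
  REJECT 496.0
  PRINT 2.0
&END
"""

print(parse_input(SAMPLE))
```

Because the environments are collected in a dictionary, their order in the input does not matter, which mirrors the statement that the order of the environments is unimportant.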
In the following paragraphs we will briefly discuss the basic structure of the input to
ORAC. This is intended to be a very concise and by no means exhaustive guide to ORAC's
environment directives. More details about the specific environment commands and their
syntax, including the syntax of the auxiliary topology and potential files, will be found in
the subsequent sections, where a practical example is illustrated. For a complete
description of all supported environments and commands the reader is referred to the ORAC
manual [48].
In general, the environments specified in sys.mddata can be roughly classified into three
categories:

• The Description Environments contain commands referring to the structure of the
  system and to the interaction potential. These commands may instruct ORAC, for
  example, to use a particular potential form, to adopt a simulation box of a given
  size and shape, to insert solvent molecules, to read the potential and the topology
  parameter files, to add extra topology, etc.

• With the Simulation Environments commands, one can choose the kind of simulation
  to perform, e.g. the temperature and the pressure of the system, the integration
  scheme to be used, etc.

• The Output Environments commands control the output of the simulation, such as
  property calculations, binary or ascii history files, and restart files.

1. Description Environments

To this category belong the environments &SETUP, &SOLUTE, &SOLVENT and &PARAMETERS.
In &SETUP the box size is specified by appropriate arguments to the commands CELL and
CRYSTAL. The name of the PDB file containing the solute and/or solvent coordinates is also
entered in &SETUP through the command READ PDB.
Commands defining certain potential options for the solute molecules can be specified
in the &SOLUTE environment. Examples of such commands are STRETCHING, which allows
for bond stretching, I-TORSION, which defines the functional form of the improper torsion
potential, and AUTO DIHEDRAL, which is used when all the possible proper torsions of the
solute molecules are to be included in the potential. The command INSERT is also available
from this environment, and it is used to fill the simulation box with solvent. Obviously,
when INSERT is specified, the environment &SOLVENT must also be present.
All the parameters needed to completely define the atomic and geometrical structure of
the solvent molecule and its LJ and electrostatic interaction potentials can be specified in
the environment &SOLVENT. Although ORAC cannot generate the coordinates of solute molecules,
simulations of systems containing only solvent molecules can be run by the program without
the need to read any external coordinate file. To do so, in addition to defining the
appropriate commands in the &SOLVENT environment, the crystal basic cell and the number
of replicas to be included in the MD simulation must be specified using the commands
CELL and CRYSTAL of the environment &SETUP. Alternatively, the solvent may also be read
from the PDB file specified in &SETUP.
The &PARAMETERS environment has been designed to define a series of operations strictly
connected with the topology of the solute molecules. Thus, &PARAMETERS must appear in
the input file only if the environment &SOLUTE is also present. In &PARAMETERS the primary
structure of the solute is specified in the structured command JOIN as a sequence of molecular
units or residues forming the solute molecules, e.g. the amino acid residues. Each unit is
labelled by a name to which corresponds topology data (atomic labels, connectivity, etc.)
in an ascii topology file, hereafter referred to as field.tpg. The name of the file field.tpg is
specified in another &PARAMETERS command, READ TPG ASCII. In the file field.tpg the atomic
charges on each unit of the solute are also defined. The solute potential parameters, which
include stretching, bending, proper torsion, improper torsion and non-bonded parameters,
are read in from the other ascii parameter file, hereafter referred to as field.prm. The
structure and format of both the field.tpg and field.prm files will be examined later on in this
section. The name of the file field.prm is specified with the command READ PRM ASCII
field.prm. While the intra-residue topology is defined in the field.tpg file, inter-residue
topology, such as disulphide bonds and the added topology (bendings and torsions involving
the two sulphur atoms), can be specified in &PARAMETERS using the structured command
ADD TPG.

2. Simulation Environments

To this category belong the environments &SIMULATION, &RUN, &INTEGRATOR and
&POTENTIAL. In &SIMULATION, parameters and keywords connected to the type of simulation
to be performed must be specified. Temperature and pressure are entered in this
environment. A regular MD is performed only if the command MDSIM is entered. If, along
with MDSIM, other commands such as STRESS and/or CONST TEMP are given, an extended
system simulation in the specified ensemble is performed. Simple minimizations (steepest
descent only) are done if the command MINIMIZE is present.
The &RUN environment contains commands specifying actions to be taken during the MD
run. For example, the length of the rejection phase, when velocities are scaled, is specified
by REJECT, the length of the production run by TIME, the printing interval for instantaneous
energies by PRINT, etc. Moreover, &RUN includes the command CONTROL, which specifies whether
the simulation must be run from new input coordinates or from a restart file containing
coordinates and velocities generated by an older simulation.
The integration algorithm to be used during the simulation and the time step size are
specified in the &INTEGRATOR environment. If the selected integrator is r-RESPA,
the command TIMESTEP reads the largest time step of the algorithm, namely Δt_h in Eq.
IV.51. Only two mutually exclusive commands can be provided to select the integrator:
SINGLE STEP, in which case a conventional single step MD is performed, or the structured
command MTS RESPA, which performs an NVE ensemble simulation with the r-RESPA integration
algorithm. SINGLE STEP must be specified if the dynamics is carried out at constant
temperature and/or pressure, i.e. if CONST TEMP and/or (ISO)STRESS are given in &SIMULATION.
The structured command MTS RESPA defines the parameters of the r-RESPA integrator,
namely the radii and healing lengths of the short, medium and long range shells for the
non-bonded interactions, and the time steps defined in Eq. IV.51. MTS RESPA also includes
a subcommand defining the reference system associated with the calculation of the reciprocal
space contribution in SPME or in the standard Ewald summation.
The &POTENTIAL environment is used to define various parameters related to the non-bonded
potential interaction and affecting both solute and solvent molecules. &POTENTIAL
allows commands to set up cutoff schemes (commands GROUP CUTOFF and EWALD), to change
the direct lattice cutoff (commands CUTOFF, GROUP CUTOFF), and to modify the radius of the
Verlet neighbor list and the frequency of its evaluation (command UPDATE) or the parameters
of the linked cell neighbor list (command LINKED CELL). Also, the reciprocal space
convergence parameter of the Ewald sums (α in Eq. III.20) must be specified here (command
EWALD), along with the grid constants K1, K2, K3 and the order n of the B-spline
interpolation if SPME is used.
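The compatibility rules stated so far for the description and simulation environments can be collected in a small checking routine. The Python sketch below is only an illustration of those rules, not of ORAC's own input checking: the function name `check_input` and the dictionary representation of the parsed input are assumptions, and command names are written as they appear in the text.

```python
def check_input(envs):
    """Return a list of consistency problems found in a parsed input.

    `envs` maps an environment name to the set of command names it contains
    (a hypothetical representation chosen for this sketch).
    """
    problems = []
    solute = envs.get("&SOLUTE")
    # &PARAMETERS must appear only if &SOLUTE is also present
    if "&PARAMETERS" in envs and solute is None:
        problems.append("&PARAMETERS requires &SOLUTE")
    # INSERT fills the box with solvent, so &SOLVENT must be defined
    if solute is not None and "INSERT" in solute and "&SOLVENT" not in envs:
        problems.append("INSERT requires &SOLVENT")
    integ = envs.get("&INTEGRATOR", set())
    simul = envs.get("&SIMULATION", set())
    # the two integrator commands are mutually exclusive
    if "SINGLE STEP" in integ and "MTS RESPA" in integ:
        problems.append("SINGLE STEP and MTS RESPA are mutually exclusive")
    # constant temperature and/or pressure runs need the single step integrator
    extended = simul & {"CONST TEMP", "STRESS", "ISOSTRESS"}
    if extended and "SINGLE STEP" not in integ:
        problems.append("constant temperature/pressure runs require SINGLE STEP")
    return problems


print(check_input({
    "&SIMULATION": {"MDSIM", "CONST TEMP"},
    "&INTEGRATOR": {"MTS RESPA"},
}))
```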
3. Output Environments


To this category of environments belong &INOUT and &PROPERTIES.
The &INOUT environment handles the output operations carried out by ORAC on files
other than the standard output. Commands are provided that instruct the program to
save coordinates to a file and set how frequently this must happen. The binary trajectory
files can be written onto sequential (command DUMP) or direct access files (command DUMP RAND).
ORAC also provides the possibility of writing coordinate files in PDB ascii format.
This is accomplished by the command ASCII.
The &PROPERTIES directive is used to compute statistical properties while the simulation
is being carried out. ORAC can compute radial distribution functions and structure
factors (GOFR, SOLVENT GOFAR), velocity autocorrelation functions (VACF, MTS VACF),
infrared spectra (MTS SPECTRA), and the root mean square deviation from a reference
structure (X RMS). The &PROPERTIES environment and the corresponding read properties.f
Fortran source provide a simple framework for the programmer to add a command for the
calculation of a new "property". Interfacing a user-developed property computation to ORAC
requires, in principle, a very limited programming effort. This is discussed further in the
manual [48].

B. ORAC Auxiliary Files
Compared to molecular liquids, simulating proteins, or any complex biomolecule, poses
additional problems due to the molecules' covalent structure, the knowledge of which must
precede any evaluation of the potential energy of the system.

The covalent topology of any complex biomolecule can be computed from the structure
of its constituent residues. In ORAC, to curtail the complexity of the input data, only
minimal information on each residue needs to be provided, such as the constituent atoms,
the covalent bonds and, in the case of polymers or biopolymers, the terminal atoms used to
connect the unit to the rest of the chain. In addition, in order to assign the correct potential
parameters to the bonds, bendings and torsions of the residue, the type of each atom needs
to be specified. Finally, to each atom type must correspond a set of non-bonded parameters.
When the bonding topology of the different residues contained in the solute molecule(s)
is known, these units are linked together according to their occurrence in the sequence. In
this fashion the total bonding topology of the molecule is obtained. From this information,
all possible bond angles are collected by searching for all possible pairs of bonds which
share one atom. Similarly, by selecting all pairs of bonds linked to each other by a
distinct bond, all the torsions can be obtained.
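The search for angles and torsions described above can be sketched in a few lines of Python. This is an illustration of the stated rule only, not ORAC's implementation; the function name `angles_and_torsions` and the atom labels are hypothetical, and each bond is assumed to be listed once.

```python
from itertools import combinations

def angles_and_torsions(bonds):
    """Derive bond angles and proper torsions from a list of bonds.

    Each bond is a pair of atom labels. An angle is a pair of bonds sharing one
    atom (the vertex); a torsion is a pair of bonds joined by a third, distinct bond.
    """
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    angles = []
    for vertex, bonded in neighbors.items():
        # every pair of atoms bonded to the same vertex defines one angle
        for i, k in combinations(sorted(bonded), 2):
            angles.append((i, vertex, k))

    torsions = []
    for j, k in bonds:                      # central bond j-k
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j}):
                if i != l:                  # skip three-membered rings
                    torsions.append((i, j, k, l))
    return angles, torsions


# A four-atom chain c1-c2-c3-c4 (hypothetical labels)
chain = [("c1", "c2"), ("c2", "c3"), ("c3", "c4")]
print(angles_and_torsions(chain))
```

For the four-atom chain this yields two angles and a single torsion, as expected for a linear sequence of three bonds.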
The following sections sketch the format of the topology and force field parameter files
read by ORAC (field.tpg and field.prm, respectively). The topology and force field parameter
files are strongly dependent on each other and together fully define the molecular force
field of the solute molecule(s).


C. ORAC Auxiliary Files: field.tpg
ORAC is instructed to read the topology file by the command
READ TPG ASCII field.tpg
of the &PARAMETERS environment. File field.tpg contains information on the series of residues
needed to define the topology of the actual solute molecules. This information is provided
through a series of free format keywords and their corresponding input data, as done in the
main input file sys.mddata. In this way, ORAC reads the solute connectivity, the atomic
charges, the atomic labels corresponding to those found in the PDB file, and the atomic
types according to the chosen force field (i.e. AMBER, CHARMM or others). Moreover,
the atomic groups and the improper torsions are also defined.

As for sys.mddata, the file field.tpg is parsed and the component substrings of each line
are interpreted. Comment lines must have the "#" character in column 1. Each residue or
unit definition starts with the keyword
RESIDUE residue name
where residue name is a character label which must match the labels found in the command
JOIN of the environment &PARAMETERS, and must end with the keyword RESIDUE END. These
residue delimiting keywords are the only ones in capital letters in field.tpg. In Fig. 3 we give
two examples of "residue" definitions, i.e. the alanine N terminus and the molecule of acetone,
coded with the strings ala-h and aceto, respectively.
Atom type definitions and charges are read in between the keywords atom and end. For
each atom three strings must be entered: the PDB atom label, the potential type according
to the selected force field as specified in field.prm (see later on in this section) and the point
charge in electron units. Groups are composed of all atoms entered between two successive
group keywords. The PDB labels must all be different from one another, since they are used
to establish the topology and connectivity of the solute.
In the alanine example, the atomic types and the charges are those of the AMBER force
field. For acetone, instead, the atomic types are still those of AMBER, while the charges are
obtained from the MOPAC program [49] with the ESP fitting procedure [50]. Four groups
are defined for alanine and three groups for acetone. When using r-RESPA with SPME the
groups should be defined as small as possible (ideally they should be composed of two or three
atoms) to enhance the stability of the fast integrator. Defining large groups, on the other
hand, allows a substantial saving of memory, since it decreases the size of the nested Verlet
neighbor lists used by the r-RESPA algorithm. Hence, in selecting the group size for large
biological systems a compromise must be made.
The bond connectivity is specified between the keywords bond and end by providing the
series of bonds present in the residue. Each bond is specified by two atom labels
corresponding to the atoms participating in the bond. In the example in Fig. 3, residue ala-h has
eleven bonds while nine bonds are found in aceto.
All possible bendings and proper torsions are computed by ORAC from the bond connectivity
and need not be specified. Improper torsions must instead be provided. Improper torsions
are used to impose geometrical constraints on specific quadruplets of atoms in the solute. In
modern all-atom force fields, improper torsions are generally used to ensure the planarity
of an sp2 hybridized atom. The convention in ORAC to compute the proper or improper
torsion dihedral angle is the following: if r1, r2, r3, r4 are the position vectors of the four
atoms identifying the torsion, the dihedral angle χ is defined as

\chi = \arccos\left[ \frac{(\mathbf{r}_2 - \mathbf{r}_1) \times (\mathbf{r}_3 - \mathbf{r}_2)}{|\mathbf{r}_2 - \mathbf{r}_1|\,|\mathbf{r}_3 - \mathbf{r}_2|} \cdot \frac{(\mathbf{r}_3 - \mathbf{r}_2) \times (\mathbf{r}_4 - \mathbf{r}_3)}{|\mathbf{r}_3 - \mathbf{r}_2|\,|\mathbf{r}_4 - \mathbf{r}_3|} \right] \qquad \text{(VI.1)}

In the case of improper torsions involving a terminal atom, a particular quadruplet of atoms
must be selected. For instance, the alanine N-terminus is connected to the peptide chain from
only one end. One improper torsion must then be specified involving the amino nitrogen
(n+) of the subsequent residue to ensure planarity of the peptide planes.
There are other, less important, topology directives in field.tpg which allow one, e.g., to omit
specific bendings, define hydrogen bonds, etc. For a complete description we again refer to
the manual [48].

D. ORAC Auxiliary Files: field.prm
While the file field.tpg provides electrostatic point charges for each atom of the residues,
ORAC reads the potential parameters for the bonded and the Lennard-Jones interactions
from the file field.prm. The parsing of this additional auxiliary file is carried out in the
same way as for field.tpg and sys.mddata. In general, the parameters for a given interaction
are listed between two keywords: a first keyword identifying the type of interaction (e.g.
BOND, BENDING, etc.) and the keyword END. The order of the types of interaction and their
associated parameters in the auxiliary file is unimportant.
Bond stretchings (interaction keyword BOND) are entered by specifying on one line the two
atoms involved, followed by a numeric string providing, in this order, the force constant K_r
and the equilibrium bond distance r_0 (see Eq. II.7). For example:
BOND
....
# AMBER carbonyl stretching in Kcal and A
c    o    570.00    1.229
....
END
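The keyword ... END layout of field.prm lends itself to a very small reader. The Python sketch below is a hedged illustration, not ORAC's parser: the function names `read_blocks` and `bond_parameters` are hypothetical, and the sample text contains only the carbonyl entry quoted above (the rows of dots in the printed example stand for omitted lines and are not part of the format).

```python
def read_blocks(text):
    """Collect the data lines of each `KEYWORD ... END` block of a field.prm-like file."""
    blocks, name = {}, None
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue                       # comment ("#" in column 1) or blank line
        if name is None:
            name = line.strip()            # first keyword of a block, e.g. BOND
            blocks[name] = []
        elif line.strip() == "END":
            name = None                    # block closed
        else:
            blocks[name].append(line.split())
    return blocks


def bond_parameters(blocks):
    """Map each pair of atom types to (force constant, equilibrium distance)."""
    return {(a, b): (float(k), float(r0)) for a, b, k, r0 in blocks.get("BOND", [])}


SAMPLE_PRM = """\
BOND
# AMBER carbonyl stretching in Kcal and A
c    o    570.00    1.229
END
"""

print(bond_parameters(read_blocks(SAMPLE_PRM)))
```

Storing the blocks in a dictionary keyed by the interaction keyword reflects the statement that the order of the interaction types in the file is unimportant.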

For a bending (interaction keyword BENDING), three atoms must be entered along with
the force constant and the equilibrium bending angle, in this order (K_θ and θ_0 in Eq. II.8).
In the atom sequence, the vertex atom is given second, while the order of the other two
atoms is immaterial:
BENDING
.....
# h20 bending in Kcal and rad
hw   ow   hw   100.0   104.52
.....
END

For each proper torsion (interaction keyword TORSION PROPER) four atoms must be provided,
the second and the third atoms being those on the dihedral angle axis. After the
atom sequence, the barrier height k, the angle γ and the integer n (see Eq. II.13) must
follow. For example:
TORSION PROPER
......
#                   Kcal/mole   Gamma   n
x    ct   ct   x    0.1556      0.0     3
.......
END

The symbol x is the wild card symbol and is representative of any atomic type.
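The wild card rule can be sketched as a simple matching function. This Python fragment is an illustration only: the name `matches` is hypothetical, and accepting the atom sequence read in reverse order is an assumption of the sketch, not something stated in the text.

```python
WILDCARD = "x"

def matches(pattern, types):
    """True if the atom types fit the pattern read in either direction.

    Accepting the reversed sequence is an assumption of this sketch.
    """
    def fits(p, t):
        # a wild card position accepts any type, other positions must be equal
        return all(a == WILDCARD or a == b for a, b in zip(p, t))
    return len(pattern) == len(types) and (
        fits(pattern, types) or fits(pattern, types[::-1])
    )


print(matches(("x", "ct", "ct", "x"), ("hc", "ct", "ct", "n")))
```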
In the improper torsion potential (interaction keyword TORSION IMPROPER) the
quadruplet of atoms involved must again be specified. The CHARMM harmonic torsional
form (C1 = 1 in Eq. II.13) or the AMBER form (C1 = 0 in Eq. II.13) is assumed if two or
three additional numeric fields are provided, respectively. The following is an example
of the two possibilities:
TORSION IMPROPER
......
# for this torsion choose AMBER
x     x     n     h     1.00    180.0   2
# for this instead choose CHARMM
cpb   cpa   nph   cpa   20.80   0.0
....
END

Finally, the Lennard-Jones non-bonded atomic parameters are specified by entering six
fields: the first is the atomic type according to the chosen force field, the second and
third are the Rmin (see footnote 1) and ε constants, the fourth and fifth are the 1-4
interaction Rmin and ε constants, and the last is the atomic mass. To obtain cross
interaction potentials, the Lennard-Jones parameters are combined according to standard
sum rules (see Eq. II.3).
For the Lennard-Jones 1-4 non-bonded interactions, the potential function may be multiplied
by a so-called 1-4 factor, usually less than or equal to 1. If zeros are entered in the fourth
and fifth fields, the 1-4 factor is set by the command LJ-FUDGE in the environment &SOLUTE.
If some specific Lennard-Jones 1-4 interactions need to be multiplied by some alternative
constants, the resulting Lennard-Jones constants must be entered in these fields. In the
following, examples of the various alternatives are shown:
NONBONDED MIXRULE
.....
# type o has the 1-4 factor provided by LJ-FUDGE in &SOLUTE
o     1.661   0.210   0.000   0.000   16.0
# type oa is same as o but has the 1-4 factor equal to 1
oa    1.661   0.210   1.661   0.210   16.0
# type ob is same as o but has a different 1-4 potential
ob    1.661   0.210   1.861   0.105   16.0
....
END
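The footnote attached to Rmin relates it to the more familiar σ parameter through σ = 2 Rmin 2^(-1/6). The short Python sketch below applies this relation to the type o entry. The function names are hypothetical, and the 4ε[(σ/r)^12 - (σ/r)^6] form of the Lennard-Jones energy is the common textbook expression, assumed here because Eq. II.3 is not reproduced in this section.

```python
def sigma_from_rmin(rmin):
    """sigma = 2 * Rmin * 2**(-1/6), as in the footnote on Rmin."""
    return 2.0 * rmin * 2.0 ** (-1.0 / 6.0)


def lennard_jones(r, rmin, eps):
    """12-6 Lennard-Jones energy in the common 4*eps form (an assumed form)."""
    sr6 = (sigma_from_rmin(rmin) / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


rmin, eps = 1.661, 0.210             # type "o" from the example above
r_minimum = 2.0 * rmin               # separation of two like atoms at the minimum

print(sigma_from_rmin(rmin))
print(lennard_jones(r_minimum, rmin, eps))
```

With this form, the energy of two like atoms at a separation of 2 Rmin equals -ε, which is the sense in which Rmin "corresponds to the minimum" of the potential.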

We stress that a 1-4 factor may also be specified for the 1-4 electrostatic interaction by
means of the command QQ-FUDGE in the &SOLUTE environment.

VII. A TYPICAL EXAMPLE: BPTI IN WATER SOLUTION
ORAC is a general MD code which can simulate a variety of systems ranging from simple
homogeneous fluids and solids to complex heterogeneous systems. Here, we provide an
example run for a solvated biomolecule. This is the type of system that ORAC has been
designed to simulate and for which the highest performance can be achieved. We chose to
simulate the typical guinea pig of protein simulations, namely the Bovine Pancreatic
Trypsin Inhibitor (BPTI), in water and at 300 K. We start our simulation from the available
experimental X-ray structure of the orthorhombic type I crystal at low temperature [51].

Footnote 1: Rmin corresponds to the minimum of the Lennard-Jones potential and is related to
the σ parameter by σ = 2 Rmin 2^(-1/6).

In the following sections we go through all the steps that are needed to prepare the system for
a typical MD run and to run the simulation itself. In particular, we discuss the following
sequential steps:


• Step I: Minimization of the protein structure in vacuum using the AMBER force field
  by means of r-RESPA MD at 20 K.

• Step II: Solvent (water molecules) is added to the simulation box. The solvent
  structure is relaxed at 300 K with a short r-RESPA simulation.

• Step III: A few ps of molecular dynamics simulation at constant pressure and at 300
  K are performed in order to find the equilibrium density at P=1 MPa.

• Step IV: A simulation of the hydrated BPTI at the equilibrium density at 300 K is
  performed using NVE r-RESPA Molecular Dynamics.

The discussion that follows is propaedeutic to the program usage.

A. Step I: Starting a Run from the X-ray PDB file
Our example run was started from the X-ray coordinates of the native bovine pancreatic
trypsin inhibitor taken from the Protein Data Bank at the Brookhaven National Laboratory, file
pdb1bpi.ent. The PDB coordinate file contains 58 residues for a total of 460 non-hydrogen
protein atoms, a phosphate anion (5 atoms) and 167 water oxygens. Although ORAC is
able to read the PDB file as is, in file pdb1bpi.ent the GLU7 and ARG53 residues and
the phosphate anion are given in two alternative conformations named A and B. Thus, we
retained only the "B" conformation and erased the coordinates of the "A" conformation.
With these changes made, the input file sys.mddata is given in Fig. 4.
1. Description of the Input File

Although, as we saw in the previous section, the order of the environment commands is
immaterial, we chose to order them according to the same arbitrary subdivision and order
used before. Thus, the description environments are given first. Since at this stage no solvent
is present, only the environments &SETUP, &SOLUTE and &PARAMETERS are specified.

a. &SETUP In &SETUP only two commands are entered: CRYSTAL, where the simulation
cell parameters (a, b, c, α, β and γ, discussed in Sec. II B) are provided, and READ PDB
bpti xray.ent, where bpti xray.ent is the file name of the initial solute coordinates obtained
from the Brookhaven PDB.
b. &SOLUTE The &SOLUTE environment contains five commands: i) STRETCHING prevents
ORAC from enforcing constraints on bonds, which are in conflict with the r-RESPA
integrator to be used. ii) Two commands are used to define the 1-4 multiplicative factors for
the electrostatic (QQ-FUDGE) and Lennard-Jones (LJ-FUDGE) interactions. These are discussed
in section VII D. iii) RESET CM shifts the origin of the simulation box to the center of mass
of the solute. iv) The command SCALE CHARGES instructs ORAC to distribute any excess
charge (see footnote 2) over the first two solute molecules, i.e. the BPTI and the phosphate
ion. Since there is no &SOLVENT directive, the 167 crystallographic water molecules along
with the phosphate anion are considered as part of the "solute".
Footnote 2: Using the standard protonation at pH 7 for his, glu, asp, arg, lys and charge -3e
for the phosphate anion, the system has a total charge of +3e.

c. &PARAMETERS In the &PARAMETERS environment we enter the file names of the
topology and parameter auxiliary files by using the commands READ TPG ASCII and
READ PRM ASCII, respectively. The two files amber95.tpg and amber95.prm corresponding
to the AMBER force field are provided with the ORAC distribution files. If no hydrogen
coordinates are provided in the PDB file, as is generally the case, ORAC generates the
hydrogen atoms according to simple geometric rules. The structured command JOIN is used
to define the residue sequence given in the PDB file. We notice that n identical and
consecutive "residues" like the water molecules hoh can be specified with the format hohn.
In addition, by entering the subcommand BOND of the structured command ADD TPG, three
extra bonds corresponding to the three disulphide bridges (namely CYS5-CYS55, CYS14-
CYS38 and CYS30-CYS51) are added. Finally, the binary file bpti amber95 osf.prmtpg
containing the full topology and interaction parameters of the system is written by the
command WRITE PFR BIN. The expensive computation of the topology and parameters
can be avoided in subsequent runs by reading the file created by WRITE PFR BIN with the
command READ PER BIN.
d. &SIMULATION In the example, &SIMULATION indicates that a normal MD simulation is
to be run at the temperature of 20 K with an oscillation band width of 10 K. By specifying
MDSIM in &SIMULATION and by selecting r-RESPA as the integrator in &INTEGRATOR, the
minimization is run with an r-RESPA NVE MD algorithm rather than with the ORAC
minimization algorithm, i.e. the module drvmin (see footnote 3).
Footnote 3: To use drvmin, MINIMIZE should have been entered in place of MDSIM, along with
the choice of a single time step integrator.

e. &INTEGRATOR The first command entered in the &INTEGRATOR environment is
TIMESTEP. Since r-RESPA is used as the integrating algorithm, the time step given in input
to the command TIMESTEP corresponds to the Δt_h time step in Eq. IV.51. The parameters of
the integration algorithm are given in the structured command MTS RESPA. In the example,
the r-RESPA multiple time step scheme includes five time steps, of which two involve
bonded forces (step intra) and three involve non-bonded forces (step nonbond). The first
field after the subcommands represents the integers n0, n1, m, l, h in Eq. IV.51 associated
with each time step. Therefore, in the example we have: Δt_h = 16.0/1 fs,
Δt_l = 16.0/4 = 4.0 fs, Δt_m = Δt_l/4 = 1.0 fs, Δt_n1 = Δt_m/2 = 0.5 fs and
Δt_n0 = Δt_n1/2 = 0.5/2 = 0.25 fs. The long, medium and short range potentials (V_h, V_l and
V_m in Eqs. IV.37 through IV.39) are defined sequentially by the commands step nonbond. For
each of these one real number, corresponding to the shell radius r, must be entered. For
each shell radius, two more optional parameters can be specified, i.e. the corresponding
healing length and the neighbor list offset, the neighbor radius for each shell being
defined as the sum of the shell radius, the healing length and the offset. The values for the
healing lengths and the neighbor list offsets given in this example have been tested for an
energy conserving five time step algorithm in solvated protein at 300 K [12]. The final
keyword reciprocal in the second subcommand step nonbond indicates that the reciprocal
lattice sum must be computed during the l-th time step. The option very cold start is used to
prevent the simulation from crashing due to an initial system very far from equilibrium. The
argument following the command very cold start is the maximum allowed increment per step of
a Cartesian coordinate in units of Å. Since during minimization the total system energy does
not need to be conserved, the parameters of the r-RESPA algorithm can be selected with more
freedom.
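The nesting of the five r-RESPA time steps quoted in the &INTEGRATOR paragraph can be reproduced with a few lines of Python. This is only an arithmetic illustration: the function name `respa_time_steps` is hypothetical, and the integers h, l, m, n1, n0 = 1, 4, 4, 2, 2 are those implied by the ratios given in the text.

```python
def respa_time_steps(largest_step, divisors):
    """Return the nested time steps, from the outermost to the innermost level.

    `divisors` lists the integers from the outermost level inwards; each level's
    step is the previous level's step divided by that level's integer.
    """
    steps = []
    step = largest_step
    for n in divisors:
        step = step / n
        steps.append(step)
    return steps


# Integers h, l, m, n1, n0 = 1, 4, 4, 2, 2 as implied by the example in the text.
print(respa_time_steps(16.0, [1, 4, 4, 2, 2]))   # [16.0, 4.0, 1.0, 0.5, 0.25]
```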
f. &POTENTIAL In the &POTENTIAL environment the command EWALD specifies that the
SPME method will be used in the simulation. The value of the convergence parameter α
is given in Å^-1 and must follow the keyword pme. The subsequent four integers are
the constants K1, K2, K3 (see Eq. III.30), determining the fineness of the grid in reciprocal
space, and the order n of the B-spline interpolation. In this example the relative accuracy
|E - E_exact|/E_exact of the Coulomb energy is of the order of 10^-4. The SPME reciprocal
lattice contribution V_qr is assigned to the l shell by the structured command MTS RESPA in
&INTEGRATOR. Following EWALD, the command UPDATE indicates that the Verlet neighbor list
is to be recalculated every 40.0 fs with a cutoff 1.5 Å larger than the potential cutoff. In
this example, the size of the system is not sufficiently large to make it convenient to use the
linked-cell neighbor lists (accessed with the command LINKED CELL) rather than the more
conventional Verlet lists.
g. &RUN The first command of the environment &RUN, CONTROL 0, specifies that the
simulation is not started from a restart file and that the velocities must be initialized
from scratch. The subsequent REJECT 496.0 indicates that 496 fs of simulation with velocity
rescaling will be carried out. Velocity rescaling will occur each time the system
temperature goes beyond the oscillation bandwidth of 10 K defined in the environment
&SIMULATION. The command TIME is used to define the length of the production run with
no velocity rescaling. Since Step I is a minimization, this length is set to zero. The last
command PRINT 2.0 indicates that intermediate results are to be written every 2.0 fs.
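The velocity rescaling of the rejection phase can be pictured with the following Python sketch. It is not taken from the ORAC source: the function name `rescale_factor` is hypothetical, rescaling back to the target temperature is assumed, and the 10 K band is treated as the maximum allowed deviation from the target, which is one possible reading of "oscillation bandwidth".

```python
import math

def rescale_factor(temperature, target, band):
    """Factor by which to multiply all velocities, or 1.0 if no rescaling is needed.

    Rescaling is triggered when the instantaneous temperature deviates from the
    target by more than `band` (half width versus full width is an assumption).
    """
    if temperature <= 0.0 or abs(temperature - target) <= band:
        return 1.0
    # Kinetic energy scales with v**2, so T -> target needs v -> v*sqrt(target/T).
    return math.sqrt(target / temperature)


print(rescale_factor(25.0, 20.0, 10.0))   # inside the band: no rescaling
print(rescale_factor(80.0, 20.0, 10.0))   # outside the band: velocities are scaled down
```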
h. &INOUT The output generated by ORAC is specified in the environment &INOUT and
consists of a binary restart file printed every 248 fs and of a PDB file printed every 496 fs,
i.e. only at the end of the run. While the restart file is rewound at each print, the PDB file
is not, and configurations accumulate during the run.
2. Results and Output from the Run

At execution time, if syntax errors or incompatible options are detected in sys.mddata,
ORAC aborts with an error message before attempting any calculation. If no error is found,
the program builds up the molecules of the system using the sequence specified in the
structured command JOIN and the topology definition given in the topology file amber95.tpg.
As the next step, ORAC tries to match bonds, bends, proper and improper torsions with
the potential parameters specified in amber95.prm. If matching fails, ORAC stops with
an error message. Finally, before the simulation can begin, the PDB file bpti.pdb is read
in. This preliminary phase, which constructs the system topology and the parameter arrays
and corresponds to the execution of the modules start, read input, join and bldbox, may
take several minutes for large biomolecules. The successful completion of this phase is
signaled by the printing of a concise system description and topology information. For
our example, the following output is obtained:
***************************************************************
*                    Solute TOPOLOGY List                     *
*                                                             *
*  1398 Atoms          1244 Bonds          1244 FLexible Bonds*
*     0 Rigid Bonds    1799 Angles         2732 P-Torsions    *
*   199 I-Torsions     2347 1-4 Inter.      524 Atomic Groups *
*                                                             *
***************************************************************

Subsequently, ORAC enters the routine mtsmd, as instructed by the directive MTS_RESPA in
&INTEGRATOR, and the simulation starts.
When running with r-RESPA, at the very beginning of the run, ORAC prints out an
estimated CPU time for the scheduled run. The cost per force call for each of the potential
contributions in Eqs. IV.35 to IV.39 is also printed. This output helps in tuning the efficiency
of the integration schemes. For the simulation length specified in the input file we obtained
the following output on a DEC alpha 3000/800 workstation (footnote 4):
CPUtime for m-contribution: RECP =   0.00 DIR =   0.137 TOT =   0.137
CPUtime for l-contribution: RECP =   0.37 DIR =   0.444 TOT =   0.811
CPUtime for h-contribution: RECP =   0.00 DIR =   0.683 TOT =   0.683
THEORIC SPEED UP FOR NON BONDED PART =   4.27
CPUtime for n1-contribution =   0.0654
CPUtime for n0-contribution =   0.0215
OVERALL THEORIC SPEED UP =   11.48
Expected CPU time for the RUN:   0 hours and   4 min
Expected average time per M step:   0.60 sec.
Expected average time per femto :   0.60 sec.

Footnote 4: The DEC alpha 3000/800 workstation delivers 30 Mflops (megaflops) on the Linpack
benchmark.

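As a rough cross-check of the estimate printed above, the expected run time can be
recomputed from the cost per femtosecond and the 496 fs length of the run. The short Python
sketch below is ours and is not part of ORAC; the variable names are hypothetical, and the
truncation to whole minutes is only our assumption about how the figure is formatted.

```python
# Hypothetical cross-check of the printed estimate; not ORAC code.
SEC_PER_FS = 0.60       # "Expected average time per femto" from the output
RUN_LENGTH_FS = 496.0   # length of the rejection phase (REJECT 496.0)

total_seconds = SEC_PER_FS * RUN_LENGTH_FS
hours, remainder = divmod(total_seconds, 3600.0)
minutes = int(remainder // 60)  # assumption: minutes are truncated, not rounded

print(f"{total_seconds:.1f} s -> {int(hours)} hours and {minutes} min")
```

Under these assumptions the product is just under five minutes, which is reported as
"0 hours and 4 min" if the minutes are truncated.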

Thus, the run is expected to last for 4 minutes. This estimate is quite accurate: the
effective CPU time at the end of the simulation was 306 seconds. While the simulation is
running, intermediate results are printed to standard output. These include various energies
(in kJ per mole of "solute", thus encompassing all the 1398 atoms in the simulation box)
and temperatures (in K). The following is an example of the output:
Tstep   =     494.000   Total   =  -15684.279   TotPot  =  -16053.392
Coulom  =  -17820.143   Recipr  =  -10938.008   NonBond =  -18188.098
Ener14  =    1009.599   Bonded  =    2134.706   Stretch =     348.402
Angle   =     616.197   I-Tors  =      29.845   P-Tors  =    1140.262
TotTemp =        21.2   RotTemp =    .000E+00   TraTemp =    .000E+00

Tstep   =     496.000   Total   =  -15683.211   TotPot  =  -16126.250
Coulom  =  -17777.902   Recipr  =  -10938.008   NonBond =  -18161.346
Ener14  =    1019.448   Bonded  =    2035.096   Stretch =     265.103
Angle   =     601.302   I-Tors  =      29.942   P-Tors  =    1138.749
TotTemp =        25.4   RotTemp =    .000E+00   TraTemp =    .000E+00

The meaning of the symbols in the output is self-evident: Tstep is the instantaneous
simulation time in fs; Total is the total energy; TotPot is the total potential energy; Coulom
is the electrostatic energy; Recipr is the reciprocal SPME lattice energy; NonBond is the total
electrostatic + Lennard-Jones non-bonded energy; Ener14 is the 1-4 non-bonded Lennard-Jones
interaction energy; Bonded is the total energy due to intra-molecular interactions; and
Stretch, Angle, I-Tors, P-Tors are the stretching, bending, improper and proper torsion
contributions, respectively.
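The additive structure of the printed energies can be checked directly on the record at
Tstep = 494.000. The Python sketch below is ours and not part of ORAC. It assumes, as the
definitions above suggest but do not state explicitly, that Bonded is the sum of the four
intra-molecular terms and that TotPot is the sum of NonBond and Bonded.

```python
# Values copied from the Tstep = 494.000 record (kJ/mol).
record = {
    "TotPot": -16053.392, "NonBond": -18188.098, "Bonded": 2134.706,
    "Stretch": 348.402, "Angle": 616.197, "I-Tors": 29.845, "P-Tors": 1140.262,
}

# Assumed decomposition (inferred from the printed numbers, not from ORAC source).
bonded_sum = sum(record[k] for k in ("Stretch", "Angle", "I-Tors", "P-Tors"))
totpot_sum = record["NonBond"] + record["Bonded"]

TOL = 2e-3  # printed values are rounded to three decimals
assert abs(bonded_sum - record["Bonded"]) < TOL
assert abs(totpot_sum - record["TotPot"]) < TOL
print(f"Bonded = {bonded_sum:.3f}, TotPot = {totpot_sum:.3f}")
```

Both relations hold for this record within the print rounding, which is a convenient sanity
check when post-processing the standard output.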
At the end of the run, the last configuration is saved to both a binary restart file,
bpti1.rst, and an ASCII PDB file, bpti1.pdb.

B. Step II: Adding the Solvent
The next step consists in hydrating the protein. To do so, the simulation box containing
the protein is filled with solvent molecules generated on a regular grid. Only molecules at
a sufficient distance from any protein atom are included. Subsequently, a short simulation
of about 1 ps in the NVE ensemble at about 300 K is carried out in order to randomize the
solvent molecules around the protein. To accomplish this task, the file sys.mddata of Step I
needs to be modified. We show in Fig. 5 the input file for Step II.
1. Changes to the Input File

a. &SETUP The environment &SETUP is changed to:

&SETUP
CRYSTAL 35.0 35.0 35.0 90.0 90.0 90.0
READ_PDB bpti1.pdb
INSERT 0.75
CELL sc 11 11 11
&END

Here, two new commands, INSERT and CELL, are used. The real argument to INSERT specifies
the criterion for discarding overlapping molecules. A solvent molecule is discarded if the
distance between any atom of the solute and any atom of the solvent molecule satisfies

$r_{i_s j_p} < \mathrm{radius} \times (\sigma_{i_s} + \sigma_{j_p})$,    (VII.2)

where radius is the argument to INSERT, $r_{i_s j_p}$ is the distance between the $i_s$-th atom of
the solvent molecule and the $j_p$-th atom of the protein, and $\sigma_{i_s}$, $\sigma_{j_p}$ are the
corresponding Lennard-Jones diameters. Trial and error has shown that a reasonable solvent
density can be achieved with values of radius in the range between 0.6 and 0.8. In this
example it was set to 0.75.
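The discarding criterion of Eq. VII.2 can be sketched in a few lines of Python. This is an
illustration only and not ORAC's implementation: the function name overlaps, the tuple
layout of the atoms and the test coordinates are all hypothetical, and periodic images are
ignored for simplicity.

```python
import math

def overlaps(solvent_atoms, solute_atoms, radius=0.75):
    """Return True if any solvent-solute atom pair satisfies
    r < radius * (sigma_solvent + sigma_solute), as in Eq. VII.2.

    Each atom is a tuple (x, y, z, sigma) in Angstrom. Periodic images
    are ignored here, which is a simplification of this sketch.
    """
    for xs, ys, zs, sig_s in solvent_atoms:
        for xp, yp, zp, sig_p in solute_atoms:
            r = math.dist((xs, ys, zs), (xp, yp, zp))
            if r < radius * (sig_s + sig_p):
                return True
    return False

# Made-up coordinates: a single solute atom and two candidate water oxygens.
solute = [(0.0, 0.0, 0.0, 3.4)]
close_water = [(3.0, 0.0, 0.0, 3.1656)]  # 3.0 < 0.75 * 6.5656, so it is discarded
far_water = [(6.0, 0.0, 0.0, 3.1656)]    # 6.0 > 0.75 * 6.5656, so it is kept
print(overlaps(close_water, solute), overlaps(far_water, solute))
```

A larger radius removes more molecules around the protein and therefore lowers the initial
solvent density, which is why the value has to be tuned by trial and error.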
CELL generates a periodic structure of solvent molecules with a simple cubic (keyword
sc) repeating unit. This basic cell is repeated 11 times in each of the three directions
(arguments 11 11 11) so as to reproduce, approximately, the water density at 300 K.
Body-centered and face-centered cubic cells could also have been chosen with the keywords
bcc and fcc, respectively. Since the simple cubic lattice has one molecule per repeating
unit, $11^3 = 1331$ solvent molecules are added to the simulation box, which already
contained 167 crystallization water molecules. The equilibrium density at 300 K will be
obtained in Step III when running the constant pressure simulation. The initial protein
coordinates, read by the command READ_PDB bpti1.pdb, were obtained from the simulation
described in Step I.
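The choice of 11 repetitions for a 35 Å box can be motivated with a short calculation of the
density of the bare lattice, before any overlapping molecule is discarded. The Python
sketch below is ours; the water molar mass and Avogadro's number are standard values that
do not appear in the text.

```python
AVOGADRO = 6.02214076e23   # 1/mol
WATER_MOLAR_MASS = 18.015  # g/mol

box_edge = 35.0            # Angstrom, from CRYSTAL
cells_per_edge = 11        # from CELL sc 11 11 11

n_sites = cells_per_edge ** 3        # one molecule per simple cubic cell
spacing = box_edge / cells_per_edge  # lattice constant in Angstrom
volume_per_site = spacing ** 3       # Angstrom^3 per molecule
# 1 Angstrom^3 = 1e-24 cm^3
density = WATER_MOLAR_MASS / (volume_per_site * 1e-24 * AVOGADRO)

print(n_sites, round(spacing, 3), round(volume_per_site, 2), round(density, 3))
```

The lattice corresponds to about 32 Å$^3$ per molecule, i.e. a density of roughly
0.93 g/cm$^3$, slightly below that of liquid water.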
b. &PARAMETERS Since all topology and force field information has already been generated
in the previous step, the &PARAMETERS environment is changed to:
READ_PFR_BIN bpti_amber95.prmtpg

This instructs ORAC to read the complete topology of the solute from the binary file
bpti_amber95.prmtpg. In this fashion, the expensive computation of the protein topology
and force field parameter arrays is skipped.
c. &SOLVENT A new environment appears in the input file to signal that solvent
molecules are present in the system, namely:
&SOLVENT
ATOM o 1 P 16.0   0.0      0.0      0.0
ATOM h 2 P 1.0    0.81650 -0.57735  0.0
ATOM h 2 P 1.0   -0.81650 -0.57735  0.0
INTERACTION 1 3.1656 0.1554 -0.82
INTERACTION 2 1.6    0.0     0.41
STRETCHING 1 2 524.86 1.0
STRETCHING 1 3 524.86 1.0
BENDING 2 1 3 55.00 109.47
&END

The first three instructions define the coordinates of the atoms contained in each of the
solvent molecules. The command ATOM expects the atom symbol, type, rank and mass, followed
by the atom coordinates. The atom rank informs ORAC whether the site should be considered
as primary or secondary in the calculation of constraints (see Ref. [23]). Acceptable ranks
are P or S for primary and secondary atoms, respectively. The interaction atom type must be
defined by INTERACTION, which associates Lennard-Jones parameters and charges with atomic
types. In the example, parameters for the SPC water model are provided. Moreover, the commands
STRETCHING and BENDING define the intra-molecular parameters for the solvent molecule.
The parameters (in kcal/mol and Å) are taken from the CHARMM force field. Finally,
we stress that without the environment &SOLVENT the changes made to &SETUP will produce
an error condition.
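The reference geometry entered with the ATOM lines can be verified against the equilibrium
values given to STRETCHING and BENDING. The following Python sketch is ours and only
re-uses numbers from the &SOLVENT environment above; the variable names are hypothetical.

```python
import math

# Reference geometry from the three ATOM lines (Angstrom).
O = (0.0, 0.0, 0.0)
H1 = (0.81650, -0.57735, 0.0)
H2 = (-0.81650, -0.57735, 0.0)
charges = (-0.82, 0.41, 0.41)  # from the two INTERACTION lines (O, H, H)

r_oh1 = math.dist(O, H1)
r_oh2 = math.dist(O, H2)
# The oxygen sits at the origin, so H1 and H2 are also the O-H bond vectors.
cos_hoh = sum(a * b for a, b in zip(H1, H2)) / (r_oh1 * r_oh2)
angle_hoh = math.degrees(math.acos(cos_hoh))

print(f"O-H = {r_oh1:.4f}, {r_oh2:.4f} A, H-O-H = {angle_hoh:.2f} deg, "
      f"net charge = {sum(charges):.2f}")
```

The coordinates reproduce the 1.0 Å bond length and the 109.47 degree angle used as
equilibrium values, and the three charges sum to zero, as expected for a neutral SPC
molecule.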
d. Additional Changes In the new simulation step the temperature is raised to 300
K and the rejection phase is lengthened to 992.0 fs. Thus, the command
TEMPERATURE in &SIMULATION becomes TEMPERATURE 300.0 20.0, which sets the temperature
to 300 K with an oscillation band of 20 K, and the command REJECT in &RUN becomes
REJECT 992.0. In addition, the command MTS_RESPA in &INTEGRATOR is changed by removing
the keyword very_cold_start.
2. Results and Output for Step II

In Step II, most of the computational time of the preliminary phase is spent in hydrating
the BPTI molecule. This operation involves the time-consuming calculation of all the
contacts between the protein and the solvent molecules. Unlike in Step I, ORAC skips
the computation of the topology and force field arrays needed for the simulation phase and
instead reads the binary file bpti_amber95.prmtpg generated in Step I.
In the example, the protein hydration is completed with the following message on standard
output:
495 molecules over 1331 have been removed

This means that 836 water molecules have been left in the 42875 Å$^3$ cubic box. The total
number of atoms of the new system is now $N = 1398 + 836 \times 3 = 3906$.
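The bookkeeping behind these numbers is simple enough to reproduce in a few lines. The
Python sketch below is ours and only combines figures quoted in the text; the variable names
are hypothetical.

```python
n_lattice = 11 ** 3      # solvent molecules generated by CELL sc 11 11 11
n_removed = 495          # reported by ORAC after hydration
n_solute_atoms = 1398    # from the solute topology list of Step I
atoms_per_water = 3

n_water = n_lattice - n_removed
n_atoms = n_solute_atoms + atoms_per_water * n_water
box_volume = 35.0 ** 3   # Angstrom^3, from CRYSTAL 35.0 35.0 35.0

print(n_water, n_atoms, box_volume)
```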
ORAC changes the output of the intermediate results according to the simulated system.
Since in Step II the system consists of "solute" and "solvent" molecules, the printout of the
intermediate results will have a different format than in Step I, namely

Tstep    =    496.000   Total    = -31480.790   SlvPot   = -34304.498
SlvCoul  = -39226.780   SlvRec   =  -9510.855   SlvReal  = -29715.925
SlvInt   =   4584.133   SltTot   =  -8586.383   SltPot   = -14068.423
SltCoul  = -18441.684   SltL-J   =   -954.148   SltHyd   =       .000
SltBond  =   5327.410   SltStr   =   1741.395   SltBen   =   2079.249
SltItor  =    131.662   SltPtor  =   1375.104   S-SPot   =  -3097.657
S-SCoul  =  -4187.726   SltTemp  =    314.651   SlvTrTem =    308.998
SlvRoTem =    321.402   TotTemp  =    316.331

Here, the prefix Slt in the output labels corresponds to energies or temperatures of the
"solute". On the other hand, solvent properties are indicated by the prefix Slv. Thus,
SltBen is the bending energy of the solute and SlvRoTem is the rotational temperature of
the solvent.
Since we have used the same r-RESPA algorithm and the same SPME parameters as in
Step I, we expect that the CPU time spent to compute energies and forces will scale linearly
with the number of particles $N$. Indeed, although the SPME algorithm scales in general
with $N \log N$, at small $N$ ($N \le 20000$) the algorithm is effectively linear in $N$.
In reality, although the number of particles increases $\simeq 2.8$ times from Step I to II
($3906/1398 \simeq 2.8$), Step II takes only 2.3 times more CPU time than Step I, corresponding
to 1.4 s per fs on our alpha 3000/800 workstation. This smaller than expected increase is due
to differences in the direct force routines handling solvent and solute (footnote 5). In Fig. 6
we show the starting system configuration after solvent is added at time $t = 0$ and the final
configuration at $t \simeq 0.5$ ps. We see that at the end of Step II the configuration of the
solvent molecules appears to be sufficiently randomized.
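The comparison between the linear-scaling expectation and the observed cost can be made
explicit with a small calculation. The Python sketch below is ours; it takes the 0.60 s per
fs of Step I and the quoted factor of 2.3 as inputs, and the variable names are hypothetical.

```python
n_step1 = 1398             # atoms in Step I
n_step2 = 1398 + 3 * 836   # atoms in Step II (solute plus 836 waters)
cost_step1 = 0.60          # s per fs, from the Step I estimate
observed_ratio = 2.3       # quoted CPU-time ratio Step II / Step I

size_ratio = n_step2 / n_step1
linear_prediction = cost_step1 * size_ratio  # cost if strictly linear in N
observed_cost = cost_step1 * observed_ratio

print(f"N ratio = {size_ratio:.2f}, linear = {linear_prediction:.2f} s/fs, "
      f"observed = {observed_cost:.2f} s/fs")
```

A strictly linear scaling would give about 1.7 s per fs, whereas the observed factor of 2.3
corresponds to about 1.4 s per fs, in agreement with the figure quoted above.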

C. Step III: Obtaining the Equilibrium Density at 300 K
In the previous step we have run the hydrated protein at constant volume, guessing the
equilibrium density. Here, we perform instead a constant pressure simulation at 300 K and
at atmospheric pressure (P = 0.1 MPa) in order to obtain a better estimate of the system
volume. As a starting point for the run, we use the solute and solvent coordinates obtained
after 992 fs of simulation in Step II and contained in the file bpti2.pdb. In Fig. 7 we show the
input file for Step III.
1. Changes to the Input File

a. &SOLUTE Currently, ORAC cannot carry out simulations at constant pressure with
multiple time step r-RESPA algorithms. In order to perform the simulation of Step III at a
reasonable speed, constraints need to be used. To do so, the keyword HEAVY is added after
the command STRETCHING of the environment &SOLUTE to impose constraints only on bonds
involving hydrogen atoms. All the X-H bonds in the solute and the H-H distance within
each crystallographic water molecule will be constrained. Moreover, the command INSERT
(environment &SOLUTE) is removed, as the starting configuration is already solvated.

Footnote 5: In the solvent-solvent and solvent-solute routines there is no need to perform an
additional loop over the "masked list" to exclude the bonded contacts between neighboring groups.
b. &SETUP In the environment &SETUP the PDB coordinate file produced at the end of
Step II is now read through the command READ_PDB bpti2.pdb, the file bpti2.pdb containing
both solvent and solute coordinates.
c. &SOLVENT As for the solute, we need here to remove the internal degrees of freedom of
the solvent molecules and impose constraints. This is performed by replacing the commands
STRETCHING and BENDING with the appropriate constraints. The environment will then be:
&SOLVENT
ATOM o 1 P 16.0   0.0      0.0      0.0
ATOM h 2 P 1.0    0.81650 -0.57735  0.0
ATOM h 2 P 1.0   -0.81650 -0.57735  0.0
INTERACTION 1 3.1656 0.1554 -0.82
INTERACTION 2 1.6    0.0     0.41
CONSTRAINT 1 2
CONSTRAINT 1 3
CONSTRAINT 2 3
&END
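The three CONSTRAINT lines make each water molecule rigid. As an illustration, the Python
sketch below lists the distances that such constraints would hold fixed and counts the
remaining degrees of freedom per molecule. It assumes that the constrained distances are
those of the reference geometry in the ATOM lines; this is our assumption, since the
CONSTRAINT command itself only names the pairs of sites.

```python
import math

# Site coordinates from the ATOM lines (Angstrom), keyed by site index.
sites = {
    1: (0.0, 0.0, 0.0),
    2: (0.81650, -0.57735, 0.0),
    3: (-0.81650, -0.57735, 0.0),
}
constraints = [(1, 2), (1, 3), (2, 3)]  # pairs named by the CONSTRAINT lines

# Assumption: target lengths are taken from the reference geometry.
lengths = {pair: math.dist(sites[pair[0]], sites[pair[1]]) for pair in constraints}

# 3 sites x 3 coordinates minus 3 constraints: translation plus rotation only.
dof = 3 * len(sites) - len(constraints)

for pair, length in lengths.items():
    print(pair, f"{length:.4f}")
print("degrees of freedom per molecule:", dof)
```

With two O-H distances and the H-H distance fixed, only the six rigid-body degrees of
freedom remain, which removes the fast intra-molecular vibrations of the solvent.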

d. &SIMULATION To carry out the simulation in the NPH ensemble, a new command
should be added to the environment &SIMULATION:

ISOSTRESS PRESS-EXT 0.1 BARO-MASS 40.0 COMPR 5.3D-04

This command instructs ORAC to run a simulation allowing only for isotropic changes of
the simulation box volume. This is done by using the Andersen [17] extended Lagrangian
method. While the imposed external pressure (in MPa) is provided after the keyword
PRESS-EXT, ORAC computes the mass of the barostat according to Eq. V.61. To this end, the
command ISOSTRESS optionally reads (as is done in the example) the frequency of the
barostat (keyword BARO-MASS) and the system compressibility (keyword COMPR), corresponding
to $\omega_Q$ and $B^{-1}$ in Eq. V.61, respectively. Finally, the system temperature defined
by the command TEMPERATURE is left unchanged at the value of 300 K used in Step II.
e. &INTEGRATOR, &RUN and &POTENTIAL We choose a time step of 1.0 fs by using
the command TIMESTEP 1.0 of the environment &INTEGRATOR and replace MTS_RESPA with
SINGLE_STEP. In the environment &RUN, we change the length of the rejection phase to
2000.0 fs and set the length of the production phase to 6000.0 fs. Finally, in
&POTENTIAL we impose a direct lattice cutoff of 10.0 Å with the command CUTOFF 10.0.

2. Results and Output for Step III

Once the first 2000.0 fs of the rejection phase are completed, ORAC reports that:
Temperature has been rescaled 4 times

These four velocity rescalings occur near the beginning of the rejection phase, which means
that the sample was already somewhat thermalized. At the end of the simulation, after the
6 ps unscaled run, the average temperature is 315 K. As an example of ORAC's typical
output during the constant pressure simulation run, we show the intermediate results at time
t = 5994 fs during the production phase:
Tstep    =   5994.000   Total    = -42608.023   SlvPot   = -28504.079
SlvCoul  = -33046.281   SlvRec   =  -8600.088   SlvReal  = -24446.193
SlvInt   =       .000   SltTot   =  -8167.073   SltPot   = -12691.405
SltCoul  = -16082.588   SltL-J   =  -1174.510   SltHyd   =       .000
SltBond  =   4565.693   SltStr   =    685.674   SltBen   =   2329.648
SltItor  =    155.260   SltPtor  =   1395.112   S-SPot   = -12642.611
S-SCoul  = -16720.347   SltTemp  =    318.038   SlvTrTem =    322.703
SlvRoTem =    320.220   TotTemp  =    320.073
TotPre   =      69.33   ConPre   =     -44.02   KinPre   =     113.36
TmpPre   =      38.20   Volume   =   39431.87   PV       =     2.3748
      ..... cell parameters ....             ....  stress  .....
XYZ    34.0371   34.0371   34.0371    .1152E+06 -.7702E+05  .1324E+06
ABC    90.0000   90.0000   90.0000    .6704E+05 -.6433E+05  .6976E+05
       ........  ........  ........   .4689E+05  .1876E+06 -.3569E+06

With respect to the NVE runs, information about the instantaneous values of the pressure,
the cell parameters and the stress tensor is now added to the output. Since the
example only allows for isotropic volume changes, the angles do not vary and the edges
change only isotropically in the output (for a cubic lattice the three cell edges are equal).
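Two of the new entries can be cross-checked against each other. The Python sketch below is
ours and rests on two inferences from the printed numbers that the text does not state:
that TotPre is the sum of ConPre and KinPre, and that PV is the product of the external
pressure (PRESS-EXT, in MPa) and the instantaneous volume, expressed in kJ/mol.

```python
AVOGADRO = 6.02214076e23  # 1/mol

tot_pre, con_pre, kin_pre = 69.33, -44.02, 113.36  # values as printed
volume_A3 = 39431.87                               # Angstrom^3, as printed
p_ext_MPa = 0.1                                    # argument of PRESS-EXT

# Inferred relation: TotPre = ConPre + KinPre, within print rounding.
assert abs(con_pre + kin_pre - tot_pre) < 0.02

# P_ext * V converted from MPa * Angstrom^3 to kJ/mol:
# 1 MPa = 1e6 Pa, 1 Angstrom^3 = 1e-30 m^3, 1 kJ = 1e3 J.
pv_kj_per_mol = p_ext_MPa * 1e6 * volume_A3 * 1e-30 * AVOGADRO / 1e3
print(f"PV = {pv_kj_per_mol:.4f} kJ/mol")
```

Under these assumptions the computed product agrees with the printed PV to about four
significant figures; the small residual presumably comes from the physical constants used.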
The command PROPERTY in the environment &RUN can be used to compute system averages
and test whether the system has reached statistical equilibrium. This command is
active only in the production phase, after the rejection part of the simulation is over. It
instructs ORAC to print the running averages and their standard deviations at time intervals
defined by its argument. In an equilibrated sample the running averages and their standard
deviations must not change with time. The output produced by the command PROPERTY
at the end of our 6000 fs production run is shown in Fig. 8.
In Fig. 9 we plot the volume as a function of time for the total 8 picoseconds of the
run (2 ps of rejection and 6 ps of production). The average value in the 6 ps production phase
(indicated by the straight line) is 39509 Å$^3$, to be compared to the starting volume of 42875
Å$^3$. Thus, the cell has shrunk by 3366 Å$^3$ which, assuming a water molecular volume of 30 Å$^3$,
corresponds to the volume of about 112 water molecules. As in Steps I and II, the files
produced by ORAC at the end of Step III are the standard output, and the PDB and restart
files entered as arguments to the commands ASCII and RESTART of the environment &INOUT,
respectively.
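The volume change quoted above follows from a two-line calculation, reproduced here as a
Python sketch of ours. The 30 Å$^3$ per water molecule is the value assumed in the text.

```python
v_start = 35.0 ** 3  # Angstrom^3, box volume used in Step II
v_avg = 39509.0      # Angstrom^3, average over the 6 ps production phase
v_water = 30.0       # Angstrom^3 per water molecule (assumed in the text)

shrink = v_start - v_avg
n_equivalent_waters = round(shrink / v_water)

print(shrink, n_equivalent_waters)
```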

D. Step IV: Production Run with Multiple Time Steps and SPME
Step IV consists in a production run carried out with a fast and energy-conserving r-RESPA
algorithm. During such a run some properties of the system at equilibrium are
computed and analyzed. As stated previously, ORAC can compute at run time some general
properties of the system such as root mean square displacements or power spectra of velocity
autocorrelation functions. For a more complete analysis the coordinates of all particles in
the system can be written to a file in binary or ASCII format.
The input file for Step IV is similar to that discussed for Step II and is shown in Fig. 10.
With respect to Step II we must change a series of environments. First, we remove
the command INSERT of the &SOLUTE environment. Then, we replace the cell parameters
in the &SETUP environment with those obtained from Step III, i.e. with cell edges of
34.0590 Å. Again in the same environment, the PDB coordinate file obtained from Step
III must be read by READ_PDB. In addition, the commands MOLECULES 836 and READ_PDB
should be added to the environment &SOLVENT. While the former indicates that there are
836 solvent molecules, the latter states that the coordinates of the solvent must be read from
a PDB file.
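The cell edge used here appears to be the edge of a cubic box whose volume equals the
average volume of the Step III production phase. This is our inference from the numbers,
not a statement of the text, and the Python sketch below simply checks it.

```python
v_avg = 39509.0              # Angstrom^3, average volume from Step III
edge = v_avg ** (1.0 / 3.0)  # cubic cell: edge = V^(1/3)

print(f"{edge:.4f}")
```

The cube root reproduces the quoted 34.0590 Å to the printed precision.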
The r-RESPA integration algorithm needs to be changed in &INTEGRATOR. Indeed, the
algorithm used during the rejection phase does not conserve the total energy to an acceptable
level for a production run. Thus, we decrease the th timestep and modify the relative
magnitude of the other time steps:
&INTEGRATOR
TIMESTEP 10.2
MTS_RESPA
step intra 3
