Applying geometry constraints, freezing coordinates and fixing distances, etc

A wide range of methods exist that allow constraints to be applied to a geometry. The most important are:

Cartesian coordinates can be fixed by setting the appropriate optimization flags to zero.
Internal coordinates can be fixed by setting the appropriate optimization flags to zero.
Cartesian coordinates can be set equal using symmetry
Internal coordinates can be set equal by using symmetry
An interatomic distance can be set to a multiple of another interatomic distance
An atom can be frozen in a DRC
Individual internal coordinates can be fixed in a Cartesian geometry optimization
The overall structure can be biased in favor of a reference geometry
An overall structure can be biased (moved) towards the structure on the other side of a barrier.

Cartesian coordinates can be fixed by setting the appropriate optimization flags to zero.

Any coordinate, internal or Cartesian, can be optimized if the optimization flag is set to "1" and the coordinate is not otherwise defined, for example by SYMMETRY. This flexibility applies to all atoms, including atoms 1, 2, and 3. If the coordinates of atom 1 are flagged to be optimized, and atom 1 has no connectivity (the usual situation), then it is defined in Cartesian coordinates and its position can change during a geometry optimization. If atom 2 has a connectivity, then setting the angle or dihedral is meaningless, and these will be turned off before the calculation starts. The same applied to the dihedral of atom 3, if it, too, is defined in internal coordinates.

In this example, an acetylene molecule is oriented along the "z" axis, and the second carbon atom is fixed in space at coordinates (0.0, 0.0, 1.0). This calculation is most easily set up using Cartesian rather than internal coordinates. Acetylene is perfectly linear, so the "x" and "y" coordinates can be fixed at 0.0. On optimizing this system, carbon atom 1 would move down along the "z" axis, in order from the C-C distance to increase from 1.0 Ångstroms.

 PM6
 Acetylene oriented along the "z" axis, 
 Atom 2 is fixed at coordinates (0.0, 0.0, 1.0)
  C     0.00000000 +0   0.00000000 +0   0.00000000 +1 
  C     0.00000000 +0   0.00000000 +0   1.00000000 +0 
  H     0.00000000 +0   0.00000000 +0   2.00000000 +1 
  H     0.00000000 +0   0.00000000 +0  -1.00000000 +1

Freezing Cartesian coordinates is useful when positioning charged species in an electric field, for example a fluoride ion at coordinates (0.0, 0.0, 1.0) when a field of 1V per Ångstrom is applied along the "z" direction. Less common is the use of Cartesian coordinates in small systems.

Internal coordinates can be fixed by setting the appropriate optimization flags to zero.

Consider chloromethane, CH₃Cl, in which two constraints are applied: the C-Cl distance is fixed at 1.7 Ångstroms, and one C-H distance is to be fixed at 1.1 Ångstroms. The data set, in internal coordinates, might look like this:

  PM6
Methyl chloride, C-Cl distance defined as 1.7 Angstroms,
 One C-H distance defined as 1.1 Angstroms
 Cl         0.0 +0    0.0000000 +0    0.0000000 +0                    
  C         1.7 +0    0.0000000 +0    0.0000000 +0    1    0    0
  H         1.0 +1  109.2843786 +1    0.0000000 +0    2    1    0  
  H         1.1 +0  109.2843786 +1  120.0000000 +1    2    1    3   
  H         1.0 +1  109.2843786 +1 -120.0000000 +1    2    1    3

In this system, there are only two constraints although there are eight optimization flags set to zero. This is because six of the optimization flags are for the three trivial translations (atom 1) and three trivial rotations (atoms 2 and 3)

Cartesian coordinates can be set equal using symmetry

This is a rarely used option. An example is given in Figure 2 of SYMMETRY.

Internal coordinates can be set equal by using symmetry

Many small molecules have symmetry in that various bond lengths or angles or dihedrals are equal, or are simply related. For an example, see Figure 1 of SYMMETRY. If symmetry is used, and a dependent coordinate is flagged for optimization, the flag will be turned off before the calculation is started.

An interatomic distance can be set to a multiple of another interatomic distance

In some solids, particularly the binary solids such as NaCl or CsCl, the number of independent geometric variables is small. In the cases just mentioned, the number is precisely one. By setting up a symmetry-constrained calculation involving only one variable, the computational effort required for geometry optimization can be reduced considerably. An example of the steps involved in setting up such a calculation are as follows:

First, set up the Large Unit Cell (LUC) system for NaCl in internal coordinates, see Tv.. For a 3 by 3 by 3 LUC constructed using MAKPOL, the resulting data set has 108 NaCl units (Z=4).
Run a 0SCF calculation with AUTOSYM.. Edit the resulting arc file to make a new data set. During this process, change every angle optimization flag to zero - all the angles are symmetry-defined by the space-group, and therefore should not be optimized.
There are five interatomic distances that are flagged for optimization, but all of them are related to the distance between atoms 1 and 2, that is, to the bond-length of atom 2. This distance, 2.82Å, can be used to symmetry-define all the other distances using symmetry function 19. Starting with the second unique bond-length, 3.988Å, divide that distance by the reference distance to give a ration of 1.4142. This is then used in constructing the symmetry function:
2 19 1.4142 3.
This function is put at the start of the symmetry functions, just after the Z-matrix. The optimization flag for the bond-length of atom 3 can then set to zero, as it is now symmetry defined to that of atom 2. The same process is then done for the remaining three distances. In each case, the multiplier is an easily recognizable number: 2^1/2, 3^1/2, 2, and 6.
The number of optimization flags in the new symmetry-defined data-set having now been reduced to exactly one, geometry optimization can be completed in one line-search.

An atom can be frozen in a DRC

In a DRC, atoms move in response to the forces acting on them. For a given force, the heavier atoms move less than light atoms. If, for any reason, an atom needs to be frozen, set its apparent weight to a large number. For example, if a carbon atom should be fixed, replace the symbol "C" by "C9999999" or "C1.d12" or other mass. The frozen atom will still move, but not as much as an ordinary atom. To verify that the frozen atom is in fact not moving, use LARGE=-1

Individual internal coordinates can be fixed in a Cartesian geometry optimization

Because large molecule geometries are normally defined in Cartesian coordinates, it might seem difficult to lock individual bond-lengths or angles so that their values cannot be modified during the optimization. The answer is to define individual atoms using internal coordinates. Suppose that, in a large molecule, hydrogen atom 3000 needs to be defined as being 1.1 Å from oxygen atom 2999. Then using a graphics package, locate H3000 and O2999, and two other atoms that could be used in defining the angle and dihedral in an internal coordinate connectivity. Suitable atoms might be the atom O2999 is connected to (e.g. C2967) and the next atom (e.g. N2960). Of course, any atoms could be used provided their atom-number is lower than the atom being defined, i.e. here lower than 3000. Once the four atoms (here H3000, O2999, C2967, and N2960) are identified, work out the angle and dihedral. Use these data in constructing the internal coordinate definition for H3000. If the H3000-O2999-C2967 angle was 120°, and the H3000-O2999-C2967-N2960 torsion was 180°, then the position of the hydrogen atom with a fixed O-H distance would be:

H 1.1 0 120 1 180 1 2999 2976 2960

If the angle needed to be fixed, then the line would be:

H 1.1 1 120 0 180 1 2999 2976 2960

and so on. There is no restriction on the number of internal coordinates that can be used. Indeed, entire hetero groups can be defined as being rigid, but free to move within a cavity. In such a case, the first atom in the hetero group would be defined in either unconstrained internal or Cartesian coordinates, the next atom would be in internal coordinates and have its bond-length fixed, the next atom would be internal coordinates and have a frozen bond length and angle, and all the remaining atoms in the hetero group would be in internal coordinates and have all coordinates frozen.

In a small molecule, in this example: toluene, if the molecule is defined in Cartesian coordinates and a C-C bond-length is to be frozen, then a suitable data set would be:

PM6
Toluene
  Cartesian definition, but with one C-C distance frozen
  C        -0.00997463 +1   -0.0015231 +1   -0.0000930 +1                    
  C         2.79787104 +1   -0.0096765 +1   -0.0012704 +1 
  C         0.69635100 +1    1.2055923 +1    0.0410044 +1
  C         0.68930681 +1   -1.2109956 +1   -0.0416784 +1
  C         2.09215830 +1    1.2048435 +1    0.0406317 +1 
  C         2.08705695 +1   -1.2186939 +1   -0.0423481 +1  
  H        -1.09719779 +1    0.0021777 +1    0.0002893 +1  
  H         0.15466169 +1    2.1496180 +1    0.0735817 +1  
  H         0.14334405 +1   -2.1525875 +1   -0.0736970 +1  
  H         2.63556968 +1    2.1465054 +1    0.0727064 +1   
  H         2.62170344 +1   -2.1652386 +1   -0.0748682 +1  
  C         1.5 0 120 1 180 1 2 5 3
  H         4.68822161 +1    0.4932505 +1    0.8985801 +1 
  H         4.72735404 +1   -1.0061093 +1   -0.0329544 +1    
  H         4.68721477 +1    0.5478562 +1   -0.8700774 +1

A GUI is essential for working out the bond-length, angle, and dihedral to be used.

What to when the atoms to be used in the connectivity occur after the atom whose bond length or angle is to be locked

Quite often, the geometry does not allow a bond-length or angle to be defined, because the atoms that would be used in the connectivity are defined after the atom of interest. Fortunately, there is an easy way to solve this problem: the order of occurrence of Cartesian atoms is not important, so simply move the atoms to be used in the connectivity to before the atom whose bond length or angle is to be defined. Alternatively, move the atom whose bond length or angle is to be defined to the end of the data set.

The overall structure can be biased in favor of a reference geometry

X-ray structures can be improved (!) by combining the accuracy of prediction of bond-lengths and angles with the accurate X-ray tertiary structure. This is done using the GEO_REF="<text>" option. To use this option, first prepare a data-set starting with the X-ray structure. Resolve any structural or positional disorder, add hydrogen atoms as necessary, then optimize the positions of the hydrogen atoms. The result would be a data-set representing a chemically sensible structure in which all the non-hydrogen atoms are based on the X-ray structure. Copy this data set and give the copy a name such as ref.mop. In the original data set, add keywords GEO_REF="/Users/<myname>/ref.mop" or GEO_REF="M:/<pathname>/ref.mop". The default bias is 3 kcal/mol/Ångstrom. This can be changed by adding the bias after the closing quotation mark, as in GEO_REF="/Users/<myname>/ref.mop"10.

An overall structure can be biased (moved) towards the structure on the other side of a barrier.

Locating transition states has traditionally been difficult. However, if data sets representing the reactants and products are available, the GEO_REF="<text>" option can be used for moving the reactant geometry towards that of the product, and vice-versa. Let the reactant geometry be R.mop and the product geometry be P.mop. The product geometry can be moved in the direction of the reactant geometry by adding the keyword GEO_REF="/Users/<myname>/R.mop" to the product geometry data set. When that constrained optimization is run, the result is a geometry that is nearer to the reactant geometry than the starting geometry. If the constraint were subsequently removed, and the geometry re-optimized, the geometry would slide back downhill in the direction of the original product geometry. For more information on locating transition states, see protein calculations.