MOPAC utility program "JOIN"

Program "JOIN" is a utility for joining together two parts of a reaction coordinate trajectory created using the Dynamic Reaction Coordinate, DRC, the Intrinsic Reaction Coordinate, IRC, or a reaction path. These trajectories can be modeled using JSmol.

Only the Windows version of JOIN is available. To install JOIN download "JOIN" and place the executable in a location where you keep other .exe files. If that folder is not in the PATH, add a short file named JOIN.cmd file to a folder that is in the PATH, this file should contain a single line of the type "C:\Users\yourname\location of JOIN\JOIN.exe %1 %2"

JOIN is run at the command-prompt level, in the folder that contains the arguments. If JOIN is run without any arguments, it will print a brief message describing what it can do.

One argument: Quite often a trajectory needs to be reversed. For example, a reaction coordinate is normally written as going from a reactant over a transition state to a product, and the ΔH_f of the product is lower than that of the reactant. The reactant is usually on the left of the graph and the product on the right. If a trajectory is calculated, and the positions of the reactant and products are reversed, then the reaction path can be reversed using Join.

JOIN can be used to create a multi-step reaction, i.e., one where a reactant goes over a transition state then down to an intermediate that has a higher energy than the reactant, then the intermediate goes over a transition state to a lower-energy product. In the first reaction, the left-right convention is reversed, so that the intermediate (with a ΔH_f higher than that of the reactant) is on the right of the graph. If the graph is the wrong way round, it can be reversed using JOIN.

In all cases, the original trajectory is over-written.

Two arguments: When two arguments are used, the trajectory specified by the second argument is attached to the end of the trajectory specified by the first argument.

The output from JOIN when the two arguments are used is named "new.xyz" If JSmol had been installed, see instructions, then the new reaction path can be animated. A worked example of such an animation is included in the ZIP file

Notes

Suggested procedure for building a multi-step Intrinsic Reaction Coordinate reaction path

First, generate all the transition states, then use IRC=1* to generate the reaction coordinate for each reaction step. Here are the steps needed for each IRC calculation:

Run a FORCE calculation to work out the reaction coordinate, and use keyword ISOTOPE to store the force matrix. This will be used in each of the IRC calculations. This step does not need to be done if the system is small and runs fast.

Set up an IRC job, using the same geometry and same name as that used in the FORCE calculation. Use keyword IRC=1* if the system is small enough to run in a reasonable time. If it is big, make up two jobs, one with IRC=1 and one with IRC=-1, the results of these two half-path jobs can be joined together to give the same results as IRC=1*.

If the FORCE calculation had been done, then add keyword RESTART, this will read in the FORCE matrix, which will be used in generating the reaction coordinate normal mode required by IRC=1.

Decide on a suitable step-size, e.g., X-PRIORITY=0.2. This step-size should be used for all parts of a reaction path. It should not be changed for any step, if it is changed any resulting animation or graph will be distorted.

A useful option at this point is to include HTML. This will result in a simple HTML web-page being constructed that can be used to verify that the correct path is being generated. A common fault is that both the forward IRC=1 and reverse IRC=-1 paths are the same, this typically occurs when the potential energy surface at the top of a barrier is relatively flat. An easy way to ensure that the paths are correct is to add KINETIC=20. This shifts the starting point for the IRC path a significant distance in the appropriate direction

To minimize the chance of artifacts being created by insufficient precision, add keyword PRECISE.

Add CYCLES=10000 to ensure that long IRC's can be created.

If the behavior of the dipole during a reaction is of interest, add DIPOLE.

If all the keywords just described were used, then the keyword set would consist of "IRC=1* HTML X-PRIORITY=0.2 CYCLES=10000 PRECISE KINETIC=20 RESTART DIPOLE"

Give the jobs for the various trajectories simple descriptive names, such as "start.mop", "Step-2.mop", "end.mop", etc. These will produce trajectories that have names ending in .xyz, e.g., start.xyz. Don't use long names - that would just result in more work typing then in when using JOIN.

Second, optimize the geometry at the end of each path. IRC paths terminate before the exact stationary point is reached because the forces acting on the atoms become so small that the IRC starts to wander erratically. To include the exact stationary point, a very short path, consisting of a single geometry, must be made.

Edit an IRC path, to isolate a terminal geometry, then use that as a starting point for a normal geometry optimization.

This geometry optimization should use keywords PRECISE and HTML. Again, check that the resulting geometry looks correct.

In the data-set, after the geometry, add a second calculation using keywords OLDGEO IRC CYCLES=1 and, optionally DIPOLE. This will generate the stationary point in the path format. Remember: short names are preferred. Run that job.

Third, start to join the various parts together. These should be joined in the order in which they occur in the reaction path.

If half-paths were used, then join up all matching half-paths to form a full path of the type 1*, i.e. the energy goes from a minimum to a transition state to a second minimum. Both half-paths include a point representing the transition state, so edit one of the half-paths to delete the first model; this is the transition state. Then reverse the half-path that would come before the other half path, finally join the two parts using JOIN.

Start with the stationary point at the beginning and the first path. Join these together using a command of the type: JOIN exact-start path-1 This will generate a new file named new.xyz that has the two trajectories joined together.

Examine the file new.xyz by running new.html (this file is included in the JOIN zip file) If it looks correct, rename it as, e.g., start-path-1.xyz.

Join the new path to the stationary point at the end of the first path, e.g. JOIN start-path-1 exact-intermediate-1.

Examine the new file new.xyz, and if it looks correct, rename it as, e.g., start-path-1-int-1.xyz.

Join the new path to the next path, e.g., JOIN start-path-1-int-1 path-2 , check that it's correct, the re-name it as, e.g., start-path-1-int-1-path-2.xyz

Continue adding stationary points and path segments until the entire trajectory is constructed.

This sequence is obviously quite tedious to carry out, and it would need to be run several times, so as soon as a step is done correctly, put the command into a Windows command script. An example of such as script is:

REM
REM Generate whole of reaction path for heats
REM
JOIN "Exact start" "IRC for axial chair - boat"
if exist "axial-TS1.xyz" del "axial-TS1.xyz"
rename new.xyz "axial-TS1.xyz"
REM
JOIN "axial-TS1" "Exact middle"
if exist "axial-TS1-intermediate.xyz" del "axial-TS1-intermediate.xyz"
rename new.xyz "axial-TS1-intermediate.xyz"
REM
JOIN "axial-TS1-intermediate" "IRC equatorial chair - boat"
if exist "axial-TS1-intermediate-TS2.xyz" del "axial-TS1-intermediate-TS2.xyz"
rename new.xyz "axial-TS1-intermediate-TS2.xyz"
REM
JOIN e "axial-TS1-intermediate-TS2" "Exact end"
if exist "axial-TS1-intermediate-TS2-equatorial.xyz" del "axial-TS1-intermediate-TS2-equatorial.xyz"
rename new.xyz "axial-TS1-intermediate-TS2-equatorial.xyz"

This will simplify construction of the whole path, and reduce the probability of making a typographic error.

Excel

Microsoft Office program Excel can be used for making graphs that are suitable for publication. The starting point for this is a file that holds the entire trajectory. A list of the heats of formation for the various points can then be extracted from the trajectory file using Windows "find" command. For example, if the trajectory is named axial-TS1-intermediate-TS2-equatorial.xyz, then a file containing the heats of formation can be generated using:

find "HEAT" axial-TS1-intermediate-TS2-equatorial.xyz > bits.txt

The file bits.txt can then be used in Excel. Two useful operations within Excel are (1) Add a column, Reaction coordinate, containing the reaction coordinate in Ångstroms (the Profile number minus one times the step-size defined in X-PRIORITY=n.nn), and (2) add a column, Heat, containing the heats of formation minus the lowest heat of formation, normally the last point in the trajectory. By plotting Heat against Reaction coordinate, a traditional potential energy profile graph can be made.

How the joins are performed

Normally there will be a small miss-match in the geometries at the end of one trajectory and the start of the next trajectory. To prevent a discontinuity, the size of the miss-match is evaluated, and using the step-size, a number of new points are added. Typically this would be one to six points.

The orientation of the geometry of the system at the end of one trajectory would normally be different to that at the start of the next trajectory. Because orientation is not important from a chemistry perspective, the geometry at the start of the second trajectory is rotated and translated to minimize the distance between the it and the final geometry in the first trajectory. This rotation and translation is then applied to every point in the second trajectory.

A consequence of this operation is that only the geometries in the first trajectory are those calculated by MOPAC, all the rest are rotated and translated.

Distances between points on a trajectory

Two types of distance are used in describing reactions. One is the distance in terms of the IRC. To calculate this, the difference in Profile number of the two points, typically stationary points, either stable minima or transition state geometries, is evaluated and then multiplied by the step-size defined in X-PRIORITY=n.nn. This would be the same as the distance in the Excel chart. The other distance would be the straight-line distance between the two geometries. Distances of this type should all be calculated using geometries taken from the same trajectory file; do not use any of the geometries that were used in making the trajectory file. Convert each geometry of interest into a data-set, and use keywords 1SCF HTML, then run the job using MOPAC. Distances between pairs of points can then be calculated using data sets containing the following instructions:

COMPARE GEO_DAT="First geometry.arc" +
GEO_REF="Second geometry.arc"
Compare "First geometry" and "Second geometry"

The ARC files contain the heat of formation of the system. This is useful for checking that the correct geometry is being used. A concise summary of the differences would be given in the HTML file generated by that run. By definition, all distances calculated using the IRC would be equal to or greater than the equivalent straight-line distance. The length of a piece of string is always equal to or greater than the distance between its ends.