Cautions and Warnings regarding MOPAC

MOPAC is a very powerful program, and there is a strong temptation for users to assume that it can correctly model all types of system. Simply put: this is not correct. In order to get useful results, the input data set must represent a chemically sensible system. If this condition is satisfied, all will be well: researchers will see their projects succeed, funding agencies will get good reports, and everyone will be happy. If this condition is not met, then hours, days, even weeks, can be spent trying to make sense of what's going on. People new to MOPAC will assume it's faulty and abandon it in disgust; research students' projects will fail, and they'll flunk out of their course and become unemployable; industrial chemists will not produce results, their research will falter, and they'll not get their annual bonus. In short, they'll become unhappy. So in light of all these disastrous possibilities, it is very important that certain precautions are taken.

First, think about what you want to do.

New MOPAC users: If you're new to MOPAC, then spend a few days playing with it before starting any serious work. Run small systems, ones that use almost no time. Try things. Find what works and what doesn't work. Remember - it is completely impossible to hurt MOPAC, and it is very difficult to damage a computer. When MOPAC starts to run, it creates its own little virtual world, and if something goes wrong, the virtual world disappears. No harm done. If you find yourself thinking, "I wonder if this will work?" then TRY IT! Experiment. Gain skill and experience. Get a good Graphical User Interface - Jmol is excellent, and it's free - and start examining the system. Do not ask for help of the type "What would happen if I tried to calculate a hydrogen atom that had a charge of +3?" TRY IT YOURSELF. These little experiments produce useful information faster than hunting for an expert to ask, and they avoid wasting the expert's time. If something goes wrong, read the output. There will be an error message saying what went wrong, and giving advice on what to do next.
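As one way to start experimenting, the sketch below builds the text of a tiny data set for a single water molecule. It assumes the usual MOPAC layout: a keyword line, two title or comment lines, then one atom per line given as element, x, y, z, with each coordinate followed by an optimization flag. The function name `tiny_input` and the rough starting coordinates are illustrative assumptions, not part of MOPAC.

```python
def tiny_input(keywords, title, atoms):
    """Return the text of a small MOPAC data set.

    atoms is a list of (element, x, y, z) tuples, coordinates in Angstroms.
    """
    # Line 1: keywords; lines 2 and 3: title and comment.
    lines = [keywords, title, "Generated for a quick experiment"]
    for element, x, y, z in atoms:
        # The "1" after each coordinate marks it as free to be optimized.
        lines.append(f"{element:<2} {x:10.4f} 1 {y:10.4f} 1 {z:10.4f} 1")
    return "\n".join(lines) + "\n"


# Rough starting geometry for water; the optimizer is expected to refine it.
water = [
    ("O", 0.0, 0.0, 0.0),
    ("H", 0.96, 0.0, 0.0),
    ("H", -0.24, 0.93, 0.0),
]
text = tiny_input("PM7", "Water: a first test system", water)
print(text)
```

Saving the printed text to a file and running MOPAC on it should take almost no time. The same helper covers the "hydrogen atom with a charge of +3" experiment: pass `"PM7 CHARGE=3"` as the keywords and a single `("H", 0.0, 0.0, 0.0)` atom, then read what the output says.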

Intermediate MOPAC users: Before starting work on a project, spend some time thinking about what you want to do. There is an old saying that two weeks in the research lab can save you having to spend two hours in the library. The same is true with projects. Sketch out the work that needs to be done. Are all the resources available? Do I understand the issues involved in the chemistry? Does the chemistry I want to model make sense? If the work succeeds, will it tell me anything that will advance the project? Once all this is done, and after you've decided on a plan of action, a useful tactic is to spend some time trying to model parts of the project. There's a good chance that you'll uncover previously unsuspected problems. If your system is big, then run simulations on much smaller systems first, to get a feel for the issues involved. Again, the objective is to avoid wasting your time. CPU time isn't important - your time is.

Advanced MOPAC users: Ideally, any mistakes made should be caused by MOPAC or the methods inside it, and not by you. If the system crashes, please send the faulty data set and output file to openmopac@gmail.com, and it'll be examined to find the cause of the crash. If you have doubts as to the accuracy of the results, compare them with trusted facts. There are three main types of "fact". First, experimental results are the most reliable. Of course this assumes that the experimental work was done correctly, and that no errors occurred in copying or collating the data. If the experimental results were obtained correctly, then these are the nearest to the truth, i.e., nearest to the true nature of things. Put another way, the assumption, if not definition, of correctness is Nature itself. One caveat is worth noting: some quantities, such as the dipole moment and ionization potential (I.P.), are not only hard to measure experimentally, they sometimes represent something different from what can be calculated. The second type of "fact" is a result from a high-level ab initio calculation, e.g., a CCSD(T)/CBS calculation. Where high-level and semiempirical results differ, the high-level results should be considered as correct, and the semiempirical results as being in error. Don't trust the common lower-level methods such as HF/6-31G(d) or even B3LYP/6-31G(d). If the semiempirical and lower-level results differ, all that can be said is that at least one of the methods is faulty - this is not a satisfactory situation. The third type of "fact" is the type everyone knows by virtue of the fact that they are chemists. This type is quite often misleading. For example, if the retinal group: (...C(Me)=CH-CH=CH-C(Me)=CH-CH=N(Me)) is protonated, everyone knows that the proton goes onto the nitrogen atom of the Schiff base, and "everyone knows" that the nitrogen becomes quaternary, i.e., (...C(Me)=CH-CH=CH-C(Me)=CH-CH=N(+)H(Me)). But if this system is modeled, even using semiempirical methods, the result is quite different - the cationic charge is delocalized over the conjugated chain, and the nitrogen atom is essentially neutral. As soon as the idea of delocalization is suggested, the fallacy of the idea that there is a localized cationic charge on the nitrogen atom is obvious. So be careful of predigested ideas of the type "everyone knows."

Common mistakes made by MOPAC users

Charged organic systems: In organic chemistry, particularly in biochemistry, solvation is usually essential if a charge is involved. Naked charged organic systems do not occur except inside a mass spectrometer - a very unusual environment. This is because the electric field of the ion has a powerful effect on its surroundings. So if, for example, the interaction of a proton with N-methylacetamide were to be examined, the gas-phase system [CH₃-NH-CO-CH₃ + H(+)]⁺ would be unrealistic, i.e., not a good model. Using a hydronium or Eigen ion, [CH₃-NH-CO-CH₃ + H₃O(+)]⁺, or even the Zundel ion, [CH₃-NH-CO-CH₃ + H₅O₂(+)]⁺, would not be much better. The best approach would be to use a solvation model, and then use the Eigen or Zundel ion.

Likewise, neutral amino acids never occur in nature; they always exist as the zwitterion, stabilized by the aqueous environment. For these systems, a gas-phase calculation would give a completely misleading result.
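A minimal sketch of how the keyword line for such a system might be assembled is given below. It assumes the MOPAC keywords `CHARGE=n` for the net charge and `EPS=n.n` for the dielectric constant of the COSMO solvation model, with 78.4 as the value commonly used for water. The helper name `keyword_line` and its defaults are hypothetical and chosen only for illustration.

```python
def keyword_line(method="PM7", charge=0, eps=None):
    """Build a keyword line for a possibly charged, possibly solvated system."""
    words = [method]
    if charge != 0:
        # Net charge on the whole system.
        words.append(f"CHARGE={charge}")
    if eps is not None:
        # Dielectric constant for the implicit-solvent (COSMO) model.
        words.append(f"EPS={eps}")
    return " ".join(words)


# Protonated N-methylacetamide plus water molecules, in implicit water.
print(keyword_line(charge=1, eps=78.4))
```

A zwitterionic amino acid has no net charge, so only the solvent term would be needed there: `keyword_line(eps=78.4)`.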

Slab work for surfaces: Surfaces are usually modeled by using a slab of the solid. A typical slab consists of a parallelepiped cut from the solid, with sides of at least 10 Ångstroms, and about 7 - 10 Å thick. Any Miller indices can be used for the surface, with the commonest being {100}, {010}, {001}, and {111}. The most frequent mistake is failing to ensure that valency requirements are satisfied. Consider a slab of a simple metal oxide, MO₂, M = Ti, Zr, or Hf. The metal atom has a 4+ oxidation state, and no "d" electrons. If the slab consisted only of metal and oxygen atoms, then the ratio of metal to oxygen would need to be exactly 1:2. If there were an excess of metal atoms, then at least some of the metal atoms would need to be in the M(3+), i.e., M(III), state, or lower. As this oxidation state has a dramatically different chemistry from the M(IV) state, any results obtained would be misleading. If there were an excess of oxygen atoms, then some oxygen atoms would need to be part of a peroxide system - again, not what was intended. So the ratio of 1:2 is not flexible at all.

The ratio 1:2 would only be necessary if the slab consisted only of metal and oxygen atoms. But a more realistic model could be constructed if hydrogen atoms were allowed. In the pure metal-oxygen slab, atoms on the surface of the slab might have unwanted valences - an oxygen atom might be covalently bonded to only one metal atom, or bonded to three metal atoms. Either way, its valency would be strained. By adding hydrogen atoms to convert some oxygen atoms into hydroxide, the strain introduced by the exposed surface could be reduced. Of course, for every two hydrogen atoms added, an extra oxygen atom would also need to be added. Put another way, the surface should be hydrated (but not solvated!).
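The bookkeeping described above can be checked before any calculation is run. The sketch below assumes formal charges of +4 for the metal, +1 for hydrogen, and -2 for oxygen, and reports the net formal charge of a slab from its list of element symbols; zero means the valency requirements can be satisfied. The function name `slab_charge_balance` is hypothetical, and the check says nothing about where the hydrogen atoms are placed.

```python
from collections import Counter


def slab_charge_balance(elements, metal="Ti", metal_oxidation_state=4):
    """Return the net formal charge of a (possibly hydrated) MO2 slab.

    Assumes formal charges: metal +4 (by default), H +1, O -2.
    """
    counts = Counter(elements)
    unknown = set(counts) - {metal, "O", "H"}
    if unknown:
        raise ValueError(f"unexpected elements: {sorted(unknown)}")
    return metal_oxidation_state * counts[metal] + counts["H"] - 2 * counts["O"]


bare = ["Ti"] * 8 + ["O"] * 16
hydrated = bare + ["O"] * 2 + ["H"] * 4   # two water molecules added
metal_rich = ["Ti"] * 9 + ["O"] * 16      # one metal atom too many

print(slab_charge_balance(bare))        # 0
print(slab_charge_balance(hydrated))    # 0
print(slab_charge_balance(metal_rich))  # 4
```

A positive result points to excess metal (some atoms would have to be reduced below the 4+ state), and a negative result points to excess oxygen.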

PDB files: Input data sets must represent realistic chemical systems. Before a protein, from the PDB, can be used, it must first be converted into a chemically reasonable system. The main changes are to resolve positional and structural disorder, if any, and, most important, to add hydrogen atoms. PDB files typically do not include hydrogen atoms, and if they are not added, all results would be nonsense. Adding hydrogen atoms is not a trivial process: most programs get about 99% of the hydrogen atoms added correctly, a few atoms might be missed, and some extra atoms might be added. When a protein is first run using MOPAC, all errors in the number and positions of hydrogen atoms in proteins are printed. READ THE OUTPUT to find which residues are faulty, then correct the errors. Unfortunately, non-residues are not checked, so hetero groups, particularly porphyrin rings, should be checked very carefully.
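Before running MOPAC at all, a quick look at the PDB file can show how much preparation is needed. The sketch below is not MOPAC's own check; it assumes the standard fixed-column PDB layout (alternate-location flag in column 17, residue name in columns 18-20, element symbol in columns 77-78) and counts hydrogen atoms, atoms with alternate locations, and hetero groups other than water. The function name `summarize_pdb` and the file name in the usage note are hypothetical.

```python
def summarize_pdb(lines):
    """Summarize ATOM/HETATM records from an iterable of PDB lines."""
    heavy = hydrogen = alt_loc_atoms = 0
    hetero_groups = set()
    for line in lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        # Element symbol, columns 77-78; blank if the line is too short.
        element = line[76:78].strip().upper()
        if element in ("H", "D"):
            hydrogen += 1
        else:
            heavy += 1
        # A non-blank alternate-location flag signals positional disorder.
        if line[16:17].strip():
            alt_loc_atoms += 1
        resname = line[17:20].strip()
        if record == "HETATM" and resname != "HOH":
            hetero_groups.add(resname)
    return {
        "heavy": heavy,
        "hydrogen": hydrogen,
        "alt_loc_atoms": alt_loc_atoms,
        "hetero_groups": sorted(hetero_groups),
    }
```

A typical use would be `summarize_pdb(open("protein.pdb"))`. A hydrogen count of zero means hydrogen atoms still have to be added, a non-zero `alt_loc_atoms` means disorder has to be resolved, and every entry in `hetero_groups` deserves a careful manual check.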