Modeling proteins and Enzyme-Catalyzed Reactions


For the purposes of investigating protein behavior, it is useful to regard the program MOPAC as a complete laboratory, with all the utilities and tools needed for modeling protein behavior.  As with the tools in any laboratory, a certain level of skill is needed; a potential user who just wants to look at an enzyme-catalyzed reaction and is not willing to prepare a suitable model will inevitably be disappointed.  Potential users must be willing to do two things: first, to acquire the skills necessary for manipulating complicated models, and second, to construct and use such models.  These are difficult tasks, and a lot of patience is required.  To help with them, MOPAC provides a set of utilities that assist in building models and in detecting errors that occur during this process.

Any researcher who wants to use MOPAC for modeling proteins and is willing to invest the effort to learn how to do that has a right to expect the program to work as described in this Manual.  Obviously there are faults in MOPAC, but a serious effort has been made, taking more than a year, to check that the program works correctly, and to simplify identifying and correcting faults in protein data-sets. The current on-line manual and MOPAC program are the culmination of that effort.

Why should an experimentalist use computational chemistry modeling tools?  Probably the most important reason is that they provide an alternative to the other two approaches: experiment and understanding.  To give a simple example, consider an X-ray structure of a protein.  This structure is physics, not chemistry, and almost all (well over 90%) of the structures examined thus far have had mild to severe errors when looked at from a chemical point of view.  Examples include unrealistic single-bond lengths, e.g., C-C bonds of length ~1.1 Å instead of the expected 1.5 Å; unrealistic distances between atoms that are not covalently bonded; and simple errors such as a C-NH2 being mistaken for a C=O and vice versa.  These errors can be detected using MOPAC and, more importantly, they can be corrected.  The result is a structure that is much more chemically realistic than the starting PDB structure.  Put another way, simply building a model of a protein yields a structure that is more realistic, i.e., nearer to the structure of the biochemical that occurs in nature, than anything available from any other source.

To reiterate, for a model to be useful it must be realistic.  When an enzyme catalyzes a reaction, the energy changes involved are often very small, frequently less than the energy involved in making a single hydrogen bond.  If a fault in the model put it in a high-energy state, and that fault corrected itself later in the simulation, the resulting energy change would more than likely invalidate all the work that had gone before.  It is important, therefore, to make sure that the starting model is as error-free as possible.  Unfortunately, and without exception, all data sources for protein structures (including the most important one, the Protein Data Bank) have limitations.  These range from minor faults that were previously regarded as unimportant, such as missing hydrogen atoms, to quite severe geometric errors, such as carbon-carbon single bonds having a bond length of less than 1.4 Å.  Therefore, before any work can be done, a realistic model of the protein must be constructed.  This operation involves several steps, and great care must be exercised to ensure that the resulting model is as good as possible.
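To make the kind of geometric error described above concrete, the following is a minimal Python sketch that scans PDB-format coordinate records for pairs of carbon atoms closer together than 1.4 Å.  It is not how MOPAC detects or corrects such faults; it is only an illustration, written outside MOPAC, of what a suspiciously short carbon-carbon contact looks like in the raw data.  The function names parse_carbons and short_cc_contacts are hypothetical, the 1.4 Å threshold is taken from the text above, and the sketch assumes that each record carries an element symbol in columns 77-78 of the standard PDB fixed-column layout.

```python
import math


def parse_carbons(pdb_lines):
    """Collect (serial, atom_name, (x, y, z)) for every carbon atom.

    Assumes the standard fixed-column PDB layout and relies on the element
    symbol in columns 77-78; records without that field are skipped.
    """
    carbons = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[76:78].strip().upper() != "C":
            continue
        serial = int(line[6:11])
        name = line[12:16].strip()
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        carbons.append((serial, name, xyz))
    return carbons


def short_cc_contacts(pdb_lines, threshold=1.4):
    """Return (serial_i, serial_j, distance) for C-C pairs under threshold (in Å).

    Hypothetical helper: a plain all-pairs scan, adequate for a small example
    but slow for a whole protein.
    """
    carbons = parse_carbons(pdb_lines)
    flagged = []
    for i in range(len(carbons)):
        for j in range(i + 1, len(carbons)):
            distance = math.dist(carbons[i][2], carbons[j][2])
            if distance < threshold:
                flagged.append((carbons[i][0], carbons[j][0], round(distance, 3)))
    return flagged
```

A flagged pair is only a candidate for inspection.  Genuine double, triple, and aromatic carbon-carbon bonds are also shorter than 1.4 Å, so each contact still has to be judged chemically before it is treated as an error.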

MOPAC contains many Tools for use with Proteins.

Steps and processes involved in modeling Proteins