Modeling Proteins and Enzyme-Catalyzed Reactions

Introduction

Proteins and their properties can be modeled using MOPAC2016.  Until now, there have been only two methods for studying proteins: experiment (X-ray structures and biochemical experiments) and understanding (knowledge of biochemistry and of the behavior of proteins).  The methods and utilities in MOPAC2016 provide a third: a tool for modeling proteins and their reactions.  Modeling is fundamentally different from experiment and understanding, in that it provides a method for investigating protein behavior and for testing ideas.  Because the accuracy of predictions of protein behavior has increased in recent years, the computer model is now a practical and reliable intellectual tool.  If experimentalists can be persuaded that this new tool would be useful to them, then its use for studying proteins would amount to a paradigm shift in the field of biochemistry.

For the purposes of investigating protein behavior, it is useful to regard the program MOPAC as a complete laboratory, with all the utilities and tools needed for modeling protein behavior.  As with tools in any laboratory, a certain level of skill is needed; a potential user who just wants to look at an enzyme-catalyzed reaction and is not willing to prepare a suitable model will inevitably be disappointed.  Potential users must be willing to do two things: first, to acquire the skills necessary for manipulating the complicated models, and second, to construct and use such models.  These are difficult tasks, and a lot of patience is required.  To help with them, MOPAC provides a set of utilities that assist in building models and in detecting errors that occur during this process.

Why should an experimentalist use computational chemistry modeling tools?  Probably the most important reason is that they provide an alternative to the other two approaches: experiment and understanding.  To give a simple example, consider an X-ray structure of a protein.  This structure comes from physics, not chemistry, and almost all (well over 90%) of the structures examined thus far have had mild to severe errors when looked at from a chemical point of view.  Examples include unrealistic single bond lengths, e.g., C-C bonds of length ~1.1 Å instead of the expected 1.5 Å; unrealistic distances between atoms that are not covalently bonded; and simple errors such as a C-NH2 being mistaken for a C=O and vice versa.  These errors can be detected using MOPAC, and, more importantly, they can be corrected.  The result is a structure that is much more chemically realistic than the starting PDB structure.  Put another way, the result of simply building a model of a protein is a structure that is more realistic, i.e., nearer to the structure of the protein as it occurs in nature, than anything available from any other source.
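To make the bond-length example concrete, the following is a minimal sketch, in Python, of the kind of check described above: it flags pairs of carbon atoms that are closer together than any realistic C-C bond.  It is an independent illustration and not the method MOPAC uses.  The function name short_cc_bonds, the input layout (a label, an element symbol, and Cartesian coordinates in Å), and the 1.3 Å threshold are all assumptions made for this example.

```python
import math
from itertools import combinations

# Illustrative threshold in Angstroms (an assumption, not a MOPAC value):
# carbon-carbon bonds in standard amino acids are longer than this.
MIN_REASONABLE_CC = 1.3


def short_cc_bonds(atoms, min_reasonable=MIN_REASONABLE_CC):
    """Return (label_a, label_b, distance) for suspiciously close carbon pairs.

    atoms: iterable of (label, element, (x, y, z)) with coordinates in Angstroms.
    """
    # Keep only carbon atoms; other elements are ignored in this sketch.
    carbons = [(label, xyz) for label, element, xyz in atoms
               if element.upper() == "C"]
    flagged = []
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(carbons, 2):
        distance = math.dist(xyz_a, xyz_b)  # Euclidean distance (Python 3.8+)
        if distance < min_reasonable:
            flagged.append((label_a, label_b, distance))
    return flagged


if __name__ == "__main__":
    # Hypothetical coordinates: CA-CB is 1.10 A apart, CB-CG is 1.52 A apart.
    example = [
        ("CA", "C", (0.00, 0.0, 0.0)),
        ("CB", "C", (1.10, 0.0, 0.0)),
        ("CG", "C", (2.62, 0.0, 0.0)),
        ("N", "N", (0.00, 1.0, 0.0)),
    ]
    for label_a, label_b, distance in short_cc_bonds(example):
        print(f"{label_a}-{label_b}: {distance:.2f} A looks too short for a C-C bond")
```

The loop compares every pair of carbon atoms, which is adequate for a small fragment; for a whole protein a neighbor search would be needed to keep the cost manageable.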

To reiterate, for a model to be useful it must be realistic.  When an enzyme catalyzes a reaction, the energy changes involved are often very small, frequently less than the energy involved in making a single hydrogen bond.  If a fault in the model put it in a high-energy state, and later in the simulation that fault corrected itself, the resulting energy change would more than likely invalidate all the work that had gone before.  It is important, therefore, to make sure that the starting model is as error-free as possible.  Unfortunately, and without exception, all data sources for protein structures (in particular, the most important one: the Protein Data Bank) have limitations.  These range from minor faults that were previously regarded as unimportant, such as missing hydrogen atoms, to quite severe geometric errors, such as carbon-carbon single bonds with a bond length of less than 1.1 Å.  Therefore, before any work can be done, a realistic model of the protein must be constructed.  This operation involves several steps, and great care must be exercised to ensure that the resulting model is as good as possible.

MOPAC contains many tools for use with proteins.

Steps and Processes Involved in Modeling Proteins