Proteins and their properties can be modeled using MOPAC2016. Until now, there have been only two methods for studying proteins: experiment (X-ray structures and biochemical experiments) and understanding (knowledge of biochemistry and of the behavior of proteins). The methods and utilities in MOPAC2016 provide a third: a tool for modeling proteins and their reactions. This is fundamentally different from experiment and understanding, in that it provides a method for investigating protein behavior and for testing ideas. As a result of the increase in accuracy of prediction of protein behavior in recent years, the computer model is now a practical and reliable intellectual tool. If experimentalists can be persuaded that this new tool would be useful to them, then its use for studying proteins would amount to a paradigm shift in the field of biochemistry.
For the purposes of investigating protein behavior, it is useful to regard the program MOPAC as a complete laboratory, with all the utilities and tools needed for modeling protein behavior. As with the tools in any laboratory, a certain level of skill is needed; a potential user who just wants to look at an enzyme-catalyzed reaction and is not willing to prepare a suitable model will inevitably be disappointed. Potential users must be willing to do two things: first, to acquire the skills necessary for manipulating the complicated models, and second, to construct and use such models. These are difficult tasks, and a lot of patience is required. To help with them, MOPAC provides a set of utilities for building models and for detecting errors that occur during this process.
Why should an experimentalist use computational chemistry modeling tools? Probably the most important reason is that they provide an alternative to the other two approaches: experiment and understanding. To give a simple example, consider an X-ray structure of a protein. This structure is a product of physics, not chemistry, and almost all (well over 90%) of the structures examined thus far have had mild to severe errors when looked at from a chemical point of view. Examples include unrealistic single bond lengths, e.g., C-C bonds of length ~1.1 Å instead of the expected 1.5 Å; unrealistic distances between atoms that are not covalently bonded; and simple errors such as a C-NH2 being mistaken for a C=O or vice versa. These errors can be detected using MOPAC and, more importantly, they can be corrected. The result is a structure that is much more chemically realistic than the starting PDB structure. Put another way, the result of simply building a model of a protein is a structure that is more realistic, i.e., nearer to the structure of the biochemical that occurs in nature, than anything available from any other source.
To reiterate, for a model to be useful it must be realistic. When an enzyme catalyzes a reaction, the energy changes involved are often very small, frequently less than the energy involved in making a single hydrogen bond. If a fault in the model put it in a high-energy state, and that fault corrected itself later in the simulation, the resulting energy change would more than likely invalidate all the work that had gone before. It is important, therefore, to make sure that the starting model is as error-free as possible. Unfortunately, and without exception, all data sources for protein structures (in particular, the most important one: the Protein Data Bank) have limitations. These range from minor faults that were previously regarded as unimportant, such as missing hydrogen atoms, to quite severe geometric errors, such as carbon-carbon single bonds with a bond length of less than 1.1 Å. Therefore, before any work can be done, a realistic model of the protein must be constructed. This operation involves several steps, and great care must be taken to ensure that the resulting model is as good as possible.
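The kind of geometric error described above, a carbon-carbon distance far shorter than any real C-C bond, can be screened for before a structure is ever given to MOPAC. The following is a minimal Python sketch of such a check, not MOPAC's own error detection. It assumes the input follows the fixed-column PDB layout for ATOM and HETATM records. The function names and the 1.25 Å cutoff are illustrative assumptions, not values taken from MOPAC. The all-pairs loop is simple rather than fast.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    label: str                       # e.g. "ALA1:CA" (residue name, number, atom name)
    element: str                     # upper-case element symbol
    xyz: Tuple[float, float, float]  # Cartesian coordinates in Angstroms


def parse_pdb_atoms(pdb_text: str) -> List[Atom]:
    """Read ATOM/HETATM records using the fixed-column PDB layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()       # columns 13-16: atom name
        residue = line[17:20].strip()    # columns 18-20: residue name
        number = line[22:26].strip()     # columns 23-26: residue number
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        element = line[76:78].strip().upper()  # columns 77-78: element
        if not element:
            # Crude fallback (an assumption): first letter of the atom name.
            letters = [c for c in name if c.isalpha()]
            element = letters[0].upper() if letters else ""
        atoms.append(Atom(f"{residue}{number}:{name}", element, xyz))
    return atoms


def find_short_cc_distances(atoms: List[Atom],
                            threshold: float = 1.25
                            ) -> List[Tuple[str, str, float]]:
    """Return carbon-carbon pairs closer than `threshold` Angstroms.

    The default threshold is an illustrative assumption: it lies below
    the usual C=C and aromatic C-C bond lengths, so any hit is suspect.
    """
    carbons = [a for a in atoms if a.element == "C"]
    hits = []
    for i, first in enumerate(carbons):
        for second in carbons[i + 1:]:
            distance = math.dist(first.xyz, second.xyz)
            if distance < threshold:
                hits.append((first.label, second.label, distance))
    return hits
```

Given the text of a PDB file, `find_short_cc_distances(parse_pdb_atoms(text))` returns the suspicious carbon-carbon pairs with their separations. An empty list means no pair fell below the chosen cutoff. It does not mean the structure is free of the other kinds of error mentioned above.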
MOPAC contains many tools for use with proteins:
Installing MOPAC2016: How to get the program, install, and activate it
Running a simple job: Constructing a data set for formaldehyde, running it, analyzing the results. Some nomenclature (jobs, calculations, etc.)
Getting a starting protein structure: The PDB, what to look for and what to watch out for
Graphical User Interfaces: What to look for
Preparing a starting data set: Editing the PDB file, resolving disorder and steps involved in adding hydrogen atoms
Solvation: Use solvation - it makes the model more realistic
Resequencing: What to watch out for
Running a 1SCF calculation: Issues involved in
Correcting errors in the X-ray structure: Improving on the X-ray structure
Unconstrained optimization: Generating the starting point for modeling protein chemistry
Check that the Starting Model is valid: Examine the active site
Editing the Starting Model to make small changes: Changing -COOH to -CONH2 or -CH2OH to -CH3
Choosing a format: MOPAC or PDB: Issues and considerations
Make a backup copy of the starting model: Avoid the risk of losing a lot of hard work
Worked example: Chymotrypsin: Complete catalytic cycle
Constructing a Chymotrypsin Starting Model: The minimum energy structure is best
Making reactants and products: Techniques for generating intermediates
Locating and Refining Transition States in Proteins: Generating transition state geometries
Verifying transition states: Show that there is exactly one imaginary force constant
Intrinsic Reaction Coordinates: Show that the imaginary mode connects reactants and products