(Modeling proteins)

Getting a starting protein structure

There is only one good source of protein structures: the Protein Data Bank, the PDB.  It is a collection of over 80,000 structures, covering almost everything you could want.  Don't bother looking anywhere else for protein structures; you would just be wasting your time.  The only exception is if you want to work on a new structure that has not yet been deposited in the PDB.

Each structure is assigned a unique four-character alphanumeric identifier, e.g., 1EJG for the small protein crambin.  Associated with the identifier is a file, the PDB file, that contains all the important information on the protein: who did the work, where and when it was published, details about the structure, and the structure itself.  All of this is in a rigidly defined format, the PDB format.  MOPAC can read the PDB format directly, but because a PDB file does not contain any keywords that MOPAC could recognize, running an unmodified PDB file would only add hydrogen atoms, print out the geometry in MOPAC format, and stop.
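
As an illustration of how an identifier such as 1EJG maps onto a PDB file, the following Python sketch builds a download address for an entry and pulls the identifier, deposition date, and title out of the file's header records.  It is not part of MOPAC.  The address pattern on the RCSB file server, the function names (pdb_url, fetch_pdb, summarize), and the rule that an identifier is a digit followed by three letters or digits are assumptions made for this example; the column positions follow the fixed-column layout of the HEADER and TITLE records in the PDB format.

```python
import re
import urllib.request

# Assumption: the RCSB file server serves PDB-format entries at this address.
RCSB_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id: str) -> str:
    """Return the assumed download address for a four-character PDB identifier."""
    pdb_id = pdb_id.strip().upper()
    # Assumption: an identifier is one digit followed by three letters or digits.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a four-character PDB identifier: {pdb_id!r}")
    return RCSB_URL.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id: str) -> str:
    """Download the PDB file as text (needs network access)."""
    with urllib.request.urlopen(pdb_url(pdb_id)) as response:
        return response.read().decode("ascii", errors="replace")


def summarize(pdb_text: str) -> dict:
    """Collect identifier, deposition date, and title from the header records."""
    info = {"id": None, "date": None, "title": ""}
    for line in pdb_text.splitlines():
        if line.startswith("HEADER"):
            info["date"] = line[50:59].strip()   # columns 51-59
            info["id"] = line[62:66].strip()     # columns 63-66
        elif line.startswith("TITLE"):
            # Columns 11-80 hold the title; continuation lines are appended.
            part = line[10:80].strip()
            info["title"] = (info["title"] + " " + part).strip()
    return info
```

A typical use would be summarize(fetch_pdb("1EJG")), which gives a quick check that the file is the entry you intended before preparing it for MOPAC.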

To help identify the PDB file you want to work with, here are some suggestions: