PDB files normally begin with a block of descriptive text: lines starting with words such as "HEADER", "TITLE", and "REMARK". After the descriptive text comes a block of lines that specify the atoms and their positions. The first line of this block starts with the text "ATOM" or "HETATM".
The simplest way to convert a PDB file into a MOPAC data set is to delete all the descriptive text in the PDB file and replace it with the three lines that normally start a MOPAC data set (a keyword line followed by two lines of title and comment text). For example, given a PDB file like the following:
HEADER    DNA-BINDING PROTEIN                     23-FEB-98
TITLE     THE CRYSTAL STRUCTURE OF DPS, A FERRITIN HOMOLOG THAT BINDS
TITLE    2 AND PROTECTS DNA
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: DPS;
COMPND   3 CHAIN: A, B, C, D, E, F, G, H, I, J, K, L;
SOURCE    MOL_ID: 1;
(lots of lines deleted)
REMARK   1
REMARK   2
REMARK   2 RESOLUTION. 1.60 ANGSTROMS.
REMARK   3
REMARK   3 REFINEMENT.
REMARK   3   PROGRAM     : X-PLOR 3.851
ATOM      1  N   SER A   9      24.406  72.250  57.799  1.00 29.89           N
ATOM      2  CA  SER A   9      25.023  73.353  58.534  1.00 30.04           C
ATOM      3  C   SER A   9      25.215  74.573  57.639  1.00 30.09           C
ATOM      4  O   SER A   9      25.784  74.455  56.546  1.00 30.03           O
ATOM      5  CB  SER A   9      26.394  72.928  59.070  1.00 30.33           C
ATOM      6  OG  SER A   9      27.109  74.019  59.642  1.00 30.46           O
ATOM      7  N   LYS A  10      24.750  75.744  58.086  1.00 30.02           N

The MOPAC data set would look like this:
xyz 0scf
Convert a PDB file into a normal MOPAC file, and preserve the geometry
This line should hold comments.
ATOM      1  N   SER A   9      24.406  72.250  57.799  1.00 29.89           N
ATOM      2  CA  SER A   9      25.023  73.353  58.534  1.00 30.04           C
ATOM      3  C   SER A   9      25.215  74.573  57.639  1.00 30.09           C
ATOM      4  O   SER A   9      25.784  74.455  56.546  1.00 30.03           O
ATOM      5  CB  SER A   9      26.394  72.928  59.070  1.00 30.33           C
ATOM      6  OG  SER A   9      27.109  74.019  59.642  1.00 30.46           O
ATOM      7  N   LYS A  10      24.750  75.744  58.086  1.00 30.02           N
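The manual edit described above can also be scripted. The following is a minimal Python sketch, not part of MOPAC: the function name `pdb_to_mopac` and its default title text are hypothetical, and it assumes that everything before the first "ATOM" or "HETATM" line is descriptive text.

```python
def pdb_to_mopac(pdb_text, keywords="xyz 0scf",
                 title="Converted from a PDB file", comment=""):
    """Return a MOPAC data set built from the text of a PDB file.

    Hypothetical helper: it only automates the edit described in the
    text (drop the descriptive block, prepend three lines).
    """
    lines = pdb_text.splitlines()
    # Locate the first coordinate record; everything before it is
    # treated as descriptive text and discarded.
    start = next(
        (i for i, line in enumerate(lines)
         if line.startswith(("ATOM", "HETATM"))),
        None,
    )
    if start is None:
        raise ValueError("no ATOM or HETATM records found")
    # The three opening lines of the data set: keywords, title, comment.
    header = [keywords, title, comment]
    # Lines after the first coordinate record are passed through unchanged.
    return "\n".join(header + lines[start:]) + "\n"
```

A typical use would be to read the PDB file into a string, call `pdb_to_mopac(text)`, and write the result to a new file for MOPAC to read. The keywords are a parameter so that, for example, "1SCF" or "NOOPT" can be supplied instead.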
By default, geometries are flagged for optimization. To prevent this,
use keywords such as 1SCF or
NOOPT. Most PDB files do not include hydrogen atoms; if they
are missing, they must be added before any MOPAC work is done.
Don't worry if the job fails immediately: simply read the error message and
follow the advice given.
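Whether a PDB file already contains hydrogen atoms can be checked before a job is submitted. The sketch below is a rough Python illustration under stated assumptions: the function name `count_hydrogens` is hypothetical, the element symbol is taken from columns 77-78 of each "ATOM" or "HETATM" record as in the standard PDB layout, and when that field is absent the first letter of the atom name is used as a heuristic that can misclassify some atoms.

```python
def count_hydrogens(pdb_text):
    """Count ATOM/HETATM records that appear to be hydrogen atoms."""
    count = 0
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Element symbol: columns 77-78 in the standard PDB layout.
        element = line[76:78].strip().upper()
        if not element:
            # Heuristic fallback: atom name in columns 13-16, with any
            # leading digits removed (e.g. "1HG " is read as hydrogen).
            name = line[12:16].strip().lstrip("0123456789")
            element = name[:1].upper()
        if element == "H":
            count += 1
    return count
```

A result of zero suggests that hydrogen atoms still need to be added before the geometry is used in MOPAC.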
Another common option is to use PDBOUT, in which
case the descriptive text should
be included in the data set, but with every line of descriptive text starting
with an asterisk. The MOPAC data set would look like this:
*HEADER    DNA-BINDING PROTEIN                     23-FEB-98
*TITLE     THE CRYSTAL STRUCTURE OF DPS, A FERRITIN HOMOLOG THAT BINDS
*TITLE    2 AND PROTECTS DNA
*COMPND    MOL_ID: 1;
*COMPND   2 MOLECULE: DPS;
*COMPND   3 CHAIN: A, B, C, D, E, F, G, H, I, J, K, L;
*SOURCE    MOL_ID: 1;
(lots of lines deleted)
xyz 0scf
Convert a PDB file into a normal MOPAC file, and preserve the geometry

ATOM      1  N   SER A   9      24.406  72.250  57.799  1.00 29.89           N
ATOM      2  CA  SER A   9      25.023  73.353  58.534  1.00 30.04           C
ATOM      3  C   SER A   9      25.215  74.573  57.639  1.00 30.09           C
ATOM      4  O   SER A   9      25.784  74.455  56.546  1.00 30.03           O
ATOM      5  CB  SER A   9      26.394  72.928  59.070  1.00 30.33           C
ATOM      6  OG  SER A   9      27.109  74.019  59.642  1.00 30.46           O
ATOM      7  N   LYS A  10      24.750  75.744  58.086  1.00 30.02           N
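The asterisk-prefixed form can be produced in the same scripted way. This is a minimal Python sketch under the same assumptions as the manual procedure: the function name `pdb_to_mopac_keep_text` is hypothetical, the descriptive text is taken to be every line before the first "ATOM" or "HETATM" record, and the caller supplies the keyword line (for instance one that includes PDBOUT).

```python
def pdb_to_mopac_keep_text(pdb_text, keywords, title, comment=""):
    """Build a MOPAC data set that keeps the PDB descriptive text.

    Hypothetical helper: each descriptive line is prefixed with an
    asterisk, then the keyword, title and comment lines are written,
    followed by the coordinate records.
    """
    lines = pdb_text.splitlines()
    # Index of the first coordinate record.
    start = next(
        (i for i, line in enumerate(lines)
         if line.startswith(("ATOM", "HETATM"))),
        None,
    )
    if start is None:
        raise ValueError("no ATOM or HETATM records found")
    # Prefix every line of descriptive text with an asterisk.
    starred = ["*" + line for line in lines[:start]]
    header = [keywords, title, comment]
    return "\n".join(starred + header + lines[start:]) + "\n"
```

For example, `pdb_to_mopac_keep_text(text, "xyz 0scf pdbout", "Geometry from a PDB file")` would place the starred descriptive lines first, then the three opening lines, then the atoms.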