RESIDUES

In protein work, each atom is normally labeled using information supplied by the PDB file. An alternative is to use keyword RESIDUES, which causes the PDB-style labels to be worked out using only the topology of the system. Modified residues can still be recognized if XENO=text is used. Unless other keywords such as CHAINS and START_RES are present, or the input file is already in PDB format, the residue nearest to the NH₂ end of the protein will be No. 1 in chain A, the next will be No. 2, and so on, and the different chains will be labeled A, then B, C, D, etc. If the input file is in PDB format, not just in MOPAC format with PDB information, the residue numbers and chain letters will be worked out from the PDB file.

Keyword RESIDUES uses part of MOZYME, so when the job starts it uses the MOZYME option. If keyword MOZYME is not present, then the job will only be allowed to run if no SCF calculations are possible. This is to prevent a MOZYME calculation being done unless keyword MOZYME is present. Jobs that do not involve SCF calculations use one or more of the keywords 0SCF, LEWIS, CHARGES, ADD-H, SITE, or RESEQ.
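The rule in this paragraph can be sketched as a small check on the keyword line. This is only an illustration of the logic as described here, not MOPAC's own parser: the function names `keyword_names` and `residues_job_may_run` are hypothetical, and the tokenizer is a simplification that splits on blanks and ignores quoted values.

```python
# Keywords listed in the text as belonging to jobs that do no SCF calculations.
NO_SCF_KEYWORDS = {"0SCF", "LEWIS", "CHARGES", "ADD-H", "SITE", "RESEQ"}


def keyword_names(keyword_line):
    """Return the upper-case keyword names found on a keyword line.

    Simplified (hypothetical) tokenizer: splits on whitespace and drops
    any '=value' or '(...)' part. Quoted values containing blanks are
    not handled.
    """
    names = set()
    for token in keyword_line.upper().split():
        for separator in ("=", "("):
            token = token.split(separator, 1)[0]
        names.add(token)
    return names


def residues_job_may_run(keyword_line):
    """Apply the rule from the text to a line that already contains RESIDUES.

    The job may run if MOZYME is present, or if at least one of the
    no-SCF keywords is present.
    """
    names = keyword_names(keyword_line)
    if "MOZYME" in names:
        return True
    return bool(names & NO_SCF_KEYWORDS)
```

For example, `residues_job_may_run("RESIDUES 0SCF")` is true, while `residues_job_may_run("RESIDUES")` is false because an SCF calculation would be attempted without MOZYME being requested.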

Keyword RESIDUES converts the data-set into almost standard PDB format by making the minimum change to the existing format. This means that:

* The residue numbers and letters will be preserved.
* Atoms will not be re-arranged. To put the atoms into standard PDB sequence, use RESEQ.
* If a heterogroup is covalently bonded to an existing residue, the residue will not be correctly recognized. To correctly recognize modified residues, use CVB to remove the covalent bond joining the heterogroup to the residue. Then, if necessary, use XENO=text to re-name the residue and to name the heterogroup.
* Heterogroups are labeled using a simple numbering system. When more than one atom of an element is present, the atoms will be numbered sequentially, i.e., C1, C2, ..., C45, C46. When this convention is not wanted, for example if heteroatoms are labeled C2', P', etc. (as in saccharides), then edit the new geometry by cutting and pasting in the original, better, labeling system. Do not correct errors in atom serial numbers; these will be automatically corrected by MOPAC in future jobs.
* All atoms will be given a chain-letter. This includes water molecules, ligands, etc.
* Atom serial numbers correspond to atom numbers plus any terminators (TER's). Although interesting, these numbers are re-calculated when any future jobs are run. That is, atom serial numbers are not important in any MOPAC calculations.
* Many PDB files use non-standard atom labels. If the label is highly unusual the new label might be incorrect.
* Every atom should have a unique label. This is important for jobs that use GEO_REF.
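
Because unique labels matter for GEO_REF, it can be useful to check the re-labeled file for duplicates before going further. The following is a minimal sketch that assumes the new data-set was written in standard fixed-column PDB format; the helper names `atom_label` and `duplicate_labels` are hypothetical, and treating the label as chain letter, residue number, residue name, and atom name together is an assumption made for this example.

```python
from collections import Counter


def atom_label(line):
    """Extract (chain, residue number, residue name, atom name) from one
    ATOM/HETATM line, using the standard PDB column positions."""
    chain = line[21:22]              # column 22
    res_number = line[22:26].strip() # columns 23-26
    res_name = line[17:20].strip()   # columns 18-20
    atom_name = line[12:16].strip()  # columns 13-16
    return (chain, res_number, res_name, atom_name)


def duplicate_labels(pdb_lines):
    """Return a sorted list of labels that occur on more than one atom."""
    counts = Counter(
        atom_label(line)
        for line in pdb_lines
        if line.startswith(("ATOM", "HETATM"))
    )
    return sorted(label for label, n in counts.items() if n > 1)
```

An empty list means that no two atoms share a label; any entries returned point to atoms whose labels would need to be edited by hand.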

RESIDUES0 - preserves the original atom labels

Good practice is to use RESIDUES to assign PDB labels to atoms, and to not do any other operations in the same job, i.e., do not resequence the atoms or do any calculations. After the RESIDUES job finishes, examine both the output and the new data-set(s) to check that the re-labeling operation was done correctly. In the original PDB data-set, some atoms might have unusual labels, for example, a sugar backbone atom might have a label such as " C3' ". To keep the original atom labels, use RESIDUES0 (the word "RESIDUES" followed by a zero) instead of RESIDUES. If the starting geometry was already in PDB format, and the re-labeling is partially incorrect, then copy and paste the relevant parts of the correctly re-labeled file into the original data-set.

When individual amino acids are mutated, RESIDUES can be used to re-label the mutated sites.

If RESIDUES does not work when other keywords are present, run RESIDUES with keyword 0SCF in a job on its own, then use the results for the job you want.

The most common use for RESIDUES is to allow the residue sequence reported in the PDB file to be compared to the sequence defined by the geometry, i.e., by the topology, of the system. By comparing reported and actual sequences, possible problems in the PDB file can be detected.
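
The comparison described above can be sketched as follows. This is an illustration only: it assumes that both sequences have already been extracted as lists of three-letter residue names, and the function name `sequence_mismatches` is hypothetical. The comparison is position by position with no alignment, so a single missing residue will make every later position appear to differ.

```python
from itertools import zip_longest


def sequence_mismatches(reported, from_topology):
    """Compare two residue sequences position by position.

    Returns a list of (position, reported_name, topology_name) tuples,
    with positions counted from 1. A name of None means that one
    sequence is shorter than the other at that position.
    """
    mismatches = []
    pairs = zip_longest(reported, from_topology, fillvalue=None)
    for position, (name_reported, name_found) in enumerate(pairs, start=1):
        if name_reported != name_found:
            mismatches.append((position, name_reported, name_found))
    return mismatches
```

For example, comparing `["MET", "GLY", "SER", "ALA"]` with `["MET", "GLY", "THR"]` returns `[(3, "SER", "THR"), (4, "ALA", None)]`, which flags one substituted residue and one residue absent from the geometry.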

Table:
Abbreviations for the 20 Amino Acids

Amino Acid	Formula of Residue $^{\dagger}$	Three-Letter Abbreviation	One-Letter Abbreviation
Glycine	C₂NOH₃	GLY	G
Alanine	C₃NOH₅	ALA	A
Valine	C₅NOH₉	VAL	V
Leucine	C₆NOH₁₁	LEU	L
Isoleucine	C₆NOH₁₁	ILE	I
Serine	C₃NO₂H₅	SER	S
Threonine	C₄NO₂H₇	THR	T
Aspartic acid	C₄NO₃H₅ (4)	ASP	D
Asparagine	C₄N₂O₂H₆	ASN	N
Lysine	C₆N₂OH₁₂ (13)	LYS	K
Glutamic acid	C₅NO₃H₇ (6)	GLU	E
Glutamine	C₅N₂O₂H₈	GLN	Q
Arginine	C₆N₄OH₁₂ (13)	ARG	R
Histidine	C₆N₃OH₇ (8)	HIS	H
Phenylalanine	C₉NOH₉	PHE	F
Cysteine	C₃NOSH₅ (4)	CYS	C
Tryptophan	C₁₁N₂OH₁₀	TRP	W
Tyrosine	C₉NO₂H₉ (8)	TYR	Y
Methionine	C₅NOSH₉	MET	M
Proline	C₅NOH₇	PRO	P

$^{\dagger}$: The number of hydrogen atoms in the ionized residue is given in parentheses after the formula. Cysteine may exist in the neutral, ionized or reduced form.
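
When sequences are handled in scripts, the abbreviations in the table are conveniently held as a lookup. The sketch below simply transcribes the table; the names `THREE_TO_ONE` and `to_one_letter` are hypothetical, and using "X" for a residue name that is not in the table is an assumption made for this example.

```python
# Three-letter to one-letter abbreviations, transcribed from the table.
THREE_TO_ONE = {
    "GLY": "G", "ALA": "A", "VAL": "V", "LEU": "L", "ILE": "I",
    "SER": "S", "THR": "T", "ASP": "D", "ASN": "N", "LYS": "K",
    "GLU": "E", "GLN": "Q", "ARG": "R", "HIS": "H", "PHE": "F",
    "CYS": "C", "TRP": "W", "TYR": "Y", "MET": "M", "PRO": "P",
}


def to_one_letter(residue_names, unknown="X"):
    """Convert a list of three-letter residue names to a one-letter string.

    Names not in the table (heterogroups, modified residues) are written
    as `unknown`; that placeholder is an assumption of this sketch.
    """
    return "".join(
        THREE_TO_ONE.get(name.upper(), unknown) for name in residue_names
    )
```

For example, `to_one_letter(["GLY", "ALA", "VAL"])` gives `"GAV"`.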

`RESIDUES[0]`
