This description is based on materials provided by Prof Gerd B. Rocha, Universidade Federal da Paraíba, João Pessoa, Brazil.

This section describes the modifications to the semiempirical quantum chemistry code MOPAC2016 that accelerate single-point energy calculations (1SCF) of medium-sized molecular systems (up to 4000 atoms) on multithreaded shared-memory CPUs.

Computational aspects

The modifications use highly optimized linear algebra routines (LAPACK and BLAS, as implemented in the Intel Math Kernel Library, MKL) to reduce the computation time of parts of MOPAC such as the pseudodiagonalization, the full diagonalization, and the density matrix assembly.

The SCF procedure in MOPAC is driven by the ITER subroutine, which is called exclusively by the COMPFG subroutine. The most time-consuming steps in conventional semiempirical calculations are the Fock matrix diagonalization and the density matrix assembly, since these are *O(N³)* procedures, *N* being the number of basis functions. The remaining procedures carried out in the SCF, such as the assembly of the Fock matrix, have *O(N²)* complexity. Most of our effort therefore went into replacing these subroutine calls with more highly optimized equivalents from MKL.

The pseudodiagonalization in MOPAC is carried out in subroutine DIAG. Its algorithm starts by forming the occupied-virtual block of the Fock matrix, $F_{o-v}$, in the molecular orbital basis of the previous iteration:

$$F_{o-v} = C_o^{T} F C_v$$

where $C_o$ is the occupied-orbital sub-matrix and $C_v$ is the virtual-orbital sub-matrix.

Once $F_{o-v}$ is built, the second part of DIAG performs a series of 2x2 vector rotations (unitary transformations) that approximately eliminate the elements of $F_{o-v}$:

$$C_i^{new} = cC_i - sC_a \quad \text{and} \quad C_a^{new} = sC_i + cC_a$$

where $s$ is $F_{ia}$ divided by the difference between $\varepsilon_i$ and $\varepsilon_a$, the eigenvalues of the occupied orbital $i$ and the virtual orbital $a$, and $c$ follows from the normalization condition $c^2 + s^2 = 1$.
To speed up the pseudodiagonalization, we replaced both matrix multiplications, as well as the Jacobi rotation step, with more highly optimized MKL routines (DGEMM for the multiplications). We also replaced the full diagonalization subroutine, which used routines from the EISPACK 3 library, with Intel MKL's DSYEVD. In the density matrix assembly step, we used DSYRK instead of DGEMM, since the product matrix is symmetric.
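The reason a symmetric rank-k update suits the density matrix can be shown with a short sketch. The function name `density_lower_then_mirror` and the occupancy factor of 2 (the usual closed-shell RHF convention, $P = 2C_oC_o^{T}$) are assumptions made for illustration, not taken from the MOPAC source. The point is only that, because $P$ is symmetric, one triangle needs to be computed, which is roughly half the work of a general matrix product.

```python
def density_lower_then_mirror(c_occ, occupancy=2.0):
    """Build P = occupancy * C_o C_o^T from its lower triangle only.

    c_occ is stored as a list of rows: c_occ[mu][i] is the coefficient of
    basis function mu in occupied orbital i.
    """
    n = len(c_occ)
    p = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(mu + 1):  # lower triangle, including the diagonal
            value = occupancy * sum(x * y for x, y in zip(c_occ[mu], c_occ[nu]))
            p[mu][nu] = value
            p[nu][mu] = value  # mirrored here for convenience
    return p


# Toy example: one doubly occupied orbital spread evenly over two basis functions.
h = 0.5 ** 0.5
density = density_lower_then_mirror([[h], [h]])
```

A BLAS DSYRK call writes only the requested triangle (upper or lower) of the result; the mirroring step above is added so that the sketch returns a full matrix.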

Benchmarks

The table below gives times, in seconds, for a single-point calculation run on a Mac Pro with two 2.93 GHz 6-core Intel Xeon processors and 16 GB of 1333 MHz DDR3 memory, using 12 of the 24 available threads.

| Species | # atoms | Current MOPAC2016 | MOPAC with MKL | MOPAC2016 with MKL and multi-threading |
| --- | --- | --- | --- | --- |
| Naphthalene | 18 | 0.02 | 0.15 | 0.02 |
| Crambin | 642 | 468 | 114 | 12 |
| (H | 1719 | 3,082 | 584 | 79 |
| 1G6X | 1455 | 8,612 | 1,240 | 142 |
| 1EZG (Thermal Hysteresis Protein) | 2064 | 22,959 | 2,118 | 300 |
| 1RNB (Barnase) | 2066 | 34,372 | 4,108 | 411 |
| Bacteriorhodopsin (1BRX) | 3352 | 141,773 | 11,192 | 1,394 |
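To make the trend in these timings easier to read, the following sketch converts a subset of the rows into speedup factors relative to the current MOPAC2016 code. The timings are copied from the table; the dictionary layout and the function name `speedups` are illustrative choices.

```python
# (atoms, current MOPAC2016, with MKL, with MKL and multi-threading), times in seconds
timings = {
    "Naphthalene": (18, 0.02, 0.15, 0.02),
    "Crambin": (642, 468, 114, 12),
    "1G6X": (1455, 8612, 1240, 142),
    "1EZG": (2064, 22959, 2118, 300),
    "1RNB": (2066, 34372, 4108, 411),
    "1BRX": (3352, 141773, 11192, 1394),
}


def speedups(row):
    """Return (speedup with MKL, speedup with MKL and multi-threading)."""
    _atoms, original, mkl, mkl_threads = row
    return original / mkl, original / mkl_threads


for name, row in timings.items():
    with_mkl, with_threads = speedups(row)
    print(f"{name:12s} {row[0]:5d} atoms  MKL: {with_mkl:6.1f}x  MKL+threads: {with_threads:6.1f}x")
```

For naphthalene the multi-threaded MKL build is no faster than the current code, whereas for bacteriorhodopsin it is roughly 100 times faster, so the benefit grows with system size.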

Additional details

In MOPAC, proteins can be modeled using the MOZYME technique; however, that technique is limited to closed-shell RHF calculations. This means that proteins with free-radical sites, proteins in excited states, and proteins containing iron, chromium, or other transition metal atoms should not be modeled using MOZYME.

The modifications described here reduce the computation time enough that conventional MOPAC methods, such as RHF-CI and UHF, are now practical for most of the smaller proteins. By implication, it will be possible to model many systems that currently cannot or should not be modeled using MOZYME.

All technical details of the modifications to the MOPAC2016 code, as well as some results, benchmarks, and an application to biomolecules, can be found in the publication:

Maia, J. D. C.; Urquiza Carvalho, G. A.; Mangueira, C. P.; Santana, S. R.; Cabral, L. A. F.; Rocha, G. B., "GPU Linear Algebra Libraries and GPGPU Programming for Accelerating MOPAC Semiempirical Quantum Chemistry Calculations" Journal of Chemical Theory and Computation 2012, 8, 3072–3081. See: http://pubs.acs.org/doi/abs/10.1021/ct3004645

This publication also describes the GPU-accelerated version of MOPAC that was released in January 2014. For details of the effect of using a GPU, see GPU.