• Keine Ergebnisse gefunden

We have implemented the DLPNO-MP2 response density calculation in the ORCA pack-age,221 closely following the algorithm for the DLPNO-MP2 gradient.192 The code is structured as follows:

1. First the SCF equations are solved.

2. Then, the DLPNO-MP2 energy and relaxed density calculation is performed. The PNO coefficientsdij and amplitudesTij are kept on disk, as well as the list of active orbital pairs and the domain information.

3. Next, the SCF-level property calculation is performed and the CPSCF solution Uaiλ is stored.

4. The remaining blocks of Uλ are calculated, according to eqs. 3.85, 3.87, 3.88, and 3.93. Fλ and cλ are completed, as well as Peλ.

5. The 3-index integrals (i˜µ0|K) and (iµ˜0|K)λ are calculated and stored.

6. In a loop over orbital pairs, the CPNO coefficients d0ij are reconstructed, and ^Tij and _Tij,λ are calculated according to eqs.3.16 and3.99. θij,λ are computed and the perturbed PNO coefficients dij,λ are stored on disk.

7. The perturbed PNO amplitude equations, 3.103 are solved iteratively, separately for each perturbation.

8. The contributions to the response density matrix fromDij,Dij, and their derivatives is calculated.

9. The PNO-dependent terms are evaluated in a loop over orbital pairs:

9.1. The full d00ij,T^ij, d00ij,λ, and T^ij,λ are reconstructed.

9.2. The intermediates τij and τij,λ are calculated.

9.3. eqs. 3.60 and 3.111 are solved to obtain vij and vij,λ.

9.4. eqs. 3.58 and 3.109 are solved to obtain wij and wij,λ. Their contributions to D0o, D0ij, and their derivatives are evaluated.

9.5. Gi(ij), Gj(ij), Gi(ij),λ, and Gj(ij),λ and their contributions to ΓK, ξij, Yγ, and their derivatives are computed.

9.6. Terms related to ξij and τij and their derivatives are added to Yδ and Yδ,λ. 10. ΓK and ΓK,λ are read from disk and their contributions to Lλij,Lλmj, Yα,λ, Yβ, and

Yβ,λ are evaluated in the AO basis by generating the required 3-index integrals on the fly.

11. The prescreening contributions to Lλij, Lλmj,Lλia, and D0 are calculated.

12. Lλmj is completed and the Z-CV eqs. 3.116 are solved to obtain zmjλ .

13. Lλij is completed and the Z-CPL eqs. 3.113 are solved to obtain zijλ. In our imple-mentation a conjugate gradient solver is used.

14. Lλia is completed and the Z-CPSCF eqs. 3.72 are solved to obtainziaλ. This is most conveniently done in the CMO basis using a standard CPSCF solver.

15. The full response density Dλ is assembled and the DLPNO-MP2-level properties are calculated.

As the goal here is the calculation of field-response properties, the algorithm has been optimized for a small number of perturbations. The loop over the three Cartesian compo-nents of the external field is the innermost one in most cases, such that the unperturbed and a single perturbed quantity are kept in memory at any given time. In some situations, e.g. step 9.2., it is faster to calculate all perturbed quantities at the same time, at the cost of higher memory usage. Likewise, all intermediates which are stored on disk – such as (iµ˜0|K)λ, dij,λ, and Tij,λ – are kept for all perturbations simultaneously. For these reasons, the implemented algorithm is not suitable for a number of perturbations that grows with the system size, e.g. nuclear Hessians.

It is also worth pointing out that the NPAO domains and the SC amplitudes are calculated four times in total – in steps 2. (twice), 6., and 9.1. – while their response is calculated twice – in steps 6., and 9.1.. This is not an insignificant computational overhead, as shown in Section3.4.4, but is done intentionally to avoid the storage and I/O of quantities in the full NPAO domains, which are much larger than the PNO domains.

Taking the vancomycin example from that section, with 22 265 kept pairs and average NPAO and PNO domain sizes of 1007 and 22, respectively: about 1.2 TB of disk space was required in total, of which the PNO coefficients (dij and dij,B) were about 12.8 GB and the PNO amplitudes (Tij andTij,λ) – about 960 MB. A back-of-the-envelope estimate suggests that if the NPAO coefficients (d00ij andd00ij,B) and SC amplitudes (T_ij andT_ij,λ) were stored in addition, a further 2.5 TB of disk space would be required just for those quantities. In the end, less than 20 % of the total computation time would be saved, at the cost of additional I/O overhead.

The asymptotic scaling behavior of the DLPNO-MP2 gradient algorithm was discussed in ref. 192and that discussion applies here as well. Briefly, the number of orbital pairs that survive the dipole-based prescreening scales linearly with system size in the asymptotic limit, while the PAO correlation domains reach a finite size for large systems. This allows for asymptotic linear scaling of most major steps in the algorithm, provided sparsity is properly exploited using the sparse maps infrastructure introduced in ref. 144. In particular, the RI integral transformation, PNO generation, and iterative solution of the amplitude equations are asymptotically linear scaling. Two-electron contributions coming from the Fock matrix, e.g. in the CPSCF and Z-CPSCF equations, can be calculated with effective quadratic scaling using the RIJCOSX approximation.113 The calculation of terms involving (K|ip) integrals in the Z-CPSCF equations (eq.3.77) right-hand sides asymptotically scales as O(N2). Together with the (iµ˜0|K) contributions (eq. 3.78) these terms take about 10–15% of the computation time for the largest systems discussed here.

Contributions from the screened-out pairs scale quadratically, while localization, CPL, and Z-CPL equations scale as O(N3) but these steps constitute a very small part of the overall computation time.

Alongside MP2, the implementation can be used for DHDFT – the relevant modi-fications are identical to the RI-MP2 case, discussed in Section 2.1.4. Spin-component scaling is also implemented.51 Implicit solvent effects can be included using CPCM,245,246 as discussed in Section 2.1.5