Matrix free wave function optimization #84
+1,352
−624
For the wave functions WaveFunctionUPS and WaveFunctionSAUPS, the memory requirements have been reduced from O(N^2) to O(N), where N is the number of determinants.
This is NOT the case for WaveFunctionUCC, which still scales as O(N^2) in memory, because its run time increased substantially when the O(N) algorithm was used.
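As a toy illustration of the O(N^2) versus O(N) distinction (not the operators or code used in this PR), the sketch below applies the same symmetric tridiagonal operator to a vector in two ways: by building the full N x N matrix first, and matrix-free, touching only the vector. The operator, `apply_dense`, and `apply_matrix_free` are hypothetical names chosen for this example.

```python
def apply_dense(diag, coupling, vec):
    """Build the full N x N matrix (O(N^2) memory), then multiply."""
    n = len(vec)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = diag[i]
    for i in range(n - 1):
        matrix[i][i + 1] = coupling
        matrix[i + 1][i] = coupling
    return [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]


def apply_matrix_free(diag, coupling, vec):
    """Apply the same operator without storing it (O(N) memory)."""
    n = len(vec)
    # Diagonal contribution.
    out = [diag[i] * vec[i] for i in range(n)]
    # Nearest-neighbour coupling, accumulated element by element.
    for i in range(n - 1):
        out[i] += coupling * vec[i + 1]
        out[i + 1] += coupling * vec[i]
    return out


if __name__ == "__main__":
    diag = [1.0, 2.0, 3.0]
    vec = [1.0, 0.0, -1.0]
    print(apply_dense(diag, 0.5, vec))
    print(apply_matrix_free(diag, 0.5, vec))
```

Both functions return the same vector. Only the first needs storage that grows quadratically with the vector length.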
A very rough estimate for the old and new memory requirements is,
The actual memory usage will of course be some multiple of the above: if an algorithm requires several copies of the state vector (the memory bottleneck in the new implementation), the memory requirement is approximately that number of copies times the table value for the new implementation.
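A back-of-the-envelope version of this kind of estimate can be sketched as follows. It assumes double-precision (8-byte) real amplitudes, an even split of the active electrons into alpha and beta, and no symmetry reduction of the determinant space. The function names and the `n_copies` parameter are hypothetical and are not part of the code in this PR.

```python
from math import comb

BYTES_PER_FLOAT64 = 8  # assumption: real double-precision amplitudes


def num_determinants(n_active_elec, n_active_orbs):
    """Determinant count for an active space, assuming no symmetry reduction."""
    n_alpha = n_active_elec // 2
    n_beta = n_active_elec - n_alpha
    return comb(n_active_orbs, n_alpha) * comb(n_active_orbs, n_beta)


def dense_matrix_bytes(n_det):
    """Memory for one N x N matrix: the O(N^2) regime."""
    return n_det * n_det * BYTES_PER_FLOAT64


def state_vector_bytes(n_det, n_copies=1):
    """Memory for n_copies state vectors of length N: the O(N) regime."""
    return n_copies * n_det * BYTES_PER_FLOAT64


if __name__ == "__main__":
    n_det = num_determinants(12, 12)
    print(f"determinants:       {n_det}")
    print(f"one N x N matrix:   {dense_matrix_bytes(n_det) / 1e9:.1f} GB")
    print(f"five state vectors: {state_vector_bytes(n_det, 5) / 1e6:.1f} MB")
```

The `n_copies` argument plays the role of the factor described above: the estimate for the new implementation is scaled by however many state-vector copies a given algorithm holds at once.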
This is to say that memory usage is no longer a critical bottleneck for UPS and SAUPS.
The UPS wave function optimization is also faster now.
For chains of hydrogen atoms with 1-SA-fUpCCGSD the run times are the following,
The speed-up grows with system size.
The run time depends strongly on how dense the state vector of the system is; for O2 6-31G (12,12), one iteration takes 800 s.
The WaveFunctionUCC run-time might be largely unaffected by this new structure of the code.
UCCSD H6 STO-3G had a timing of 2.1 s per iteration in the old implementation and now has 2.0 s per iteration.