Ideas for saving memory in Pathfinder #3323

SteveBronder · 2024-12-11T21:39:45Z

Summary:

In multi pathfinder with a large number of samples, we can end up making two large memory allocations when we only need one, or perhaps even none.

We make a large matrix containing all of the samples from the individual pathfinders and a std::vector of matrices to store the individual pathfinder samples. Each of these allocations is single_pathfinder_samples * num_pathfinders * num_params in size, which can be huge if you have 100K+ parameters.
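
To put a rough number on that, here is a minimal sketch of the arithmetic. The draw and path counts are made-up illustrative inputs (only the 100K parameter count comes from the text above), and `draw_bytes` is a hypothetical helper, not something in Stan.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for one dense block of double-precision draws.
// The three counts are illustrative inputs, not values taken from Stan.
constexpr std::uint64_t draw_bytes(std::uint64_t single_pathfinder_samples,
                                   std::uint64_t num_pathfinders,
                                   std::uint64_t num_params) {
  return single_pathfinder_samples * num_pathfinders * num_params
         * sizeof(double);
}

// Hypothetical run: 1,000 draws per path, 4 paths, 100,000 parameters.
constexpr std::uint64_t one_copy = draw_bytes(1000, 4, 100000);
// The combined matrix plus the std::vector of per-path matrices.
constexpr std::uint64_t two_copies = 2 * one_copy;
```

With 8-byte doubles that is about 3.2 GB for one copy of the draws and 6.4 GB for both, so dropping one of the two allocations halves the peak.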

I think there are two ways to reduce the memory usage. In short, one is to return only what we need to generate samples from the single pathfinders, and the other is to be clever and make only one large matrix.

1. When running single pathfinder from multi pathfinder, we return everything we need to generate samples (including the RNG state) along with the likelihoods, but not the actual samples. When we need the draws after doing PSIS etc., we can reset the RNG to the correct state and pass in the appropriate pathfinder parameters. This will require more computation, but should save a ton of memory.
2. Make one large matrix in multi pathfinder and make a parameter writer for each pathfinder that writes to chunks of the large matrix. So it would look something like the following:
   - Make one large matrix of size num_pathfinders * single_path_samples * parameters at the beginning of pathfinder.
   - Make a new writer that takes in a map to an Eigen matrix.
   - Instead of single pathfinder returning the samples, have them write the samples to the new writer.
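
A minimal sketch of the second option's writer idea, under some loud assumptions: a plain `std::vector<double>` stands in for the Eigen matrix (column-major, one column per draw), and `block_writer` and `make_writers` are hypothetical names, not Stan's writer interface. The point is only that each path gets a non-owning view of its own slice of a single allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical writer: a non-owning view of one path's slice of a
// column-major (num_params x total_samples) buffer.
class block_writer {
 public:
  block_writer(double* begin, std::size_t num_params, std::size_t num_samples)
      : begin_(begin), num_params_(num_params), num_samples_(num_samples) {}

  // Copy one draw into the next free column of this path's block.
  void operator()(const std::vector<double>& draw) {
    assert(draw.size() == num_params_ && next_col_ < num_samples_);
    double* col = begin_ + next_col_ * num_params_;
    for (std::size_t k = 0; k < num_params_; ++k) {
      col[k] = draw[k];
    }
    ++next_col_;
  }

  std::size_t columns_written() const { return next_col_; }

 private:
  double* begin_;
  std::size_t num_params_;
  std::size_t num_samples_;
  std::size_t next_col_ = 0;
};

// Build one writer per path over a single shared allocation.
inline std::vector<block_writer> make_writers(
    std::vector<double>& samples, std::size_t num_params,
    std::size_t single_pathfinder_samples, std::size_t num_pathfinders) {
  assert(samples.size()
         == num_params * single_pathfinder_samples * num_pathfinders);
  std::vector<block_writer> writers;
  writers.reserve(num_pathfinders);
  for (std::size_t i = 0; i < num_pathfinders; ++i) {
    // Path i owns columns [i * S, (i + 1) * S) of the shared buffer.
    double* block = samples.data() + i * single_pathfinder_samples * num_params;
    writers.emplace_back(block, num_params, single_pathfinder_samples);
  }
  return writers;
}
```

Because the writers hold raw pointers into `samples`, the buffer must not be resized while any writer is alive. With Eigen, the same view could presumably be an `Eigen::Map` over the matching column block.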

The main issue is what to do when a pathfinder fails. My current thought is to loop over the blocks of samples and, if a pathfinder failed, move the samples of the last successful pathfinder into that spot. So it would look something like the following:

Eigen::Index total_samples = single_pathfinder_samples * num_pathfinders;
Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> samples(num_params, total_samples);
std::vector<map_writer> map_writers;
for (int i = 0; i < num_pathfinders; i++) {
  map_writers.emplace_back(...); // setup pointers and map for each path here
}
// Call pathfinder
// ...
// Find bad pathfinders
std::vector<std::pair<bool, Eigen::Index>> pathfinders_start_idx;
for (int i = 0; i < num_pathfinders; i++) {
  // Could be emplaced directly as a pair, but split here for clarity
  auto [pathfinder_success, individ_sample_start_idx] = check_samples_for_nan(samples, 
    i, single_pathfinder_samples, num_pathfinders);
  pathfinders_start_idx.emplace_back(pathfinder_success, individ_sample_start_idx);
}
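// Move good pathfinders to earlier spots
for (std::size_t i = 0; i < pathfinders_start_idx.size(); ++i) {
  auto& path_i = pathfinders_start_idx[i];
  if (!path_i.first) {
    // Find the last good pathfinder located after slot i
    auto search_end = pathfinders_start_idx.rend() - (i + 1);
    auto last_good_path = std::find_if(pathfinders_start_idx.rbegin(),
      search_end, [](const auto& path_j) { return path_j.first; });
    if (last_good_path == search_end) {
      // No good pathfinder left to move; drop slot i and everything after it
      pathfinders_start_idx.erase(pathfinders_start_idx.begin() + i,
        pathfinders_start_idx.end());
      break;
    }
    // Copy the last good path's samples into the failed slot
    move_good_to_bad(samples, path_i, *last_good_path);
    // Slot i now holds good samples at its own start index
    path_i.first = true;
    // Remove the moved entry and any failed entries after it
    pathfinders_start_idx.erase(std::next(last_good_path).base(),
      pathfinders_start_idx.end());
  }
}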
// Make a map of only the successful samples
Eigen::Map<Eigen::Array<double, -1, -1>> sample_map(...);
// Do rest of PSIS etc.
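
For comparison, the first option (keep the RNG state, regenerate draws on demand) can be sketched roughly as follows. Everything here is an assumption for illustration: `std::mt19937_64` stands in for whatever engine Stan uses, a diagonal Gaussian stands in for Pathfinder's actual approximation, and `path_summary` and `regenerate_draw` are hypothetical names.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// What a single path would hand back instead of a sample matrix.
// Hypothetical struct; the diagonal Gaussian is only a stand-in.
struct path_summary {
  std::mt19937_64 rng_at_start;  // engine state before any draw was made
  std::vector<double> mean;
  std::vector<double> sd;
};

// Replay the path's random stream and return only draw number `draw_idx`.
inline std::vector<double> regenerate_draw(const path_summary& path,
                                           std::size_t draw_idx) {
  // Copy the engine so the stored state stays untouched.
  std::mt19937_64 rng = path.rng_at_start;
  // A fresh distribution object carries no cached value from earlier calls.
  std::normal_distribution<double> std_normal(0.0, 1.0);
  std::vector<double> draw(path.mean.size());
  // Earlier draws are generated and overwritten: extra compute, no storage.
  for (std::size_t s = 0; s <= draw_idx; ++s) {
    for (std::size_t k = 0; k < draw.size(); ++k) {
      draw[k] = path.mean[k] + path.sd[k] * std_normal(rng);
    }
  }
  return draw;
}
```

Replaying from the start costs more the later the requested draw is, which is the compute-for-memory trade described in the first option. Storing an engine snapshot every so many draws would bound that cost, at the price of a little extra state per path.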

SteveBronder changed the title ~~Pathfinder should only make a full sample matrix if psis_resample is true~~ Ideas for saving memory in Pathfinder Dec 11, 2024

SteveBronder added the good first issue label Dec 11, 2024

SteveBronder linked a pull request Dec 13, 2024 that will close this issue

Feature/reduce mem pathfinder #3325

Open

3 tasks
