Ideas for saving memory in Pathfinder #3323

Open
SteveBronder opened this issue Dec 11, 2024 · 0 comments · May be fixed by #3325

Comments

@SteveBronder
Collaborator

SteveBronder commented Dec 11, 2024

Summary:

In multi pathfinder with a large number of samples, we can end up making two large memory allocations when we only need one, or perhaps none at all.

We make a large matrix containing all of the samples from the individual pathfinders, and also a std::vector of matrices that stores each individual pathfinder's samples. Each of these allocations holds single_pathfinder_samples * num_pathfinders * num_params doubles. This can be huge if you have 100K+ parameters.
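
To put a rough number on that, here is a small sketch of the arithmetic. The helper name `draws_bytes` and the example sizes (100K parameters, 1000 draws per path, 4 paths) are made up for illustration and are not taken from Stan.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold one dense copy of all draws as doubles.
// Hypothetical helper; the arguments mirror the names used in this issue.
inline std::uint64_t draws_bytes(std::uint64_t num_params,
                                 std::uint64_t single_pathfinder_samples,
                                 std::uint64_t num_pathfinders) {
  assert(num_params > 0 && single_pathfinder_samples > 0
         && num_pathfinders > 0);
  return num_params * single_pathfinder_samples * num_pathfinders
         * sizeof(double);
}
```

With 8-byte doubles, `draws_bytes(100000, 1000, 4)` comes to 3.2 GB for a single copy, so holding both the combined matrix and the per-path vector would need about 6.4 GB.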

I think there are two ways to reduce the memory usage. In short, one is to return only what we need to generate samples from the single pathfinders, and the other is to be clever and allocate only one large matrix.

  1. When running single pathfinder from multi pathfinder, return everything we need to generate samples (including the RNG state) along with the likelihoods, but not the actual samples. When we need the draws after doing PSIS etc., we can set the RNG to the correct state and pass in the appropriate pathfinder parameters. This requires more computation, but should save a ton of memory.

  2. Make one large matrix in multi pathfinder, and make a parameter writer for each pathfinder that writes to its own chunk of the large matrix. The steps would be:

     a. Make one large matrix holding num_pathfinders * single_pathfinder_samples * num_params values at the beginning of pathfinder.

     b. Make a new writer that takes in a map to an Eigen matrix.

     c. Instead of having single pathfinder return the samples, have it write the samples to the new writer.
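
As a rough illustration of the first option (store the RNG state, regenerate draws later), here is a minimal sketch. Everything in it is a stand-in: `path_summary` and `regenerate_draw` are hypothetical names, `std::mt19937_64` replaces whatever RNG type Stan actually uses, and independent normals replace the real pathfinder approximation.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-path summary: enough to regenerate draws, but no draws.
struct path_summary {
  std::mt19937_64 rng;       // engine state captured just before sampling
  std::vector<double> mean;  // stand-in for the fitted approximation
  std::vector<double> sd;
};

// Regenerate draw `draw_idx` (0-based) from the stored state.
inline std::vector<double> regenerate_draw(const path_summary& path,
                                           std::size_t draw_idx) {
  assert(path.mean.size() == path.sd.size());
  std::mt19937_64 rng = path.rng;  // copy so the stored state is untouched
  std::normal_distribution<double> std_normal(0.0, 1.0);
  std::vector<double> draw(path.mean.size());
  // Replay the earlier draws to advance the engine to the right position.
  for (std::size_t d = 0; d <= draw_idx; ++d) {
    for (std::size_t k = 0; k < draw.size(); ++k) {
      draw[k] = path.mean[k] + path.sd[k] * std_normal(rng);
    }
  }
  return draw;
}
```

Calling `regenerate_draw` twice with the same index returns the same values, which is the property the first option depends on. Replaying from the start costs time proportional to the draw index, so in practice the selected draws would probably be regenerated in one sequential pass per path.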

The main issue is what to do when a pathfinder fails. My current thought is to loop over the blocks of samples and, if a pathfinder failed, move the samples of the last pathfinder that succeeded into that spot. So it would look something like the following

Eigen::Index total_samples = single_pathfinder_samples * num_pathfinders;
// One column per draw; each pathfinder owns a contiguous block of columns
Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic> samples(num_params, total_samples);
std::vector<map_writer> map_writers;
map_writers.reserve(num_pathfinders);
for (int i = 0; i < num_pathfinders; i++) {
  map_writers.emplace_back(...); // setup pointers and map for each path here
}
// Call pathfinder
// ...
// Find bad pathfinders
std::vector<std::pair<bool, Eigen::Index>> pathfinders_start_idx;
pathfinders_start_idx.reserve(num_pathfinders);
for (int i = 0; i < num_pathfinders; i++) {
  // Could emplace the returned pair directly; split here for clarity
  auto [pathfinder_success, individ_sample_start_idx] = check_samples_for_nan(samples,
    i, single_pathfinder_samples, num_pathfinders);
  pathfinders_start_idx.emplace_back(pathfinder_success, individ_sample_start_idx);
}
// Move good pathfinders to earlier spots.
// Index-based loop: the vector shrinks, which would invalidate range-for iterators
for (std::size_t i = 0; i < pathfinders_start_idx.size(); ++i) {
  if (!pathfinders_start_idx[i].first) {
    // Find last good pathfinder, dropping failed ones from the back
    while (pathfinders_start_idx.size() > i && !pathfinders_start_idx.back().first) {
      pathfinders_start_idx.pop_back();
    }
    if (pathfinders_start_idx.size() <= i) {
      break;  // everything from i onward failed
    }
    // Copy the last good path's samples over the failed path's block
    move_good_to_bad(samples, pathfinders_start_idx[i], pathfinders_start_idx.back());
    // The block at i now holds good samples; its start index is unchanged
    pathfinders_start_idx[i].first = true;
    // Remove the moved value
    pathfinders_start_idx.pop_back();
  }
}
// pathfinders_start_idx.size() is now the number of successful pathfinders
// Make a map of only the successful samples
Eigen::Map<Eigen::Array<double, -1, -1>> sample_map(...);
// Do rest of PSIS etc.
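
For anyone who wants to try the compaction logic in isolation, below is a self-contained sketch of the same idea that uses a flat column-major std::vector instead of Eigen. The names `block_writer`, `make_writers`, `block_ok` and `compact_good_paths` are hypothetical, and treating "any NaN in the block" as failure is an assumption; the real failure signal may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// View onto one pathfinder's contiguous block of a shared buffer
// (column-major, one column per draw).
struct block_writer {
  double* begin;           // first element of this pathfinder's block
  std::size_t num_params;  // rows
  std::size_t num_draws;   // columns in this block
  std::size_t written;     // draws written so far

  void operator()(const std::vector<double>& draw) {
    assert(draw.size() == num_params && written < num_draws);
    std::copy(draw.begin(), draw.end(), begin + written * num_params);
    ++written;
  }
};

// One writer per pathfinder. `samples` must not be resized afterwards,
// since the writers hold raw pointers into it.
inline std::vector<block_writer> make_writers(std::vector<double>& samples,
                                              std::size_t num_params,
                                              std::size_t draws_per_path,
                                              std::size_t num_paths) {
  assert(samples.size() == num_params * draws_per_path * num_paths);
  std::vector<block_writer> writers;
  writers.reserve(num_paths);
  const std::size_t block = num_params * draws_per_path;
  for (std::size_t i = 0; i < num_paths; ++i) {
    writers.push_back(
        block_writer{samples.data() + i * block, num_params, draws_per_path, 0});
  }
  return writers;
}

// A path counts as good here if its block contains no NaN.
inline bool block_ok(const std::vector<double>& samples, std::size_t path,
                     std::size_t block) {
  const double* first = samples.data() + path * block;
  return std::none_of(first, first + block,
                      [](double x) { return std::isnan(x); });
}

// Overwrite failed blocks with the last good blocks. Returns the number of
// good paths; their draws then fill the first `result * block` elements.
inline std::size_t compact_good_paths(std::vector<double>& samples,
                                      std::size_t block,
                                      std::size_t num_paths) {
  std::size_t lo = 0;
  std::size_t hi = num_paths;  // one past the last unexamined block
  while (lo < hi) {
    if (block_ok(samples, lo, block)) {
      ++lo;
    } else if (!block_ok(samples, hi - 1, block)) {
      --hi;  // drop failed blocks from the back
    } else {
      // lo is bad and hi - 1 is good, so hi - 1 > lo and the blocks differ
      std::copy_n(samples.data() + (hi - 1) * block, block,
                  samples.data() + lo * block);
      ++lo;
      --hi;
    }
  }
  return lo;
}
```

Pre-filling the buffer with NaN means a pathfinder that never writes is detected the same way as one that writes bad values. Note that compaction does not preserve the original path order, which should not matter for PSIS but would matter if anything downstream relies on path indices.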
@SteveBronder SteveBronder changed the title Pathfinder should only make a full sample matrix if psis_resample is true Ideas for saving memory in Pathfinder Dec 11, 2024
@SteveBronder SteveBronder linked a pull request Dec 13, 2024 that will close this issue