DistributedHaloCatalog #825

Open
aphearin opened this issue Oct 28, 2017 · 4 comments
@aphearin
Contributor

Implement a halo catalog that can be distributed across nodes of a large cluster using MPI

@aphearin aphearin added this to the v1.0 milestone Oct 28, 2017
@aphearin aphearin self-assigned this Oct 28, 2017
@rainwoodman
Contributor

This issue will solve bccp/nbodykit#502

I think a plausible, easy way of doing this is to ensure each MPI rank holds a spatially localized domain, and then reuse the single-node code on each rank.

Here is an object that helps you distribute objects to domains:

https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L274

And we were using it here:

https://github.com/bccp/nbodykit/blob/master/nbodykit/base/decomposed.py#L3

and here

https://github.com/bccp/nbodykit/blob/master/nbodykit/algorithms/pair_counters/domain.py#L113

You can probably write a better version of this on your own, or jump-start your development with domain.py and _domain.pyx.

Models that need particles may need to use the `smoothing` argument of https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L515
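
For concreteness, here is a rough sketch of the workflow being suggested, putting the linked pieces together. The `GridND`/`decompose`/`exchange` names and the `smoothing` keyword are read off the linked pmesh and nbodykit sources rather than verified against an installed version, so treat the exact signatures as assumptions; the box size, `rmax`, and the random positions are placeholders.

```python
# Rough sketch only: one slab of the periodic box per MPI rank, halos routed
# to the rank that owns their subvolume, with copies of halos within rmax of a
# domain edge shipped along as a buffer layer via the `smoothing` argument.
import numpy as np
from mpi4py import MPI
from pmesh.domain import GridND   # interface assumed from the linked domain.py

comm = MPI.COMM_WORLD
boxsize, rmax = 250.0, 20.0       # placeholder box size and largest pair separation

# Decompose the box into comm.size slabs along x; y and z stay undivided.
edges = [np.linspace(0, boxsize, comm.size + 1), [0, boxsize], [0, boxsize]]
domain = GridND(edges, comm=comm, periodic=True)

# This rank's (arbitrary) share of the halo catalog, shape (nhalo, 3).
halo_pos = np.random.uniform(0, boxsize, size=(1000, 3))

# Route every halo to the rank owning its subvolume, padded by rmax.
layout = domain.decompose(halo_pos, smoothing=rmax)
local_pos = layout.exchange(halo_pos)

# From here the existing single-node machinery can operate on local_pos.
```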

@aphearin
Contributor Author

aphearin commented Jul 3, 2018

@rainwoodman - thanks a lot for the pointers. A spatial domain decomposition is indeed what I thought best for this problem. The only difference is that I have been using a buffer region around each domain of size rmax, the largest pair-counting distance. It looks like you handled this without that feature, but perhaps I read it too quickly?
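
For reference, the buffer-region bookkeeping described above amounts to something like the sketch below; the slab bounds, `rmax`, and box size are made-up illustrative values, and only the x-direction is shown.

```python
import numpy as np

def padded_slab_mask(x, xmin, xmax, rmax, Lbox):
    """Select halos with x in [xmin - rmax, xmax + rmax), wrapping periodically,
    so that pairs straddling the slab edge are not lost."""
    dx = (x - (xmin - rmax)) % Lbox          # offset above the padded lower edge
    return dx < (xmax - xmin) + 2.0 * rmax

# Example: the rank owning the slab 0 <= x < 25 in a 250 Mpc/h box, with rmax = 20
x = np.random.uniform(0, 250.0, size=10**5)
in_buffered_slab = padded_slab_mask(x, 0.0, 25.0, rmax=20.0, Lbox=250.0)
buffered_x = x[in_buffered_slab]
```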

@rainwoodman
Contributor

rainwoodman commented Jul 3, 2018 via email

@aphearin
Contributor Author

aphearin commented Jul 3, 2018

The mock-population part is trivially parallelizable - no model in the entire library would be impacted by the decomposition. However, the reason this feature actually requires a very significant rewrite is that, to fully take advantage of the parallelism, the summary-statistic kernels need to be computed on the subvolumes, and only the results reported to rank 0, which collects things like subvolume pair counts, sums them, and converts them into the tpcf.
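
As an illustration of that communication pattern, the toy sketch below has each rank count pairs among the halos it holds, reduce only the per-bin counts to rank 0, and convert the summed counts into the tpcf with the natural estimator DD/RR - 1 (analytic RR for a periodic box). The pair counter is a stand-in for the existing single-node kernels, the data are synthetic, and the cross-subvolume buffer pairs are omitted for brevity.

```python
import numpy as np
from mpi4py import MPI
from scipy.spatial import cKDTree

comm = MPI.COMM_WORLD
boxsize = 250.0                                  # placeholder box size
rbins = np.logspace(-1.0, 1.3, 15)               # pair-counting bin edges

def pair_counts(pos, rbins):
    """Stand-in pair-counting kernel: unordered pair counts per radial bin."""
    tree = cKDTree(pos)
    cumulative = tree.count_neighbors(tree, rbins) - len(pos)   # drop self-pairs
    return np.diff(cumulative) / 2.0

# Synthetic stand-in for this rank's subvolume of the halo catalog.  In the
# real decomposition, each rank would count pairs whose first member lies in
# its owned subvolume and whose second member lies anywhere in the buffered set.
local_pos = np.random.uniform(0, boxsize, size=(10**4, 3))

# Each rank computes its own counts; only small per-bin arrays travel over MPI.
dd_local = pair_counts(local_pos, rbins)
dd_total = comm.reduce(dd_local, op=MPI.SUM, root=0)
ntot = comm.allreduce(len(local_pos), op=MPI.SUM)

if comm.rank == 0:
    # Natural estimator with analytic randoms for a periodic box: xi = DD/RR - 1
    shell_vol = 4.0 / 3.0 * np.pi * np.diff(rbins**3)
    rr_total = 0.5 * ntot * (ntot - 1) * shell_vol / boxsize**3
    xi = dd_total / rr_total - 1.0
    print(xi)
```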
