Bin-free conditional abundance matching #888

Merged
merged 5 commits into from
Mar 12, 2018

aphearin
Contributor

This PR introduces a new bin-free algorithm for conditional abundance matching (CAM), as well as tutorials on how to use it. The algorithm works as follows.

  • For every model galaxy, we find the observed galaxy with the closest value of the primary property, x.
  • We set up a window of nwin ~ 200 observed galaxies bracketing this matching galaxy. This window defines Prob(< y_obs | x), which allows us to calculate the rank-order y_obs-percentile of each galaxy in the window.
  • Similarly, we set up a window of nwin model galaxies. This window defines Prob(< y_halo | x), which allows us to calculate the rank-order y_halo-percentile of our model galaxy, r_1.
  • We then search the observed window for the observed galaxy whose rank-order y_obs-percentile equals r_1, and map its y_obs value onto our model galaxy.
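
The four steps above can be sketched in plain Python. This is a deliberately naive illustration, not the Cython kernel from this PR: it re-sorts both windows for every model galaxy, and the names `window_bounds` and `naive_bin_free_cam` are hypothetical. It also assumes windows are clipped at the array edges, which the description above does not specify.

```python
import bisect


def window_bounds(i_center, n_total, nwin):
    """Return [lo, hi) for a window of nwin elements bracketing i_center.

    Assumption: near the array edges the window is shifted so that it
    stays inside the array instead of shrinking.
    """
    nwin = min(nwin, n_total)
    lo = max(0, min(i_center - nwin // 2, n_total - nwin))
    return lo, lo + nwin


def naive_bin_free_cam(x_model, y_model, x_obs, y_obs, nwin=200):
    """Map y_obs onto model galaxies by matching rank-order percentiles.

    Assumes both samples are non-empty. Returns a list aligned with the
    input order of the model galaxies.
    """
    # Sort both samples by the primary property x.
    model_order = sorted(range(len(x_model)), key=lambda i: x_model[i])
    obs_order = sorted(range(len(x_obs)), key=lambda i: x_obs[i])
    xm = [x_model[i] for i in model_order]
    ym = [y_model[i] for i in model_order]
    xo = [x_obs[i] for i in obs_order]
    yo = [y_obs[i] for i in obs_order]

    result = [None] * len(x_model)
    for k, orig_idx in enumerate(model_order):
        # Step 1: observed galaxy with the closest x.
        j = bisect.bisect_left(xo, xm[k])
        if j == len(xo) or (j > 0 and xm[k] - xo[j - 1] <= xo[j] - xm[k]):
            j -= 1

        # Step 2: observed window around the match, sorted by y_obs.
        lo, hi = window_bounds(j, len(xo), nwin)
        obs_window = sorted(yo[lo:hi])

        # Step 3: rank of the model galaxy within its own window.
        # The percentile r_1 is rank / len(model_window).
        lo, hi = window_bounds(k, len(xm), nwin)
        model_window = sorted(ym[lo:hi])
        rank = bisect.bisect_left(model_window, ym[k])

        # Step 4: pick the observed value at the same percentile.
        # Integer arithmetic avoids floating-point rounding of r_1.
        idx = rank * len(obs_window) // len(model_window)
        result[orig_idx] = obs_window[idx]
    return result
```

When a single window spans both samples, the result is a plain rank-order match between y_halo and y_obs; smaller windows make the match conditional on x.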

The implementation is based on a Cython kernel, bin_free_cam_kernel.pyx. The simplest way to compute rank-order percentiles is to sort the window, but this is prohibitively expensive when done for every window around every galaxy. The kernel therefore sorts the windows only once, at the beginning; as the windows slide along the arrays with increasing galaxy index i, elements are popped in and out so as to preserve the sorted order. The rank-order percentile can then be calculated via a binary search of the sorted window, which is also part of the Cython kernel.

Finally, to reduce discreteness effects, sub-grid noise can optionally be added: rather than painting y_obs onto the model galaxy, we can instead paint a random number drawn from the interval (y_obs[r_1-1], y_obs[r_1+1]). This option is recommended and comes at no loss of fidelity, because the PDF is not resolved on scales of 1/nwin anyway.
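
To illustrate the sliding-window idea, here is a minimal pure-Python sketch, assuming a window that is clipped at the array edges. It is not the code in bin_free_cam_kernel.pyx; the function names `sliding_window_ranks` and `add_subgrid_noise` are hypothetical, and the standard-library `bisect` module stands in for the kernel's own binary search.

```python
import bisect
import random


def sliding_window_ranks(y, nwin):
    """Rank of each y[i] within a window of nwin elements around index i.

    The window is sorted once; afterwards one element is removed and one
    inserted per step, so the sorted order is preserved without re-sorting.
    """
    n = len(y)
    nwin = min(nwin, n)
    half = nwin // 2
    lo = 0
    window = sorted(y[:nwin])  # the only full sort
    ranks = []
    for i in range(n):
        # Desired left edge of the window, clipped to the array.
        target_lo = max(0, min(i - half, n - nwin))
        while lo < target_lo:
            # Pop out the element leaving on the left ...
            del window[bisect.bisect_left(window, y[lo])]
            # ... and pop in the element entering on the right.
            bisect.insort(window, y[lo + nwin])
            lo += 1
        # Binary search gives the rank; the percentile is rank / nwin.
        ranks.append(bisect.bisect_left(window, y[i]))
    return ranks


def add_subgrid_noise(sorted_window, idx, rng=random):
    """Draw a value between the neighbours of sorted_window[idx].

    Assumption: at the ends of the window the interval is truncated to
    the available neighbour.
    """
    low = sorted_window[max(idx - 1, 0)]
    high = sorted_window[min(idx + 1, len(sorted_window) - 1)]
    return rng.uniform(low, high)
```

In this sketch each step costs one binary search plus one list insertion and one deletion, rather than a full sort of the window. Python list insertion and deletion still shift elements in memory, so the sketch shows the bookkeeping rather than the performance of the compiled kernel.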

For model galaxy samples with ~1e6 elements, the CAM calculation takes roughly 500 ms to 1 s, depending on the size of the window.

CC @manodeep @duncandc @h-aung @yymao
