Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forward-merge branch-24.12 into branch-25.02 #1488

Merged
merged 1 commit into from
Nov 21, 2024
Merged

Conversation

rapids-bot[bot]
Copy link

@rapids-bot rapids-bot bot commented Nov 21, 2024

Forward-merge triggered by push to branch-24.12 that creates a PR to keep branch-25.02 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

Improves on a performance regression identified by @trxcllnt when using `quadtree_point_in_polygon`

Seems like the core slowdown is that the passed `points` is a `GeoSeries`, and the x/y points are generated by interleaving them in `GeoColumnAccessor.xy` and then slicing x/y back out.

Since these points appear to be held in a `cudf.ListDtype` (i.e. `[x, y]`), I tried avoiding interleaving and slicing the the x and y points individually in the lists using the `cudf.Series.list.get` API but this was failing for x/y points for `.polygons.x`/`.polygons.y` since some cuspatial APIs expect these points to be `float` (not sure if it's a cudf bug in `list.get` or how points as structured for polygons)

Instead, I was able to get a 5x speedup by

* Caching the interleaving step to avoid doing this multiple times
* Slicing a cupy array of points instead of a `cudf.Series` (which uselessly slices an `.index` which is not used)


<details>
  <summary>Benchmarking code</summary>

  ```python

# 1.3G    points.arrow
# 722K    polys.arrow

import cudf
import cupy
import cuspatial
import pyarrow as pa
from time import perf_counter_ns

polys = pa.ipc.open_stream("polys.arrow").read_all()
polys = cudf.DataFrame.from_arrow(polys).rename(columns={"tract": "poly"})
point = polys["poly"].list.leaves.struct.explode()
polys = cuspatial.GeoSeries.from_polygons_xy(
    point.interleave_columns(),
    polys["poly"]._column.elements.offsets,
    polys["poly"]._column.offsets,
    cupy.arange(0, len(polys) + 1)
)

point = pa.ipc.open_stream("points.arrow").read_all()
point = cudf.DataFrame.from_arrow(point)
min_x = point["x"].min()
max_x = point["x"].max()
min_y = point["y"].min()
max_y = point["y"].max()

max_size = 0
max_depth = 16
threshold = 10
while max_size <= threshold:
    max_depth -= 1
    max_size = len(point) / pow(4, max_depth) / 4

scale = max(max_x - min_x, max_y - min_y) / (1 << max_depth)

point_xy = cuspatial.GeoSeries.from_points_xy(point.interleave_columns())
point_map, quadtree = (
    cuspatial.quadtree_on_points(point_xy,
                                 min_x,
                                 max_x,
                                 min_y,
                                 max_y,
                                 scale,
                                 max_depth,
                                 max_size))

t0 = perf_counter_ns()

cuspatial.quadtree_point_in_polygon(
    cuspatial.join_quadtree_and_bounding_boxes(
        quadtree,
        cuspatial.polygon_bounding_boxes(polys[0:1]),
        min_x,
        max_x,
        min_y,
        max_y,
        scale,
        max_depth
    ),
    quadtree,
    point_map,
    point_xy,
    polys[0:1]
)

t1 = perf_counter_ns()

print(f"{(t1 - t0) / (10 ** 6)}ms")

t0 = perf_counter_ns()
poly_offsets = cudf.Series(polys[0:1].polygons.part_offset)._column
ring_offsets = cudf.Series(polys[0:1].polygons.ring_offset)._column
poly_points_x = cudf.Series(polys[0:1].polygons.x)._column
poly_points_y = cudf.Series(polys[0:1].polygons.y)._column

from cuspatial._lib import spatial_join

cudf.DataFrame._from_data(
    *spatial_join.quadtree_point_in_polygon(
        cuspatial.join_quadtree_and_bounding_boxes(
            quadtree,
            cuspatial.polygon_bounding_boxes(polys[0:1]),
            min_x,
            max_x,
            min_y,
            max_y,
            scale,
            max_depth
        ),
        quadtree,
        point_map._column,
        point["x"]._column,
        point["y"]._column,
        poly_offsets,
        ring_offsets,
        poly_points_x,
        poly_points_y,
    )
)

t1 = perf_counter_ns()

print(f"{(t1 - t0) / (10 ** 6)}ms")

# 127.406344ms  # this PR
# 644.963021ms  # branch 24.10
```

</details>

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Paul Taylor (https://github.com/trxcllnt)
  - Mark Harris (https://github.com/harrism)

URL: #1446
@rapids-bot rapids-bot bot requested a review from a team as a code owner November 21, 2024 01:05
@rapids-bot rapids-bot bot requested review from harrism and bdice November 21, 2024 01:05
@GPUtester GPUtester merged commit 5704129 into branch-25.02 Nov 21, 2024
Copy link
Author

rapids-bot bot commented Nov 21, 2024

SUCCESS - forward-merge complete.

@github-actions github-actions bot added the Python Related to Python code label Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Python Related to Python code
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

2 participants