Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort the mergeable segments before computing merged segments #1207

Merged
merged 22 commits into from
Jul 25, 2023
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
55 changes: 53 additions & 2 deletions cpp/include/cuspatial/detail/find/find_and_combine_segment.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -19,18 +19,20 @@
#include <cuspatial/cuda_utils.hpp>
#include <cuspatial/detail/utility/linestring.cuh>
#include <cuspatial/error.hpp>
#include <cuspatial/iterator_factory.cuh>

#include <rmm/cuda_stream_view.hpp>
#include <rmm/exec_policy.hpp>

#include <thrust/uninitialized_fill.h>
#include <thrust/sort.h>

namespace cuspatial {
namespace detail {

/**
* @internal
* @brief Kernel to merge segments, naive n^2 algorithm.
* @brief Kernel to merge segments. Each thread works on one segment space,
* with a naive n^2 algorithm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps naive is the wrong word. "brute force"? Is lower complexity a possibility?

Copy link
Contributor Author

@isVoid isVoid Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question. What's the lower bound of the algorithm complexity here? Given a set of segments, to merge all mergeable segments, one at least needs to look at all segments once, so the lower bound is O(N) (N being the number of segments in the set). To achieve that, I assume you need to design some sort of hashing function. When you look at one segment, the hashing function will compute the key and look into the map to find if there is existing segment that can be merged with it. This key would be similar to what we designed in the sorting comparator here - a combination of group id, slope and lower left of the segment. A map on CPU code is simple, but may be quite difficult to implement on device and could invoke unwanted slow down.

So that said, the current algorithm isn't naive (or brute force) at all - while there is a nested loop here, the sorting actually already did part of the hashing work like above. The sorting makes sure that the outer loop only run against all other segments once in every mergeable group, all subsequent segments are marked merged_flag == 1 after the initial run. So the nested loop here is actually on the order of O(N) here. The aggregated complexity is dominated by the sorting, which is O(NlogN).

*/
template <typename OffsetRange, typename SegmentRange, typename OutputIt>
void __global__ simple_find_and_combine_segments_kernel(OffsetRange offsets,
Expand All @@ -44,6 +46,9 @@ void __global__ simple_find_and_combine_segments_kernel(OffsetRange offsets,
merged_flag[i] = 0;
}

// For each of the segment, loop over the rest of the segment in the space and see
// if it is mergeable with the current segment.
// Note that if the current segment is already merged. Skip checking.
isVoid marked this conversation as resolved.
Show resolved Hide resolved
for (auto i = offsets[pair_idx]; i < offsets[pair_idx + 1] && merged_flag[i] != 1; i++) {
for (auto j = i + 1; j < offsets[pair_idx + 1]; j++) {
auto res = maybe_merge_segments(segments[i], segments[j]);
Expand All @@ -57,20 +62,66 @@ void __global__ simple_find_and_combine_segments_kernel(OffsetRange offsets,
}
}

/**
* @brief Comparator for sorting the segment range.
*
* This comparator makes sure that the segment range are sorted by the following keys:
isVoid marked this conversation as resolved.
Show resolved Hide resolved
* 1. Segments with the same space id are grouped together.
* 2. Segments within the same space are grouped by their slope.
* 3. Within each slope group, segments are sorted by their lower left point.
*/
template <typename index_t, typename T>
struct segment_comparator {
bool __device__ operator()(thrust::tuple<index_t, segment<T>> const& lhs,
thrust::tuple<index_t, segment<T>> const& rhs) const
{
auto lhs_index = thrust::get<0>(lhs);
auto rhs_index = thrust::get<0>(rhs);
auto lhs_segment = thrust::get<1>(lhs);
auto rhs_segment = thrust::get<1>(rhs);

// Compare space id
if (lhs_index == rhs_index) {
// Compare slope
if (lhs_segment.collinear(rhs_segment)) {
// Sort by the lower left point of the segment.
return lhs_segment.lower_left() < rhs_segment.lower_left();
}
return lhs_segment.slope() < rhs_segment.slope();
}
return lhs_index < rhs_index;
}
};

/**
* @internal
* @brief For each pair of mergeable segment, overwrites the first segment with merged result,
* sets the flag for the second segment as 1.
*
* @note This function will alter the input segment range by rearranging the order of the segments
* within each space so that merging kernel can take place.
*/
template <typename OffsetRange, typename SegmentRange, typename OutputIt>
void find_and_combine_segment(OffsetRange offsets,
SegmentRange segments,
OutputIt merged_flag,
rmm::cuda_stream_view stream)
{
using index_t = typename OffsetRange::value_type;
using T = typename SegmentRange::value_type::value_type;
auto num_spaces = offsets.size() - 1;
if (num_spaces == 0) return;

// Construct a key iterator using the offsets of the segment and the segment itself.
auto space_id_iter = make_geometry_id_iterator<index_t>(offsets.begin(), offsets.end());
auto space_id_segment_iter = thrust::make_zip_iterator(space_id_iter, segments.begin());

thrust::sort_by_key(rmm::exec_policy(stream),
harrism marked this conversation as resolved.
Show resolved Hide resolved
space_id_segment_iter,
space_id_segment_iter + segments.size(),
segments.begin(),
segment_comparator<index_t, T>{});

auto [threads_per_block, num_blocks] = grid_1d(num_spaces);
simple_find_and_combine_segments_kernel<<<num_blocks, threads_per_block, 0, stream.value()>>>(
offsets, segments, merged_flag);
Expand Down
11 changes: 11 additions & 0 deletions cpp/include/cuspatial/geometry/segment.cuh
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,17 @@ class alignas(sizeof(Vertex)) segment {
/// Return the length squared of segment.
T CUSPATIAL_HOST_DEVICE length2() const { return dot(v2 - v1, v2 - v1); }

/// Return slope of segment.
T CUSPATIAL_HOST_DEVICE slope() { return (v2.y - v1.y) / (v2.x - v1.x); }

/// Return the lower left vertex of segment.
Vertex CUSPATIAL_HOST_DEVICE lower_left() { return v1 < v2 ? v1 : v2; }

bool CUSPATIAL_HOST_DEVICE collinear(segment<T> const& other)
isVoid marked this conversation as resolved.
Show resolved Hide resolved
{
return (v1.x - v1.y) * (other.v2.x - other.v2.y) == (v2.x - v2.y) * (other.v1.x - other.v1.y);
}

private:
friend std::ostream& operator<<(std::ostream& os, segment<T> const& seg)
{
Expand Down
62 changes: 62 additions & 0 deletions cpp/tests/find/find_and_combine_segments_test.cu
Original file line number Diff line number Diff line change
Expand Up @@ -281,3 +281,65 @@ TYPED_TEST(FindAndCombineSegmentsTest, nooverlap3)
{0, 0},
{S{P{0.0, 0.0}, P{1.0, 1.0}}, S{P{0.0, 1.0}, P{1.0, 0.0}}});
}

TYPED_TEST(FindAndCombineSegmentsTest, twospaces)
{
using T = TypeParam;
using index_t = std::size_t;
using P = vec_2d<T>;
using S = segment<T>;

auto segments = make_segment_array<index_t, T>({0, 2, 4},
{S{P{0.0, 0.0}, P{1.0, 1.0}},
S{P{1.0, 1.0}, P{2.0, 2.0}},
S{P{1.0, 1.0}, P{0.0, 0.0}},
S{P{2.0, 2.0}, P{1.0, 1.0}}});

CUSPATIAL_RUN_TEST(this->run_single_test,
segments,
{0, 1, 0, 1},
{S{P{0.0, 0.0}, P{2.0, 2.0}},
S{P{1.0, 1.0}, P{2.0, 2.0}},
S{P{0.0, 0.0}, P{2.0, 2.0}},
S{P{2.0, 2.0}, P{1.0, 1.0}}});
}

TYPED_TEST(FindAndCombineSegmentsTest, twospaces_non_contiguous_segments_with_empty)
{
using T = TypeParam;
using index_t = std::size_t;
using P = vec_2d<T>;
using S = segment<T>;

auto segments = make_segment_array<index_t, T>({0, 4, 4},
{S{P{1.0, 1.0}, P{2.0, 2.0}},
S{P{3.0, 3.0}, P{4.0, 4.0}},
S{P{0.0, 0.0}, P{1.0, 1.0}},
S{P{2.0, 2.0}, P{3.0, 3.0}}});

CUSPATIAL_RUN_TEST(this->run_single_test,
segments,
{0, 1, 1, 1},
{S{P{0.0, 0.0}, P{4.0, 4.0}},
S{P{1.0, 1.0}, P{2.0, 2.0}},
S{P{2.0, 2.0}, P{3.0, 3.0}},
S{P{3.0, 3.0}, P{4.0, 4.0}}});
}

TYPED_TEST(FindAndCombineSegmentsTest, onespace_non_contiguous_segments_overlaps)
{
using T = TypeParam;
using index_t = std::size_t;
using P = vec_2d<T>;
using S = segment<T>;

auto segments = make_segment_array<index_t, T>(
{0, 3},
{S{P{1.0, 1.0}, P{2.0, 2.0}}, S{P{4.0, 4.0}, P{5.0, 5.0}}, S{P{-1.0, -1.0}, P{4.0, 4.0}}});

CUSPATIAL_RUN_TEST(
this->run_single_test,
segments,
{0, 1, 1},
{S{P{-1.0, -1.0}, P{5.0, 5.0}}, S{P{1.0, 1.0}, P{2.0, 2.0}}, S{P{4.0, 4.0}, P{5.0, 5.0}}});
}
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
#include <cuspatial/geometry/vec_2d.hpp>
#include <cuspatial/intersection.cuh>

#include <rmm/exec_policy.hpp>
#include <thrust/host_vector.h>
isVoid marked this conversation as resolved.
Show resolved Hide resolved

template <typename T>
Expand Down