Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support equalsTopo for GeometryCollections #539

Open
airbreather opened this issue May 13, 2020 · 13 comments
Open

Support equalsTopo for GeometryCollections #539

airbreather opened this issue May 13, 2020 · 13 comments

Comments

@airbreather
Copy link
Contributor

See also: NetTopologySuite/NetTopologySuite#213

We think that there's a reasonably straightforward way to do equalsTopo for GeometryCollections, by comparing the polygon, line, and point components on each side.

@FObermaier had this idea:

We could do this:

  • Extract the single instance geometries (Point, LineString, Polygon) from the GeometryCollection.
  • Sort them by type and build Multi-geometries.
  • Call EqualsTopologically on each of these.

On the NTS side of things, this is a major blocker for some folks, so I figured I'd raise this up here.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

The problem is that components of GeometryCollections can overlap, and covered components do not contribute to the equality check. For instance, a rectangle with a line inside it is topologically equal to the rectangle alone. An even trickier example is where a line is covered by two adjacent polygons.

I'm not sure how the proposal handles this situation?

[EDIT] Based on comment below I realized that my analysis above is based on my own conception of how GC topology could work. I'm not sure if this is supported by any wider opinion (such as the OGC/ISO standard). So this needs further discussion.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

Your example on the NTS issue is interesting. There does seem to be a conflict between the semantics of GeometryCollection boundary, and the semantic suggested above of the collection essentially being topologically equal to the union of the components.

I'm not sure which definition should take priority. The idea that a GeometryCollection functions just as an independent collection of Geometries has some appeal. The implementation for equalsTopo is then straightforward as you suggest.

Does this generalize to other predicates? The predicates are just names for DE-9IM patterns - how would that work for collections?

Is there any clue for GC semantics in the standards docs?

@airbreather
Copy link
Contributor Author

airbreather commented May 13, 2020

The problem is that components of GeometryCollections can overlap, and covered components do not contribute to the equality check. For instance, a rectangle with a line inside it is topologically equal to the rectangle alone. An even trickier example is where a line is covered by two adjacent polygons.

I wasn't going to comment on this when opening the issue, because I'm not 100% on what the standards have to say about GC semantics in such cases, either.

My thought was that, as long as we can assume that each individual component of a GC is valid by itself, then we can start by doing UnaryUnionOp on both sides before extracting the polygon, line, and point components. This would handle excluding redundant lower-dimension components on either side, and we would be able to properly handle situations like a line string that's only partially within a polygon in the same GC.

Does this generalize to other predicates? The predicates are just names for DE-9IM patterns - how would that work for collections?

Well, equalsTopo is at least somewhat special, because if the UnaryUnionOp step leaves a point in gc1 that's covered by a polygon in gc2, then equalsTopo would want to short-circuit and return false, but coveredBy would need to keep going.

A more general solution seems within the realm of possibility, but probably a lot harder than going case-by-case and considering how to apply it to the next specific predicate...

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

My thought was that, as long as we can assume that each individual component of a GC is valid by itself, then we can start by doing UnaryUnionOp on both sides before extracting the polygon, line, and point components. This would handle excluding redundant lower-dimension components on either side, and we would be able to properly handle situations like a line string that's only partially within a polygon in the same GC.

Well, equalsTopo is at least somewhat special, because if the UnaryUnionOp step leaves a point in gc1 that's covered by a polygon in gc2, then equalsTopo would want to short-circuit and return false, but coveredBy would need to keep going.

If you mean that UnaryUnion is applied to each entire GeometryCollection input, then the above doesn't happen. A covered point would not be present in the "topologically-reduced" output of the union.

This seems like the definition most in line with the general "point-set topology" basis for the OGC semantics. And in case it isn't obvious, this is NOT the definition that was proposed in the opening comment of this issue.

A more general solution seems within the realm of possibility, but probably a lot harder than going case-by-case and considering how to apply it to the next specific predicate...

The point-set topology definition makes a general solution simple, because it follows the exact same logic as currently in place.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

Note that if the point-set topology definition is accepted, then the current semantics of getBoundary are wrong. So this might have to be changed? Which is unpleasant (and also not very performant, since it requires a full union.)

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

i have to say that the original use case for the NTS issue seems questionable to me. I don't see why you would use topological equality as the criteria for deciding whether to update a database record.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

Another note: I'm working on an idea to provide a completely new implementation of the spatial predicates. It will operate in a way which is closer to the concept of point-set topology; i.e. it will evaluate the topology only at "significant points". The full topological relationship is determined by the sum of the topologies at each point.

This has the following advantages:

  • It will hopefully be more performant
  • It allows short-circuiting for simple predicates, in particular intersects
  • I think it will allow using a precision model or distance tolerance, which has been requested for a long time

I mention this because it works well with the point-set topology approach suggested above.

@roji
Copy link

roji commented May 13, 2020

To give some context, this is for an automatic change tracking mechanism, where an ORM detects whether a value changed or not. My mental mode here was to think of different representations which are equal topologically as different "non-canonicalized" representations, and for the reasonable user expectation to be to want updates only if the geometry changed topologically. I'd be happy to hear arguments to the contrary, and possibly to change the implementation to do exact instead of topological equality if that makes more sense.

But in any case, the question goes beyond database change tracking etc.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

A further remark on the original proposal: to build a MultiPolygon out of GeometryCollection components Unary Union has to be run anyway, since GC components can overlap.

@dr-jts
Copy link
Contributor

dr-jts commented May 13, 2020

A further comment on NTS 213:

Looking at this more closely, I see that the contradiction in MSSQL arises because apparently MS-SQL has already defined a semantic for STContains on GeometryCollections. So this skews the issue significantly in that context. I don't think this should be used as support for the design in the JTS context.

@airbreather
Copy link
Contributor Author

airbreather commented May 13, 2020

A further comment on NTS 213:

Looking at this more closely, I see that the contradiction in MSSQL arises because apparently MS-SQL has already defined a semantic for STContains on GeometryCollections. So this skews the issue significantly in that context. I don't think this should be used as support for the design in the JTS context.

Actually, I think any UnaryUnionOp business is exactly the kind of thing that I was trying to reject in NetTopologySuite/NetTopologySuite#213 (comment)... heh...

To resolve the particular example that I brought up, I guess we need a definitive answer to the question: "If two polygonal components of a GC share a line segment that is covered by no other polygonal components in that same GC, then is that line segment part of the GC's interior or its boundary?"

  • I don't know how (or if) OGC specifies a way to resolve that.
  • MSSQL's answer seems to be "part of the GC's boundary".
  • I'm perfectly happy with either answer, if OGC is silent on the matter.

Edit to add: notably, if the answer is "part of the GC's interior", then it seems that a UnaryUnionOp approach might not be so wrong after all.

@archanox
Copy link

Is there a rough ETA when this issue will be resolved?

@dr-jts
Copy link
Contributor

dr-jts commented Jan 7, 2025

@airbreather does RelateNG (#1052) fix this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants