Performance issue with GEOSPreparedContains
#1024
Comments
The problem is that the GEOS "prepared" implementation is never invoked because the argument to
then you should see quite different performance:
Maybe |
Note that due to the well-known quirk of the OGC
A test could be made for a collection containing a single Polygon or MultiPolygon. It seems to me too complex to test to see if a collection of Polygons is a valid MultiPolygon so that
I think the current "best-effort" semantics are simpler to use. Perhaps the documentation could provide more guidance on when specific inputs might not provide a performance boost (however, this is a bit of a moving target). I'm working on a new implementation of the topology predicate algorithm (called |
It's impressive that Rust |
That's a good point about "contains", though calling Better would be to have a |
Yes, that's the best extension of the current implementation.
How so? |
I had been thinking of just extracting the points, lines, and polygons from the |
Thanks @dbaston and @dr-jts for your replies and explanations!
Apart from code changes, I think a note in the
As a user, I agree with @dr-jts that the current semantics are easier to use!
Just an idea, but maybe a notice message could be issued in this case? I wouldn't have known that my call to
Sorry if this is naïve, but would it possibly make sense to make this test when the GeometryCollection is created? E.g. checking if a GeometryCollection is homogenous of type
@dbaston I tried your suggestion of first creating a
However, despite the explanation above I still haven't understood why exactly
const points = pointsFeatureCollection
.features
.map(f => geojsonToGeosGeom(f, geos));
const polygon = geojsonToGeosGeom(polygonFeatureCollection, geos);
const prepared_polygon = geos.GEOSPrepare(polygon);
let start = performance.now();
const result_fast = points.map(p =>
[...Array(geos.GEOSGetNumGeometries(polygon)).keys()]
.some(i => geos.GEOSContains(geos.GEOSGetGeometryN(polygon, i), p))
);
// takes ~100 ms
let end = performance.now();
console.log(~~(end - start));
start = performance.now();
const result_slow = points.map(p => geos.GEOSPreparedContains(prepared_polygon, p) ? true : false);
// takes ~2080 ms
end = performance.now();
console.log(~~(end - start)); |
The doc should say that |
Personally I'm opposed to notice messages, since they also may never be noticed, in headless operation. It seems to me that documentation is enough. You need to know how the API works in order to use it effectively. |
This is because the algorithm(s) for prepared geometries are not designed to optimize GeometryCollection inputs. Instead, the call is simply passed on to the regular |
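A caller-side workaround along the lines discussed in this thread could be sketched as follows. This is only an illustration that operates on plain GeoJSON objects before they are handed to GEOS. `collapsePolygonalCollection` is a hypothetical helper name, not a GEOS function, and it does not verify that the resulting MultiPolygon is valid (overlapping members would make it invalid, which is the complexity mentioned above).

```javascript
// Hypothetical helper, not part of GEOS or of any GeoJSON library.
// If every member of a GeoJSON GeometryCollection is a Polygon or a
// MultiPolygon, return an equivalent MultiPolygon object; otherwise
// return the input unchanged. No validity check is performed.
function collapsePolygonalCollection(geometry) {
  if (geometry.type !== 'GeometryCollection') return geometry;

  const members = geometry.geometries;
  const allPolygonal =
    members.length > 0 &&
    members.every(g => g.type === 'Polygon' || g.type === 'MultiPolygon');
  if (!allPolygonal) return geometry;

  // A Polygon contributes one list of rings; a MultiPolygon contributes several.
  const coordinates = members.flatMap(g =>
    g.type === 'Polygon' ? [g.coordinates] : g.coordinates
  );
  return { type: 'MultiPolygon', coordinates };
}
```

The returned object could then go through the same conversion and `GEOSPrepare` call as before. Whether that yields a speed-up depends on the GEOS version in use.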
I think the two methods are not equivalent, for the reason you mentioned above:
With the looping method above you can avoid computing topology on the entire collection and return true as soon as you find a containing polygon. |
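The early-exit behaviour of the looping approach can be illustrated without GEOS at all. The sketch below uses a simple even-odd ray-casting test in plain JavaScript. The function names are made up for this example, boundary points are not handled according to OGC semantics, and this is not how GEOS evaluates `contains` internally.

```javascript
// Even-odd ray casting against a single linear ring (array of [x, y] pairs).
function pointInRing([px, py], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray from the point towards +x cross this edge?
    const crosses =
      (yi > py) !== (yj > py) &&
      px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// rings[0] is the shell, any further rings are holes (GeoJSON convention).
function pointInPolygon(point, rings) {
  if (!pointInRing(point, rings[0])) return false;
  return !rings.slice(1).some(hole => pointInRing(point, hole));
}

// Stop at the first polygon that contains the point and report how many
// polygons had to be tested to reach the answer.
function containedByAny(point, polygons) {
  let tested = 0;
  for (const rings of polygons) {
    tested++;
    if (pointInPolygon(point, rings)) return { contained: true, tested };
  }
  return { contained: false, tested };
}
```

A point inside the first polygon is answered after a single test, while a point outside every polygon still requires testing all of them, so the benefit depends on where the points fall.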
I guess another potential implementation is to have |
Hi @dr-jts, Thanks a lot for the great work, especially locationtech/jts#1052! I re-ran the benchmark with updated dependencies and now the performance improved drastically:
* Latest available Shapely build 2.1.0.dev0+140.gac708af includes GEOS 3.12.2 and therefore doesn't show the performance gains from locationtech/jts#1052 yet.
Benchmark was performed using GEOS 3.13.0 (C-API and JS-WASM), Turf.js v7.1.0 and geo v0.29.2 on a 14" MacBook Pro M1. Previous benchmark used GEOS 3.12.1, Turf.js v6.5.0 and geo 0.27.0 on the same machine.
Benchmark repository: https://github.com/chrispahm/contains-benchmark?tab=readme-ov-file#benchmarking-various-implementations-of-the-geospatial-contains-function
Closing this as resolved! |
Great, thanks for confirming that things are getting better! |
While working on a WebAssembly build of GEOS, I encountered a performance issue with GEOSPreparedContains while benchmarking various implementations of the geospatial contains function.
I compared the performance of GEOS (using the C-API) with other geospatial libraries, such as Turf.js (JS), geo (Rust), and vectorized Shapely (Python). I used the small scale Natural Earth 1:110m land dataset as the polygon, and a set of 1000 random points to test against the polygon. I measured the time to perform the point-in-polygon test for all points, excluding the time to load the dataset and create the geometries.
The results are shown in the table below:
point-in-polygon test
(Tested on a 2021 14" MacBook Pro M1; the source code for each test can be found in this repo: https://github.com/chrispahm/contains-benchmark/tree/main/src )
As you can see, all GEOS backed implementations (Wasm, Shapely, C) performed the task significantly slower than the other libraries Turf.js and geo. This seems relatively strange, given that the test is relatively simple (127 polygons against 1000 points). I was wondering if there's a known reason for this performance gap, or maybe some trick that could be used to speed it up?
I also tried to use the GEOSPreparedContainsXY function instead of GEOSPreparedContains, as suggested in this issue, but it did not make a measurable difference in this case.
I am not trying to blame or criticize anyone with this issue, but rather to provide some constructive feedback and hopefully help to improve the library. I understand that GEOS is a complex and mature project, and that there may be trade-offs or limitations that I am not aware of. I also acknowledge that this benchmark is not comprehensive or representative of all use cases of GEOS. I just found it difficult to believe that a pure JavaScript library performs this task faster than compiled C 😊
I am happy to provide more details or code samples if needed!