Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DE-9IM Proposal #515

Closed
michaelkirk opened this issue Oct 1, 2020 · 19 comments
Closed

DE-9IM Proposal #515

michaelkirk opened this issue Oct 1, 2020 · 19 comments

Comments

@michaelkirk
Copy link
Member

michaelkirk commented Oct 1, 2020

per #513 I've written up a proposal about an addition I'd like to make to geo - introducing a DE-9IM module, and using it to help drive and correct some of our relation traits: e.g. Contains, Intersects. It would also make it much easier to add future traits like Disjoint, Touches, Crosses, etc.

Read the full proposal here:
https://gist.github.com/michaelkirk/3ac4afe651fe24ad229dec0b56ec8f9d

(a gist allows easier editing than updating issue text in situ).

Please take a look! I spent some time adding context, in hope that it would be accessible to you even if you don't know what DE-9IM is.

The idea is that after getting a little feedback on the proposal, I'll add some specific issues to this tracking issue.

@urschrei
Copy link
Member

urschrei commented Oct 1, 2020

This is great. For me, the main motivations are (in no order)

  1. Correctness of our spatial relations (which is only possible with a coherent relation model, which AFAIK can only be DE9IM);
  2. The ability to add the remaining spatial relations in something other than an ad-hoc way (which is what we've been doing for years, and which is fine, but…);
  3. The ability to confidently interop with GEOS (thus PostGIS, QGIS, and Shapely), and JTS.

With number 3 probably being the biggest deal both in terms of "maturity" of the crate, and number 1 for laying the foundations for optimisation work – up until recently there were robustness issues, and no relational model. Once we have both we can confidently try to make things fast (if we want to).

@rmanoka
Copy link
Contributor

rmanoka commented Oct 2, 2020

Great proposal @michaelkirk; would indeed be awesome to get all the predicate traits available here! Also, love the concise unicode symbol! In the interest of helping you out on this rather substantial effort (hope you don't mind :) ), here is my suggested breakdown of the general effort into as many independent pieces as I can imagine. I'm just thinking aloud, and completely open to suggestions, or counter-arguments!

  1. Geomgraph. Probably the crux of the new functionalities. Given its usefulness for above traits, I feel it should be useful for other use-cases too. I suggest we make this a public data-structure that can be built and used by the end-user too. We could brainstorm a bit on how we model this in our crate, to best re-use computations.

  2. Tests and Benchmarks. Very useful as we proceed to aim for correctness, efficiency and inter-op with other libs. Are there any comprehensive suites available (say in geos / JTS)?

  3. Definitions. Formally define the semantics of each of the geometry types and traits. The traits are fairly well defined in terms of the 👖-matrix. In the past few PRs, we have added the defn. of many of the geometry types too. Would be good to round-up the definitions for all the types, edge cases, and esp. the differences from other libs.

  4. 👖-Matrix. Your proposal pretty much nails the requirements here. I suppose we leave the existing adhoc implementations as long as they're provably correct?

@urschrei
Copy link
Member

urschrei commented Oct 2, 2020

Geomgraph. Probably the crux of the new functionalities. Given its usefulness for above traits, I feel it should be useful for other use-cases too. I suggest we make this a public data-structure that can be built and used by the end-user too. We could brainstorm a bit on how we model this in our crate, to best re-use computations.

Completely agree, and would love to do this collaboratively.

Tests and Benchmarks. Very useful as we proceed to aim for correctness, efficiency and inter-op with other libs. Are there any comprehensive suites available (say in geos / JTS)?

I don't know about JTS, but I know GEOS has an extremely comprehensive test suite here.

@rmanoka
Copy link
Contributor

rmanoka commented Oct 3, 2020

I took a shot at item 3 above (defns. and differences of types) in #516. There were only minor differences between our types and the OGC-SFA specs afaiu.

  1. The specs support a "empty geometry" (eg. wkt: "POINT EMPTY"). I suppose this along the rust paradigm of not supporting a "null pointer" unless user explicitly uses a Option<T> type. Doesn't seem to affect interop. / correctness.

  2. The specs and other libs support a separate LinearRing, but we overload that in LineString. This seems okay as it is just defined as a closed and simple LineString, and we don't check validity for now. Have added some docs highlighting this fact. Again, doesn't seem to affect interop. or correctness.

  3. The specs say in passing that all the traits must also work for a generic GeometryCollection but do not really specify the boundary, and the interior. In contrast, for MultiLineString it defines a "odd-even" rule that magically converts boundary into interior points sometimes. I suppose the sane thing would be group the constituents by dimension (so all 0-d into one MultiPoint, all 1-d into one MultiLineString, etc), and define the interior and boundary as the respective unions. We don't yet have any impls for this type, so we can study other libs, and understand this better. Please let me know if you find any formal defn. for this type.

@michaelkirk
Copy link
Member Author

In the interest of helping you out on this rather substantial effort (hope you don't mind :))

Thanks for getting the ball rolling @rmanoka - I don't mind help at all! Despite my making the pitch and writing the proposal, TBH I don't even particularly want to write this code, I just want it to exist. 😂

One challenge of project management for volunteer efforts like this is, unlike a 9-5, we don't have insight into how much time/energy each other have. I know I'm guilty of just disappearing for long spans of time.

@rmanoka and @urschrei: both of you have expressed some interest in working on this - do either of you have substantial time to dedicate to it in the next two weeks? If so, I'd be happy to have you do as much as you want, (including driving the feature to completion!). But otherwise, I think it makes sense to take an approach that can benefit from your expertise, but doesn't rely on your availability for the next couple weeks.

@rmanoka
Copy link
Contributor

rmanoka commented Oct 4, 2020

@michaelkirk Not sure of substantial help with the geomgraph implementation, as I'm not familiar with the algorithm (any references?). I should be able to discuss and help with the overall design of the module though.

I should be able to round-up the existing adhoc Intersects implementation in #516 in the coming week or two. Intersects is a particularly "easy" trait, as there is no distinction between boundary and interior there, so not many complicated edge cases.

I'm hoping to identify the cases in Intersects that would need geomgraph; those can be immediate use-cases once we have the module.

@urschrei are these the geos test suite you're referring to? The XML tests do seem promising to port as a test suite here. I couldn't figure which of those were specifically about Intersects or Contains.

@michaelkirk
Copy link
Member Author

@michaelkirk Not sure of substantial help with the geomgraph implementation, as I'm not familiar with the algorithm (any references?). I should be able to discuss and help with the overall design of the module though.

I don't have anything other than the JTS implementation: https://github.com/locationtech/jts/blob/master/modules/core/src/main/java/org/locationtech/jts/geomgraph/GeometryGraph.java

@michaelkirk michaelkirk changed the title DE-9IM Proposal / Tracking Issue DE-9IM Proposal Oct 7, 2020
@michaelkirk
Copy link
Member Author

Rather than using this as a tracking issue, I'm trying out GH's project feature to list out the sub-tasks. I've added some cards to a newly created DE-9IM project: https://github.com/georust/geo/projects/1

I've got a couple of milestones, represented as separate columns of cards. A POC, which represents the bare minimum work to get a relation matrix, and a second followup column for everything else. Within the columns, I've tried to order the sequence of work from top to bottom, so please be mindful of that as you add new cards/issues to the project. It might make sense to further refine this, but this is as far as seemed useful to me for now.

I've never used GH projects before and reserve the right to abandon it if it's annoying. 😛

@rmanoka
Copy link
Contributor

rmanoka commented Feb 13, 2021

@michaelkirk I wanted to circle back to this and see if you needed a hand in this. I'm specifically interested in bringing in your work on porting the XMl testsuite. For now, I want to be able to run the tests on Intersects and ConvexHull traits, and see if they trigger any bugs.

Also, is the GeomGraph you are working on related to the plane-sweep data-structure of the Martinez-Rueda and related algorithms (used by geo-booleanop crate)? Intuitively, it seems like boolean-op is a harder question of not just checking for intersection or containment, but actually computing the containment itself. Thus, their data-structures should somehow be amenable for the DE9IM predicates too. But, there is a lot of details I don't fully understand yet; just began pouring through the relevant papers, so my current thoughts may be silly. Pls. let me know if you have any thoughts on this.

@michaelkirk
Copy link
Member Author

Hi @rmanoka - I am still making (slow) progress on this. In fact, the releases of geo-types and wkt this week were in pursuit of this very thing, integrating the JTS test cases, which embed WKT into XML test files.

For months I've been kind of re-working it to so that I can cleave off functionality into more manageable PR's. The current diff is 4k-5k (plus a bazillion lines of XML tests).

I'd like to continue reducing that diff, but it's starting to be diminishing returns. I think extracting the test runner could be useful outside of the context of the geometry graph, and I could open a PR attempting that this week. Would that work for you?

Also, is the GeomGraph you are working on related to the plane-sweep data-structure of the Martinez-Rueda and related algorithms (used by geo-booleanop crate)?

I'm not sure! I haven't read many GIS papers, and readily believe that there are exciting and better ways to do things, but so far I have been approaching this from the perspective of "No one ever got fired for directly porting JTS".

Here's what I know: For efficiently computing edge intersection sets, JTS uses a monotone chain sweep algorithm. But for now, I've not even implemented that MC line sweeper. Instead, I've implemented a slower, more easy to debug one, based on JTS's "SimpleEdgeSetIntersector".

My current plan is to open a PR which is more or less a straight port of the easiest subset of JTS necessary to pass the test suite, and have that test suite lay the groundwork for iterating on a faster/better version down the road.

My personal motivation in this has primarily been:

  1. fix some wrong results we're currently giving
  2. expand our API
  3. have an easy way to verify standards conformance of our API (i.e. outputting a denim matrix)

I'd be very pleased to have someone else working on it at some point, or replace my code with something entirely better, provided the tests continue to pass.

@michaelkirk
Copy link
Member Author

BTW, here's where I'm running the JTS "relate" tests in my experimental branch: https://github.com/georust/geo/blob/mkirk/geomgraph/geo/src/geomgraph/test/mod.rs#L203

@rmanoka
Copy link
Contributor

rmanoka commented Feb 14, 2021

I'd like to continue reducing that diff, but it's starting to be diminishing returns. I think extracting the test runner could be useful outside of the context of the geometry graph, and I could open a PR attempting that this week. Would that work for you?

That would be fantastic. I too think the tests have independent utility, and would be useful to verify our algos. Given the tests are ~15mb, while our code base is like 2mb, we could even make it a separate crate.

Here's what I know: For efficiently computing edge intersection sets, JTS uses a monotone chain sweep algorithm. But for now, I've not even implemented that MC line sweeper. Instead, I've implemented a slower, more easy to debug one, based on JTS's "SimpleEdgeSetIntersector".

Thanks, that's quite informative. In comparison, the Martinez-Rueda family of algorithms use Bentley-Ottman or similar as a sub-routine.

My motivations were to mainly get a unifying understanding of the working of both geo-boolean-op, and the JTS predicates (which may be considered geo-boolean-predicates). The hope is to also help solve bugs / improve both these.

My current plan is to open a PR which is more or less a straight port of the easiest subset of JTS necessary to pass the test suite, and have that test suite lay the groundwork for iterating on a faster/better version down the road.

That sounds like a good plan indeed. Thanks for working on this major effort!

@urschrei
Copy link
Member

@rmanoka Bentley-Ottmann has come up before (#80, #81), along with Greiner-Hormann – it has inevitably been the case that someone has an idea and then gets bogged down in the complexity of writing the algo from scratch (I include myself among the guilty, to be clear). We are in much better shape now than we were three years ago though, thanks to you and @michaelkirk, so maybe we can get there this time.

@rmanoka
Copy link
Contributor

rmanoka commented Feb 15, 2021

Bentley-Ottmann sweep got me thinking about how we plan to support specializations. For instance, our LineString - LineString intersection logic naively checks every pair of segment, with a complexity of O(n^2); the sweep would reduce this to O(n log n). However, sweep requires the coordinates to be float, so as to interpolate within segments.

Unfortunately, an impl<T: Float> Intersection<LineString<T>> and a general impl<T> Intersection<LineString<T>> cannot coexist at the moment. The specialization language feature would allow this, but may not be in stable rust anytime soon (aside: there seems to be a "min_specialization" that may be sufficient, but am unable to find any reference to plans to stabilize it either).

On the other hand, it also seems unfair to not provide intersection checks between say LineString<{integer}>s. This is a specific example, but the general question of specialization (pun unintended) arises elsewhere too. Do we have guidelines / RFC for approaching this? My gut feeling is that some associated consts in the Kernel trait could support this, but it would be great to discuss further. Any thoughts?

@michaelkirk
Copy link
Member Author

Do we have guidelines / RFC for approaching this? My gut feeling is that some associated consts in the Kernel trait could support this, but it would be great to discuss further. Any thoughts?

I don't know of any formal RFC's for this.

On the other hand, it also seems unfair to not provide intersection checks between say LineString<{integer}>s

I haven't personally ever used the integer geometry stuff. I can imagine use cases exist, so I think handling integer types is nice. Rust is such that we can often support diverse use cases without negative runtime costs for the end user.

...however if it comes with much complexity (development, maintenance, or most important of all, accessibility/usability), I think it would be a mistake to let support for integer types drive development.

Said differently, I think we should only support integer geometries as much as is possible without negatively impacting people using float geometries. If this means people using integer geometries wholesale miss out on some functionality, I'm personally OK with that, and prefer losing integer functionality to the alternative of the library being less usable by the overwhelming majority of people trying to get things done.

But if we can support integers without detracting from floats, sure, why not?

@rmanoka
Copy link
Contributor

rmanoka commented Feb 17, 2021

I recently gained more respect for integer geoms. from this C++ clipping / boolean ops library. It seems to be one of the fastest, and accurate impls, and uses only integer coords. I think integer coords is quite useful for very accurate / predictable constructions as the rounding is easier to reason with compared to floats.

I agree we shouldn't need to go out of the way, unless there's a good cause for it. Typically, the integer implementation is the naive / straightforward one, and float would require more thought into handling edge-cases (eg Integer vs Float kernel).

@michaelkirk
Copy link
Member Author

michaelkirk commented Feb 17, 2021

I recently gained more respect for integer geoms. from this C++ clipping / boolean ops library.

Very interesting, thanks for sharing!

@rmanoka
Copy link
Contributor

rmanoka commented Mar 2, 2021

I made an exposition of the Bentley-Ottman algorithm here. The aim was to understand the many corner cases involved, and to figure out a design so it is usable for predicates, operations and possibly other use-cases. It's still a WIP, but sharing it in case anyone else has been thinking about these details.

bors bot added a commit that referenced this issue Apr 13, 2021
639: Introduce the geomgraph module for DE-9IM Relate trait r=michaelkirk a=michaelkirk

- [x] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.
---

Fixes #513, #515

(I'm sorry it's so large)

~~I'm going to leave it as a draft (edit: 🤦 I failed to actually open the PR as a draft) while I wait to merge #636 and #638 and then do some rebasing, but I don't anticipate doing other large changes before review.~~ *update: ready for review!*

Here's some of the earlier work in pursuit of this:

#514
#516
#523
#524 
#538
#552
#561
#611
#628
#629
#636 

Primarily, this introduces the geomgraph module for a DE-9IM `Relate` trait.

geomgraph implements a topology graph largely inspired by JTS's module of the same name:
https://github.com/locationtech/jts/tree/jts-1.18.1/modules/core/src/main/java/org/locationtech/jts/geomgraph

You can see some of the reference code if you omit the "REMOVE JTS COMMENTS" commit. In some places the implementation is quite close to the JTS source. 

The overall "flow" is pretty similar to that of JTS, but in the small, there were some divergences. It's not easy (or desirable) to literally translate a Java codebase making heavy use of inheritance and pointers to rust. Additionally, I chose to take advantage of `Option` and rust's enums with associated data to make some case analysis more explicit.

There is a corresponding PR in our [jts-test-runner](georust/jts-test-runner#6) crate which includes the bulk of the tests for the new Relate trait.

## Algorithm Overview 

This functionality is accessed on geometries, via the `Relate` trait, e.g. `line.relate(point)` which returns a DE-9IM [`IntersectionMatrix`](https://en.wikipedia.org/wiki/DE-9IM#Matrix_model).

The `Relate` trait is driven by the `RelateOperation`. The `RelateOperation` builds a `GeometryGraph` for each of the two geometries being related. 

A `GeometryGraph` is a systematic way to organize the "interesting" parts of a geometry's structure - e.g. where its vertices, lines, and areas lie relative to one another.

Once the `RelateOperation` has built the two `GeometryGraph`s, it uses them to efficiently compare the two Geometries's structures, outputting the `IntesectionMatrix`.





Co-authored-by: Michael Kirk <[email protected]>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
bors bot added a commit that referenced this issue Apr 13, 2021
639: Introduce the geomgraph module for DE-9IM Relate trait r=frewsxcv,rmanoka a=michaelkirk

- [x] I agree to follow the project's [code of conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `CHANGES.md` if knowledge of this change could be valuable to users.
---

Fixes #513, #515

(I'm sorry it's so large)

~~I'm going to leave it as a draft (edit: 🤦 I failed to actually open the PR as a draft) while I wait to merge #636 and #638 and then do some rebasing, but I don't anticipate doing other large changes before review.~~ *update: ready for review!*

Here's some of the earlier work in pursuit of this:

#514
#516
#523
#524 
#538
#552
#561
#611
#628
#629
#636 

Primarily, this introduces the geomgraph module for a DE-9IM `Relate` trait.

geomgraph implements a topology graph largely inspired by JTS's module of the same name:
https://github.com/locationtech/jts/tree/jts-1.18.1/modules/core/src/main/java/org/locationtech/jts/geomgraph

You can see some of the reference code if you omit the "REMOVE JTS COMMENTS" commit. In some places the implementation is quite close to the JTS source. 

The overall "flow" is pretty similar to that of JTS, but in the small, there were some divergences. It's not easy (or desirable) to literally translate a Java codebase making heavy use of inheritance and pointers to rust. Additionally, I chose to take advantage of `Option` and rust's enums with associated data to make some case analysis more explicit.

There is a corresponding PR in our [jts-test-runner](georust/jts-test-runner#6) crate which includes the bulk of the tests for the new Relate trait.

## Algorithm Overview 

This functionality is accessed on geometries, via the `Relate` trait, e.g. `line.relate(point)` which returns a DE-9IM [`IntersectionMatrix`](https://en.wikipedia.org/wiki/DE-9IM#Matrix_model).

The `Relate` trait is driven by the `RelateOperation`. The `RelateOperation` builds a `GeometryGraph` for each of the two geometries being related. 

A `GeometryGraph` is a systematic way to organize the "interesting" parts of a geometry's structure - e.g. where its vertices, lines, and areas lie relative to one another.

Once the `RelateOperation` has built the two `GeometryGraph`s, it uses them to efficiently compare the two Geometries's structures, outputting the `IntesectionMatrix`.





Co-authored-by: Michael Kirk <[email protected]>
Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com>
@michaelkirk
Copy link
Member Author

I think this can be closed since #639 was merged long ago.

There's some follow up work in other issues, but I don't think there's anything further to discuss about the "de-9im" proposal at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants