Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRS for vector data #235

Closed
soxofaan opened this issue Apr 21, 2021 · 9 comments
Closed

CRS for vector data #235

soxofaan opened this issue Apr 21, 2021 · 9 comments
Labels
Milestone

Comments

@soxofaan
Copy link
Member

soxofaan commented Apr 21, 2021

We have a couple of core processes that take a "geojson" object:

  • aggregate_spatial (and proposals/aggregate_spatial_binary)
  • load_collection
  • filter_spatial
  • mask_polygon

According to the official GeoJSON spec RFC7946 the coordinate reference system is always WGS84.
But it turns out that a lot of users also use GeoJSON with geometries defined in other CRSes. Older versions of the GeoJSON spec allow to specify the CRS in the GeoJSON object (see https://github.com/geojson/geojson-spec/blob/master/geojson-spec-draft.rst#coordinate-reference-system-objects).

It would be beneficial for the usability of openEO to make it possible to pass a non-WGS84 GeoJSON object, that is properly handled by client libraries and backend drivers.

possible solutions:

  • define that the "geojson" subtype supports the GeoJSON-2008 style extension to define the CRS
  • add arguments to the processes above to specify the CRS of the GeoJSON data

references: #53

@m-mohr
Copy link
Member

m-mohr commented Apr 21, 2021

Context: Vector support is limited right now in openEO due to the fact that we did not have the time and use-cases in openEO H2020 to define vector data cubes and define the corresponding processes (see #68). Once we have defined those, the vector data cubes can load file formats that support CRS and the vector data cubes also will have support for CRS.

I doubt it's a good idea to use the old draft of GeoJSON due to the missing tooling support. Also, OGC is working on a GeoJSON derived standard that supports CRS. Once that is available, it would be the more logical choice.

Until then, why don't you pass a WGS84 GeoJSON? Do you have proof that "a lot of users also use GeoJSON with geometries defined in other CRSes"? I haven't seen that yet.
I'd even argue that OGC API - Features by default only supports WGS84 is an indicator that WGS84 support by default should be sufficient for most use-cases and all others can be handled with the additional vector support in openEO. Of course, #68 then needs to be prioritized.

@m-mohr m-mohr added the vector label Apr 21, 2021
@jdries
Copy link
Contributor

jdries commented Apr 21, 2021

Yes we have proof, otherwise we wouldn't bother about this. The unfortunate fact is that agriculture ministries for some reason encode their vector data (mostly cap parcel registrations) in local coordinate systems. Having to teach our users about proper reprojection to EPSG:4326, and corresponding axis order pitfalls, is a big pain.
That's why adding this simple support for specifying a CRS is the preferred option right now.
We're not trying to argue against the fact that WGS84 by default is a good choice, but that evolution should start in the systems that store and manage vector data. In our openEO backend, we just try to support anything that a user might have, to live up to our promise of 'simplicity'.

By the way, there is also an actual reason why users work in different projections: in local coordinate systems, that work in meters, vector processing systems can for instance apply a buffer with a distance of '10 meters'. Similarly, users can compute an area, and get a result in meters². These things are not uncommon in a vector preprocessing chain before the polygons are sent to openEO. So there as well, having to reproject to lat lon only because the file format decided to drop support for CRS's is annoying for our users.

@m-mohr
Copy link
Member

m-mohr commented Apr 21, 2021

The unfortunate fact is that agriculture ministries for some reason encode their vector data (mostly cap parcel registrations) in local coordinate systems.

How do they encode this? Using the old GeoJSON draft?

Having to teach our users about proper reprojection to EPSG:4326, and corresponding axis order pitfalls, is a big pain.
[...]
These things are not uncommon in a vector preprocessing chain before the polygons are sent to openEO.

So the reprojection could also be handled on the client-side? I.e. accept a non-standardized GeoJSON and convert it to fulfill the API requirements?

Overall, it seems better to start working on #68 instead of pushing something non-standardized through the process definitions into all back-ends. Once we define the proposal in the processes, all back-ends would need to figure out how to work with non-standardized GeoJSON, for which there are not a lot of tools available.

@jdries
Copy link
Contributor

jdries commented Apr 21, 2021

A lot of data still comes in as shp files.

Client side conversion can be done in cases where the user is skilled enough, and can install the required dependencies. It's not something I want to enforce by default.

The tooling support is not an issue, adding the crs info can be done like this:
https://github.com/Open-EO/openeo-python-client/blob/master/openeo/rest/datacube.py#L1143
And basically the same for getting it back out again.

The other option is that we simply do the 'non-standard' thing in the vito backend, and then other backends can follow when their users are complaining as well. I believe we just wanted to be correct, and highlight that we're deviating from the api here.

@soxofaan
Copy link
Member Author

Another thing about the client side conversion: it's feasible to do automatic conversion in the openeo python client library from the related DataCube methods (aggregate_spatial, mask_polygon, ...), but when you start working with UDPs, it becomes less clear where the automatic conversion should happen.
And with UDPs you also have the use case that the user of the UDP is not using a real openeo client library in the first place (which is one of the selling points of UDPs)

@soxofaan
Copy link
Member Author

about @jdries 's note

and can install the required dependencies. It's not something I want to enforce by default.

I don't now the all GIS libraries for reprojection in python, but the most common one, pyproj is not a pure python one, so installation of it might involve compilation which can be challenging on some platforms (like windows), so we are hesitant to start depending on that one

@m-mohr
Copy link
Member

m-mohr commented Apr 21, 2021

Telco: VITO uses the proprietary API for now and can deprecate later. We'll start working on vector data cubes in an iterative way, just getting the first bits and pieces done now, like loading them from e.g. a shapefile and allowing vector data cubes in other existing processes. We can add additional processes later. We may also allow OGC's new CRS-enabled GeoJSON at some point. Is this summarized correctly, @jdries ?

@m-mohr m-mohr changed the title GeoJSON CRS CRS for vector data Apr 21, 2021
@m-mohr m-mohr added this to the future milestone Apr 21, 2021
@soxofaan
Copy link
Member Author

That is good summary of action points I think

VITO uses the proprietary API for now and can deprecate later

Note that this is not only about proprietary support in VITO backend, but also some features in python client library (e.g. a srs argument in the DataCube.mask_polygon method: https://github.com/Open-EO/openeo-python-client/blob/451695f93bd49596f0ef517306a947e61ef09176/openeo/rest/datacube.py#L1113-L1116) . I'll try to flag these features more explicitly as non-standard/experimental.

soxofaan added a commit to Open-EO/openeo-python-client that referenced this issue Apr 22, 2021
…egate_spatial` and `mask_polygon`

Harmonize geometry handling in `aggregate_spatial` and `mask_polygon`, including handling of non-standard crs and whitelisting of allowed GeoJSON types
Raises warning when using a CRS for the geometry of `aggregate_spatial` and `mask_polygon`
Also mark this more clearly in the docs

Related: #202, Open-EO/openeo-processes#235, Open-EO/openeo-processes#53
@m-mohr
Copy link
Member

m-mohr commented Mar 8, 2023

I think we can wrap up this issue, but please leave a comment if you think otherwise.

  • Vector data cube is available in openeo-processes (draft, for 2.0) and the geometries dimension could have any CRS
  • GeoJSON input is always WGS84 and has to be loaded through load_geojson Streamline GeoJSON import #346 #412, but we can add explicit CRS support for OGC FGJSON or back-ends can choose to also accept old GeoJSON drafts with CRS support.
  • If a user for examples loads a shapefile through load_uploaded_files, the CRS can simply be used in the vector data cube.

@m-mohr m-mohr closed this as completed Mar 8, 2023
@m-mohr m-mohr modified the milestones: future, 2.0.0 Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants