Default execution mode should be asynchronous. #413
@pvretano UNSAFE play? Disagreeing with this, for the reasons we discussed at length previously. If I recall correctly, the main reason we opted for synchronous by default is the fact that the client requests asynchronous execution by providing a `Prefer: respond-async` header. The RFC for the `Prefer:` header does not include a mechanism to indicate sync execution (there is only `respond-async`). Changing how this works would definitely be a major breaking change between the two versions, which I hope we avoid, or it will really be a transition nightmare. The OGC API - Processes async mode is significantly more complicated to implement from both the server and the client perspective. From the server side, it requires implementing a complex queuing mechanism, results storage and persistence, etc. Sync execution, on the other hand, is super simple: it is handling / making a POST request stating what the client wishes to execute and handling the response. It is much more similar to the typical OGC API GET requests. I believe it's actually easier to implement a relatively safe sync execution than the async equivalent. An async service can be implemented as an additional layer on top of a simple sync service. Async / batch processing is still way overrated in my opinion. The sync mode is the simple thing. I hope we can keep the default the simple way :)
@jerstlouis requirement 26, part C covers the case where NO Prefer header is specified AND the process can be executed either sync or async. In this specific case I think the better default is async. Otherwise you risk a long-running process timing out the HTTP connection.
@pvretano The reason we had requirement 26 C:
which is equivalent to 25 C in the approved/published 1.0 version is because it would otherwise be impossible for a client to tell the server "I want to execute this process synchronously". Omitting the Prefer: header is how the processing clients can currently do that with the approved standard. Without this, it would be impossible to write synchronous-only execution clients that can work with servers that also offer optional async support. Synchronous-only clients are much easier to write and get working, from past code sprints and testbed experience.
In my opinion, this is a non-issue. The server can decide how long it wants to keep processing something. At any point, both ends can decide to give up and interrupt the connection. This is actually a good thing in terms of avoiding hogging resources from the server standpoint, because the server actually knows whether there is an actual client still patiently waiting for a response, and can easily limit the number of active connections from a particular client (which is much simpler than managing a processing queue, and can be done with readily-available tools like an Apache server proxy). With async, a malicious client could just decide to queue up tons of processing requests and never care about them. Interesting related comment on Reddit:
If a process is inherently always going to take a long time and the server doesn't want to have a long time out, then it can choose to only offer the async option. If a process sometimes takes a small amount of time and sometimes a very long amount of time based on the execution request, then the server can estimate how long the processing would take, and refuse to process it synchronously in the case it takes too long, with a 400 error that says to the client:
I believe this is the best solution to address your concern. |
SWG meeting from 2024-05-27: We keep it as it is (synchronous), as otherwise there would be no way for a client to tell servers that synchronous execution is requested. |
@pvretano @jerstlouis @bpross-52n Sorry I could not be part of today's meeting, but I strongly disagree with this decision and do not think this issue should be closed. Previous iterations of the standard had a lot of considerations about letting the server decide what is appropriate between sync/async (at some point, even the order of the supported execution modes in the process description defined the default). Sure, the synchronous implementation might seem easier to implement from the server/client side when dealing with "simple" processing like NDVI or applying an AoI/ToI/RoI filter to a collection, but it makes absolutely NO sense when the processing is much longer, or when handling limited-resource/high-demand situations, since servers can close the HTTP connection due to timeouts or need to wait for the resource to become available in a queuing system. In such cases, the async operation is much more appropriate. I could argue in my typical use cases that async is easier to implement, since a monitoring Job is always desired... and leaving a connection open indefinitely is a misuse of resources, which most servers will avoid by defaulting to a timeout of roughly 5–60 s according to use case. Note that I am not advocating for async by default either. I believe this flexibility of execution mode is an advantage of the processes API. I would also like to remind that, according to […]. I also disagree with the statement:
If a server allows it, it is perfectly valid to indicate
Sometimes, the operations are not themselves long, but the resources to compute them are insufficient to meet the demand. Sometimes, there is simply no way to estimate that duration, as it depends on external factors the server does not have access to, such as a deployed process the server cannot know how long it will take to complete. In some cases, the server still wants to respect the submission order of the requests, and not depend only on luck for whether an execution request will succeed. Believing that async is used only for long-running jobs is a vast oversimplification of the use cases.
I would like to make sure OGC API - Processes does not evolve according to this kind of mentality. If the operation is "so simple" that it can be performed by a simple GET request for a one-off operation, maybe it is an indication that a dedicated endpoint for that operation should be something other than OGC API - Processes, since it does not really justify the whole Process Description overhead. IMO, involving a Process Description and potentially the creation of a Job to monitor the operation implies that it is "complicated enough" to need its own standalone definition rather than a generic OpenAPI endpoint schema. In many cases, these "complicated processes" could run with particular requirements that perfectly justify async, but they could also support sync execution if the required resources were available at the time the request was submitted. OGC API - Processes should never disregard these use cases, especially with the increasing demand for AI/ML algorithms involving ever more complex operations, varying demands, and resource requirements.
Thanks a lot for engaging in this discussion. I enjoy our thoughtful discussions, even though others might find them too long to read ;)
This is where I disagree, because I believe there is value in developers being able to quickly put together an OGC API - Processes client that supports only sync, for those servers/processes that do support sync only requests.
The difference is that the server is not obligated to respond sync in this case, unlike when the
The way we kind of avoided deviating from the RFC in Processes 1.0, is that when the client does not use the Prefer: header, the RFC is not involved, and the default (if the process supports sync execution) is synchronous execution.
The way I would suggest to address this is an explicit permission for the server to refuse synchronous execution (execution request submitted without a `Prefer: respond-async` header).
The idea with collection output (which is very much in line with the concept of GeoDataCubes) is that you can describe the workflow once, and the follow-on requests for a particular AoI/ToI/RoI are really just simple GET requests (e.g., using OGC API - Coverages, Tiles, DGGS, Features, Maps, EDR...). The description of the workflow can be of any level of complexity, and may combine several inputs and chains of other workflows/processes, whether local or distributed, so this does justify the use of OGC API - Processes. But the requests for partial results are very simple and potentially also very quick to process due to their limited ATRoI.
I hope we can prototype an example OpenAPI version of the process description as an alternative/complement to the OGC process description. Regardless of how complicated the process is, the inputs and outputs need to be described, and I think that is what both of those approaches do.
The need for job monitors is when we cannot avoid lengthy batch processes. |
Guys, you are killing me with these LONG, LONG comments! ;) This particular issue is about a client POSTing an execution request without an accompanying `Prefer` header. I was proposing that the server should respond ASYNC since, without knowing how long the process might take to run, ASYNC was the safer approach. @jerstlouis is proposing to leave it as is. @fmigneault, I think, is saying remove this requirement altogether and instead let the server ALWAYS decide the execution mode based on its internal knowledge of the process, the execution environment, the available resources, etc. The client can express a PREFERENCE via the `Prefer` header. So do I remove Req26? I'm leaning towards yes. This, of course, then begs the question ... do we need to bother with job control metadata in the process description at all?
@pvretano I really believe we need to keep the requirement as-is. I'm proposing to address @fmigneault's use case by adding a permission clarifying that the server MAY return a 503 if it's not able to handle the request synchronously right now due to limited resources, or a 413 if it's not able to because the client is asking to process too much synchronously, including a verbose hint that the client should re-submit the request with a `Prefer: respond-async` header.
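A minimal sketch of the permission proposed here, as a hypothetical server-side guard; the function name, thresholds, and messages are illustrative assumptions, not from the standard:

```python
# Hypothetical guard a server might run before executing a request
# synchronously. Thresholds and names are illustrative assumptions.

def refuse_sync_if_needed(estimated_seconds, active_sync_requests,
                          max_concurrent=5, sync_limit_seconds=30):
    """Return (HTTP status, detail) to refuse sync execution, or None to proceed.

    503 -> temporarily out of resources; 413 -> too much for sync execution.
    Both hint that the client should re-submit with Prefer: respond-async.
    """
    if active_sync_requests >= max_concurrent:
        return (503, "Server busy; re-submit with 'Prefer: respond-async' "
                     "or retry later.")
    if estimated_seconds > sync_limit_seconds:
        return (413, "Too much to process synchronously; re-submit with "
                     "'Prefer: respond-async'.")
    return None  # safe to execute synchronously
```

A server would call this before starting synchronous execution and fall through to normal processing when it returns `None`.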
@jerstlouis I think @fmigneault's proposal is simpler and pretty much what Req27 says right now. In the case where the process can run sync or async, the server picks and makes its decision known via the […]
As I said above:
This loses that, and breaks compatibility with 1.0.
I also still believe we should aim for as much compatibility with 1.0 as possible, given that so far everything is very, very close to full compatibility in terms of existing 1.0 clients being able to execute processes from 1.1/2.0 servers. |
@jerstlouis assuming we keep the job control options in the process description (see my previous question to the group) then you can still write such a simple OAProc client. This client simply needs to search for processes in the process list that can only run synchronously and ignore all the other ones. No? As for compatibility, nice to have but not 100% necessary since we are targeting 2.0 ... no?
Currently, Processes 1.0 also makes this possible for processes that support both sync & async; this change limits it to those that support only sync (and the server adding async support to those processes later will suddenly break those sync-only clients).
I was still hopeful that this could still be a 1.1 version, at least in terms of not breaking this existing compatibility, even if not reflected in the actual version number. |
Exactly as @pvretano described. Perfectly understood my thought process.
Is there any way in OGC API - Processes that I can enforce synchronous execution and, if it's not supported or too complex to run synchronously, it just returns an error? Similarly, is this possible for asynchronous execution? I'm thinking of two use cases here:
* Rapid web visualization via synchronous execution (e.g. for web mapping, which just doesn't work effectively with batch jobs).
* Creating large result sets, e.g. a result as a STAC catalog (I'd probably never want to parse that from a single HTTP response).
Two-part answer, depending on the version of the standard.
Before: Yes, completely, using `mode`.
Current: Not guaranteed, depending on the description. The same rule about mismatching […] applies. I have raised this concern on multiple occasions, across multiple issues. The only "real" way to enforce a mode currently, while respecting all standard revisions and HTTP simultaneously, is to only indicate a single mode in `jobControlOptions`.
To clarify, the "Before" version with "mode" that @fmigneault is referring to was pre-1.0, and the "Current" version is 1.0 (which already replaced the use of "mode" by the `Prefer` header). According to 1.0, when NOT including the `Prefer` header, the server SHALL respond synchronously if the process supports both modes. The reverse (both sync and async as an option for the process, with a `Prefer: respond-async` header) is only a preference that the server may choose not to honor.
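The 1.0 negotiation rules described above can be condensed into a small decision function. This is a sketch for illustration: the `jobControlOptions` values `sync-execute` and `async-execute` are from the standard, but the function itself is an assumption, not from any official implementation:

```python
# Sketch of OGC API - Processes 1.0 execution-mode negotiation:
# omitting the Prefer header REQUIRES sync when the process supports
# both modes (Req 26 C); respond-async is only a preference.

def negotiate_mode(job_control_options, prefer_async):
    supports_sync = "sync-execute" in job_control_options
    supports_async = "async-execute" in job_control_options
    if supports_sync and not supports_async:
        return "sync"    # only option the process offers
    if supports_async and not supports_sync:
        return "async"   # only option the process offers
    # Both supported: absence of Prefer: respond-async means synchronous.
    return "async" if prefer_async else "sync"
```

This captures why a sync-only client can simply omit the header, and why adding async support to a sync-only process does not change the no-header behavior under these rules.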
The issue is that the server has the option to ignore the client's preference, meaning that the behavior cannot be predetermined in all circumstances. This makes the implementation of async handling much harder for clients. It is fine to have the default be sync, since it is the simpler use case, but the async mode should be handled just as fairly and reliably, not be twice as hard to achieve. A "good reason" could be as simple as a server having a limited amount of resources to store jobs, for which it always tries to return the result synchronously when the execution is fast enough, to save space, but "is forced" to fall back to async when an input resource it must wait for cannot be ready in time for it to execute before the server timeout is reached. Since it could respond either way, the process description MUST indicate both modes in `jobControlOptions`. If a server was behaving this way, I could not blame the implementer for not following the standard, as they would technically be right. They would be allowed to ignore my preference. This makes it a bigger burden for my client integrating their server, as I must always be ready to deal with any possible outcome.
It sounds like the Prefer header is not the right solution for what is needed. Don't use it? Splitting the endpoints might be the better option, see #419 |
Yes, because it's only a recommendation, clients do technically need to be ready to handle sync responses as well, even when they submit a `Prefer: respond-async` header. Different end-points as @m-mohr suggested might have been a simpler solution side-stepping those problems, but possibly because some implementations also create jobs for synchronous execution, the SWG had not considered that at the time. Now I believe that the SWG mostly wants to avoid breaking changes as much as possible and finalize 1.1/2.0.
💯 agreed. This is a strong requirement. I only wish there was a way to "force" a certain mode for certain edge cases where it is critical that it is respected. In those few cases where sync/async is a "must" for whatever reason, I would actually prefer receiving an unprocessable request error or similar over generating a possibly long/heavy-resource execution that will not be handled accordingly. I've actually just come across this:
Hi,
See below a change proposal for asynchronous communication that we did in OGC Testbed 18 (Secure Asynchronous Catalog) : https://docs.ogc.org/per/22-018.html#toc28
I hope it helps.
OGC API-Notification Service
Propose Webhooks for an OGC API-Notification Service specification for other services besides Records which should work with the PubSub SWG.
8.1.3. Notification Content
The subscription prototype backend performs a periodic check as to whether the response corresponding to the resources-uri in the subscription object is identical or not to the original response. If not, then the same resources-uri is sent to the subscriber according to the delivery mechanism that was selected.
If the resources-uri corresponds to a single item (e.g., /items/{item-id}), then this means that the record content has changed. If the resources-uri corresponds to a search request (e.g., /items?…), then this means that the (first result page) of the response has changed.
A number of improvements are possible.
· In the case of a search request, changes should be detected that affect not only the first result page (e.g., first n results and information about total number of hits). A deletion followed by an insertion, keeping the total number of results identical, may not be detected.
· In case resources-uri corresponds to a search request, the client should have the option to receive the actual changes, instead of a repetition of the full set of results. This is particularly useful for very large collections or searches with many results. A mechanism to receive information about deleted records is required in particular when the notification mechanism would be applied for supporting incremental harvesting. A mechanism similar to the Atom deleted-entry Element ([RFC-6721](https://www.rfc-editor.org/rfc/rfc6721)) might be applied. The Open Archive Initiative Protocol for Metadata Harvesting ([OAI-PMH](https://docs.ogc.org/per/22-018.html#oai-pmh)), which is still widely used, has support for [deleted records](https://www.openarchives.org/OAI/openarchivesprotocol.html#DeletedRecords).
· When a resource-uri corresponds to a search result, e.g., “all records with datetime overlapping with date-1 and date-2”, it might be useful to indicate in a subscription that the dates are to be interpreted as relative with respect to the actual time the subscription check is executed, instead of interpreting dates always as absolute dates.
8.1.4. Asynchronous Patterns for OGC API Common
A significant topic for future activities should be exploring how asynchronous communications can be addressed at the OGC API Common level. In the context of other APIs, asynchronous communications must address a wider range of concerns than for OGC API Records, as shown in the following examples.
· A server might handle requests asynchronously for preparing the results (single response), or in response to certain events (multiple responses).
· A client should specify the preferred behavior for retrieving the delayed response(s) either actively (polling pattern) or passively (push pattern requiring a listener endpoint).
The next subsections propose patterns to address general OGC API concerns related to asynchronous communications.
· The asynchronous pattern for delayed response provides a simple solution to query (poll) a URL until the response is ready.
· The callback pattern for delayed response complements the asynchronous pattern for specifying an endpoint for receiving (push) the response(s).
· The OGC API Common Asynchronous requirement class is a draft that generalizes the API records asynchronous solution relying on jobs resources.
8.1.4.1. Asynchronous Pattern for Delayed Response
Implementations of OGC APIs typically use a communication protocol pattern based on a single socket pair for sending and receiving an HTTP(S) request / response. The handling of slowly resolved queries that are not ready within the regular HTTP time out (i.e., roughly 30 seconds) can be addressed using a simple communication pattern (not implying the subscription resources solution detailed earlier).
The proposed standard [RFC 7240](https://docs.ogc.org/per/22-018.html#rfc7240) “Prefer Header for HTTP” recommends using the `Prefer` header to indicate the server behavior preferred by the client. As already adopted by OGC API Processes, the `respond-async` preference indicates that the client prefers the server to respond asynchronously to a request. In addition, the value `respond-async, wait=10` is a hint to the server that the client is willing to wait a maximum of 10 seconds for the response following the traditional synchronous pattern.
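For illustration, a minimal tokenizer for the preference values mentioned above (`respond-async`, `wait=10`); this is a sketch, not a complete RFC 7240 implementation (it ignores parameters attached to individual preferences):

```python
# Minimal Prefer-header tokenizer for values like "respond-async, wait=10".
# A sketch only; full RFC 7240 parsing also handles per-preference parameters.

def parse_prefer(header):
    prefs = {}
    for token in header.split(","):
        name, _, value = token.strip().partition("=")
        if name:
            # Valueless tokens (e.g. respond-async) are stored as True.
            prefs[name.lower()] = value.strip('"') or True
    return prefs

# parse_prefer("respond-async, wait=10")
# -> {"respond-async": True, "wait": "10"}
```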
[RFC 7240](https://docs.ogc.org/per/22-018.html#rfc7240) mentions that a server honoring the “respond-async” preference should return a 202 (Accepted) response as per HTTP 1.1 (RFC 7231). However, the actual behavior is unspecified, and little guidance is provided by code 202 (“The representation sent with this response ought to describe the request’s current status and point to (or embed) a status monitor that can provide the user with an estimate of when the request will be fulfilled”). For the above reason, some Testbed participants also proposed considering the alternative code 303 (See Other), which is a way to redirect web applications to a new URI, particularly after an HTTP POST has been performed.
Independent of the response code, the server should return a Location header (and an optional Retry-After header) holding a link to the requested target resources. This link can be used to monitor the status of the previous request until the resources are available. The structure of the monitor response is custom, as it needs to fit the purpose. But, to ensure that the caller is not too proactive, the server may throttle the caller via 429 (Too Many Requests) with a Retry-After header.
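The client side of this polling pattern could look like the following sketch, where `fetch` stands in for an actual HTTP call returning a status, headers, and body (an assumption, not a specific library API):

```python
import time

# Sketch of a client polling the monitor URL from the Location header
# until the resource is ready, honoring Retry-After on 202 and 429.

def poll_until_ready(fetch, monitor_url, max_attempts=10):
    for _ in range(max_attempts):
        status, headers, body = fetch(monitor_url)
        if status == 200:
            return body                        # resource is available
        if status in (202, 429):               # still processing / throttled
            time.sleep(float(headers.get("Retry-After", 1)))
            continue
        raise RuntimeError(f"unexpected status {status}")
    raise TimeoutError("gave up polling for the delayed response")
```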
Figure 36 — Asynchronous Pattern for Delayed Response
8.1.4.2. Callback Pattern for Delayed Response
Complementary to the asynchronous pattern described above, the request might be extended to submit an endpoint URI for receiving callback messages that contain the delayed response of the server. Indeed, as adopted by OGC API Processes, OpenAPI 3.0 provides a callback (push-based) mechanism where a subscriber-URL is passed to the API in the request. Once the resources are available, the result response is sent to the specified URL.
OpenAPI supports specifying the placeholders of the callback URIs (potentially for a set of defined events) submitted in the request, and defining the schemas of the callback messages, which must be specified in the callbacks property of the related definition of the OpenAPI operation. An example is shown below.
callbacks:
completed:
'{$request.header.Prefer}/callbackURI':
post: # Method
requestBody: # Contents of the callback message
…
responses: # Expected responses
…
Figure 37 — Generic OpenAPI Definition of the callback
For expressing the callback endpoint (and options), Testbed participants highlighted one simple approach taking advantage of the Prefer header. The header might be extended with a callback token holding the (single) endpoint for callback messages. Also, in case of multiple results (based on particular events), the frequency preference can be provided in a schedule token holding a unix-cron value.
8.1.4.3. OGC API Common Asynchronous Requirement Class
The proposed approach for a generic asynchronous requirement class relies upon the typical use of HTTP code 202<https://restfulapi.net/http-status-202-accepted/>. A job resource is created to monitor the execution of the request. Note that the approach is very similar and reuses most concepts from the OGC API Processes.
Recommendation 21
Label
/rec/core/process-execute-honor-prefer
A
If a request is accompanied with the HTTP [Prefer](https://datatracker.ietf.org/doc/html/rfc7240#section-2) header asserting a [respond-async](https://tools.ietf.org/html/rfc7240#section-4.1) preference, then the server should honor that preference and respond asynchronously.
B
If a request is accompanied with the HTTP [Prefer](https://datatracker.ietf.org/doc/html/rfc7240#section-2) header asserting a [wait](https://tools.ietf.org/html/rfc7240#section-4.3) preference, then the server should honor that preference in the decision to execute the process asynchronously.
C
If a request is accompanied with the HTTP Prefer header, then in the response, servers should include the HTTP Preference-Applied response header as an indication as to which `Prefer` tokens were honored by the server.
Recommendation 22
Label
/req/async/response
A
If a request is executed asynchronously, the server should respond with an HTTP status code of 202. The server should return a Location header (and an optional Retry-After header) holding a link to the job monitoring the processing of the request.
Requirement 21
Label
/req/async/job
A
The server shall support the HTTP GET operation for retrieving a long-running asynchronous job at the path /jobs/{jobID}.
B
A successful execution of the operation shall be reported as a response with a HTTP status code 200. The content of that response shall be based upon the OpenAPI 3.0 schema jobStatus.yaml.
The jobStatus schema is illustrated on the class diagram below.
Figure 38 — JobStatus Schema
Recommendation 23
Label
/req/async/prefer-callback
A
If a request is accompanied with the HTTP [Prefer](https://datatracker.ietf.org/doc/html/rfc7240#section-2) header asserting a callback preference (endpoint URI), then the potential asynchronous response(s) should be pushed as a callback message delivered to the provided callback endpoint URI.
B
If a request is accompanied with the HTTP [Prefer](https://datatracker.ietf.org/doc/html/rfc7240#section-2) header asserting a callback preference and a schedule UNIX-cron value, then the potential asynchronous response(s) should be pushed with respect to the submitted schedule.
The resulting sequence diagram is provided below.
Figure 39 — Generic Asynchronous Job Sequence Diagram
The status values defined in OGC API Processes are clarified below in the context of sequential results updates (on purpose) managed by the asynchronous job.
Requirement 22
Label
/req/async/job-status
A
The status of a job shall be accepted if the asynchronous job request is valid and has been queued for execution.
B
The status of a job shall be running if the provided start time has been reached.
C
The status of a job shall be failed if the asynchronous job request is not valid or if the processing of the request raised an error that prevented its completion.
D
The status of a job shall be dismissed if the asynchronous job has been dismissed through a HTTP DELETE request.
E
The status of a job shall be successful if the asynchronous job has completed or the end time has been reached.
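The status lifecycle in Requirement 22 can be illustrated as a tiny transition table; the table below is inferred from the requirement text and is only a sketch:

```python
# Job status transitions inferred from Requirement 22 (a sketch):
# accepted -> running -> successful | failed, dismissal via HTTP DELETE,
# and failure possible already at validation time.

ALLOWED = {
    "accepted": {"running", "failed", "dismissed"},
    "running":  {"successful", "failed", "dismissed"},
}

def transition(current, new):
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal job transition: {current} -> {new}")
    return new
```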
@pvretano The issue is rather about the default handling when no `Prefer` header accompanies the request. In other words, if a process supports both sync/async, the following can happen: […]
The biggest issue is that […]. One way to address this, as mentioned in #413 (comment), is to use […].
Requirements or mentions of […]. My understanding of the requirements relating to the `Prefer` header is […]. Although this may seem redundant when executing a process supporting only async, a client expecting the process to always be executed async, even in the event that the server introduces new sync support for that same process, should always include the `Prefer: respond-async` header.
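The defensive client behavior suggested here (always send the preference, then verify what the server actually did) might be sketched as follows. The helper names are assumptions; the 201 (Created) response with a job for async execution and the Preference-Applied header come from the standard and RFC 7240:

```python
# Defensive client sketch: always request async explicitly, then detect
# which mode the server actually used. Helper names are illustrative.

def build_headers(want_async):
    return {"Prefer": "respond-async"} if want_async else {}

def executed_async(status, response_headers):
    # Processes 1.0 returns 201 (Created) with a job for async execution;
    # Preference-Applied, when present, is an additional confirmation.
    applied = response_headers.get("Preference-Applied", "")
    return status == 201 or "respond-async" in applied
```

Even with this, the client must still be prepared for a synchronous 200 response, since the preference remains only a recommendation.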
@jerstlouis The async execution might be preferred to avoid cases where, depending on input dimension, we are always treading that fine line between a closed connection or not. Using async, we would not have to worry about encountering that case no matter which input dimension is submitted. However, the process itself could very well work fine in sync for a given smaller input submitted in other situations. I believe always using […]
That is a separate requirement class that the server may or may not support though, right? This is the Callback requirement class? This is also a security vulnerability from the perspective of an open service accepting requests from anyone without authorization, since it can trigger the server making a request to any URL. And so is async in general compared to sync, where you may limit the number of connections from a particular client and not execute anything more from that client if it already has e.g. 5 processes waiting on sync exec requests. Whereas with async, a client may just queue thousands of requests one after the other. @pvretano We should probably add a mention in the Security Considerations about the Callback requirement class security vulnerabilities.
Increasing connection timeouts may be one way to address that :)
Even if it is defined in a separate requirement class, it is a valid use case. Since Core provides this as a valid mechanism, the standard must provide all necessary means to handle it without side effects from defaults. Maybe open a separate issue about the security concern to avoid diverging in this thread.
If anything, that is a bigger security concern. Great way to cause a server to DDoS. |
Will try implementing the suggested […]. I'm proposing "408 Request Timeout" for cases where […]. For the opposite case, where […]
Using the […], I would like to know if we all agree to force asynchronous execution in OGC API - Processes - Part 4: Job Management when starting a job using POST on the […] endpoint. If we agree on the previous point, I would like to propose adding the […].
IMO, it should behave just like […]. If the process indicates that it can only run synchronously, making async the default would cause it to always fail, and require explicitly adding the […]. That being said, I strongly believe that the core issue remains, as it has been mentioned many times, that Rec-25C somewhat contradicts what Rec-26C indicates in Execution mode. The server should have the option to decide its own default (aka the "auto" from the previous revision) when nothing was requested explicitly. That would allow openEO to use its own default (see below), and not force servers to run sync by default.
I believe openEO creates the job, but does not put it in queue until |
Requirements 26 item C says: "The server SHALL respond synchronously if, according to the job control options in the process description, the process can be executed in either mode."
I think this is the UNSAFE play. We should change this to ASYNCHRONOUS by default.