Merge branch 'master' of github.com:crim-ca/weaver into parsing-metadata
Nazim-crim committed Sep 10, 2024
2 parents ea2d487 + e78fa89 commit cef42bd
Showing 59 changed files with 3,896 additions and 406 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/tests.yml
@@ -19,14 +19,14 @@ jobs:
continue-on-error: true
runs-on: ubuntu-latest
outputs:
should_skip: ${{ steps.skip_duplicate.outputs.should_skip && ! contains(github.ref, 'refs/tags') }}
should_skip: ${{ steps.skip_duplicate.outputs.should_skip && ! contains(github.ref, 'refs/tags') && ! contains(github.ref, 'origin/master') }}
steps:
- id: skip_check
uses: fkirc/skip-duplicate-actions@master
with:
concurrent_skipping: "same_content"
concurrent_skipping: "same_content_newer"
skip_after_successful_duplicate: "true"
do_not_skip: '["pull_request", "workflow_dispatch", "schedule", "release"]'
do_not_skip: '["workflow_dispatch", "schedule", "release"]'

# see: https://github.com/actions/setup-python
tests:
65 changes: 60 additions & 5 deletions CHANGES.rst
@@ -12,11 +12,66 @@ Changes

Changes:
--------
- No change.

Fixes:
------
- No change.
- Add `CWL` schema definitions with ``weaver`` namespace
(see `weaver/schemas/cwl <https://github.com/crim-ca/weaver/tree/master/weaver/schemas/cwl>`_)
that provide explicit requirement classes
for ``weaver:BuiltinRequirement``, ``weaver:WPS1Requirement``, ``weaver:OGCAPIRequirement``
and ``weaver:ESGF-CWTRequirement`` to avoid missing reference warnings that were previously raised by ``cwltool``
due to `Application Packages` using their non-``weaver`` namespaced classes in ``hints``. These new `CWL`
definitions can be reported directly in the ``requirements`` section, better describing the required dependencies
of the referenced `Process` and/or `Provider` in the workflow steps.
- Add hosted `CWL` schema definitions for ``weaver`` accessible at the ``https://schemas.crim.ca/cwl/weaver#`` endpoint.
- Add support of ``weaver`` namespaced ``requirements`` to the ``cwltool`` runner.
- Add better validation of well-known `CWL` ``$namespaces`` as reserved keywords when deploying a `Process` to ensure
better interoperability between implementations and adequate metadata resolution
(relates to `#463 <https://github.com/crim-ca/weaver/issues/463>`_).
- Add documentation about *Jupyter Notebook* to `CWL` conversion
utility `ipython2cwl <https://github.com/common-workflow-lab/ipython2cwl>`_
and a sample `crim-ca/ncml2stac <https://github.com/crim-ca/ncml2stac/tree/main#ncml-to-stac>`_ repository
making use of it with the `Weaver` `CLI` to generate a deployed `OGC API - Processes` definition
(fixes `#63 <https://github.com/crim-ca/weaver/issues/63>`_).

Fixes:
------
- Fix ``VariableSchemaNode`` resolution of child nodes with complex mixture of ``StrictMappingSchema`` or when
using the equivalent ``unknown = "raise"`` parameter for a ``colander.Mapping`` schema type to
disallow ``additionalProperties`` that cannot be mapped to a particular child `JSON` schema definition.
- Fix ``VariableSchemaNode`` resolution to allow mapping against multiple ``variable`` sub-nodes representing
different nested `JSON` schema nodes permitted under the ``additionalProperties`` mapping.

.. _changes_5.8.0:

`5.8.0 <https://github.com/crim-ca/weaver/tree/5.8.0>`_ (2024-09-05)
========================================================================

Changes:
--------
- Add support of *OGC API - Processes: Part 3* ``collection`` as input to a `Process`
(fixes `#682 <https://github.com/crim-ca/weaver/issues/682>`_).
- Add ``AnyCRS`` schema definition with improved validation of allowed values.
- Use ``AnyCRS`` schema for ``SupportedCRS``, ``XMLStringCRS``, ``BoundingBoxValue`` and ``ExecuteCollectionInput``
instead of a generic ``URL`` schema definition for better reference validation, while allowing alternate short forms.
- Add auto-resolution of media-type for cases where it can reasonably be inferred from a ``schema`` reference,
such as a URI referring to a ``.json`` or ``.xsd`` respectively representing `JSON` and `XML` data.
- Update ``cwltool`` with fork
`fmigneault/cwltool @ fix-load-contents-array <https://github.com/fmigneault/cwltool/tree/fix-load-contents-array>`_
until ``loadContents`` behavior is resolved for ``type: File[]``
(relates to `common-workflow-language/cwltool#2036 <https://github.com/common-workflow-language/cwltool/pull/2036>`_).
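
The media-type auto-resolution entry above can be pictured with a minimal sketch. This is not Weaver's implementation: the function name, the lookup table, and the choice of ``application/xml`` for ``.xsd`` are assumptions made for illustration only.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table, limited to the two extensions named in the changelog entry.
# Mapping '.xsd' to 'application/xml' is an assumption ('text/xml' would be equally plausible).
EXTENSION_MEDIA_TYPES = {
    ".json": "application/json",
    ".xsd": "application/xml",
}


def guess_media_type_from_schema(schema_ref: str) -> Optional[str]:
    """Return a media-type when the schema URI extension makes it obvious, else None."""
    path = urlparse(schema_ref).path  # ignore any query string or fragment
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSION_MEDIA_TYPES.get(extension)


print(guess_media_type_from_schema("https://example.com/schemas/item.json"))
print(guess_media_type_from_schema("https://example.com/schemas/item"))
```

Returning ``None`` for anything that cannot be inferred leaves the decision about a fallback to the caller.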

Fixes:
------
- Fix `CWL` I/O with ``format`` defined as a `JavaScript Expression` being incorrectly parsed by the conversion
operations that extract applicable media-types. These cases will be ignored, since media-types cannot be inferred
from them. The `WPS` or `OAS` I/O definitions should instead provide the applicable media-types
(relates to `common-workflow-language/cwl-v1.3#52 <https://github.com/common-workflow-language/cwl-v1.3/issues/52>`_).
- Fix ``format`` parsing when trying to infer media-types from various I/O definition representations using a
reference provided as a URI schema from an ontology. Parsing caused the URI to be split, resulting in an invalid
resolution. If no appropriate media-type is provided, JSON will be used by default, while preserving the submitted
schema URI.
- Fix invalid resolution of ``weaver.formats.ContentEncoding.open_parameters``.
- Fix minor resolution combinations or redundant checks for multiple ``weaver.formats`` utilities.
- Fix `CWL` ``format`` resolution check against `IANA` media-types if the reference ontology happens to be
temporarily/sporadically unresponsive to SSL handshake check, allowing temporary HTTP resolution of media-type.
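
To make the ``format`` fixes above more concrete, here is a hedged sketch of resolving a `CWL` ``format`` reference without splitting a full URI. It relies only on the usual `CWL` convention of declaring prefixes under ``$namespaces``. The helper names are hypothetical and do not correspond to functions in ``weaver.formats``.

```python
from typing import Dict, Optional

# Namespace commonly declared as 'iana' under '$namespaces' in CWL documents.
IANA_NAMESPACE = "https://www.iana.org/assignments/media-types/"


def expand_format(fmt: str, namespaces: Dict[str, str]) -> str:
    """Expand a prefixed format such as 'iana:application/json' to its full URI."""
    if "://" in fmt:
        return fmt  # already a full URI: keep it whole instead of splitting on ':'
    prefix, sep, name = fmt.partition(":")
    if sep and prefix in namespaces:
        return namespaces[prefix] + name
    return fmt


def media_type_from_format(fmt: str, namespaces: Dict[str, str]) -> Optional[str]:
    """Return the media-type for IANA references, or None when it cannot be inferred."""
    uri = expand_format(fmt, namespaces)
    if uri.startswith(IANA_NAMESPACE):
        return uri[len(IANA_NAMESPACE):]
    return None


namespaces = {"iana": IANA_NAMESPACE, "edam": "http://edamontology.org/"}
print(media_type_from_format("iana:application/json", namespaces))
print(media_type_from_format("edam:format_3464", namespaces))
```

In this sketch, a ``None`` result is where a caller could apply a default media-type while keeping the submitted URI untouched.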

.. _changes_5.7.0:

2 changes: 1 addition & 1 deletion Makefile
@@ -9,7 +9,7 @@ MAKEFILE_NAME := $(word $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST))
# Application
APP_ROOT := $(abspath $(lastword $(MAKEFILE_NAME))/..)
APP_NAME := $(shell basename $(APP_ROOT))
APP_VERSION ?= 5.7.0
APP_VERSION ?= 5.8.0
APP_INI ?= $(APP_ROOT)/config/$(APP_NAME).ini
DOCKER_REPO ?= pavics/weaver
#DOCKER_REPO ?= docker-registry.crim.ca/ogc/weaver
20 changes: 10 additions & 10 deletions README.rst
@@ -55,13 +55,13 @@ for each process.
:alt: Requires Python 3.8+
:target: https://www.python.org/getit

.. |commits-since| image:: https://img.shields.io/github/commits-since/crim-ca/weaver/5.7.0.svg
.. |commits-since| image:: https://img.shields.io/github/commits-since/crim-ca/weaver/5.8.0.svg
:alt: Commits since latest release
:target: https://github.com/crim-ca/weaver/compare/5.7.0...master
:target: https://github.com/crim-ca/weaver/compare/5.8.0...master

.. |version| image:: https://img.shields.io/badge/latest%20version-5.7.0-blue
.. |version| image:: https://img.shields.io/badge/latest%20version-5.8.0-blue
:alt: Latest Tagged Version
:target: https://github.com/crim-ca/weaver/tree/5.7.0
:target: https://github.com/crim-ca/weaver/tree/5.8.0

.. |deps| image:: https://img.shields.io/librariesio/github/crim-ca/weaver
:alt: Libraries.io Dependencies Status
@@ -75,9 +75,9 @@ for each process.
:alt: Github Actions CI Build Status (master branch)
:target: https://github.com/crim-ca/weaver/actions?query=workflow%3ATests+branch%3Amaster

.. |github_tagged| image:: https://img.shields.io/github/actions/workflow/status/crim-ca/weaver/tests.yml?label=5.7.0&branch=5.7.0
.. |github_tagged| image:: https://img.shields.io/github/actions/workflow/status/crim-ca/weaver/tests.yml?label=5.8.0&branch=5.8.0
:alt: Github Actions CI Build Status (latest tag)
:target: https://github.com/crim-ca/weaver/actions?query=workflow%3ATests+branch%3A5.7.0
:target: https://github.com/crim-ca/weaver/actions?query=workflow%3ATests+branch%3A5.8.0

.. |readthedocs| image:: https://img.shields.io/readthedocs/pavics-weaver
:alt: ReadTheDocs Build Status (master branch)
@@ -89,7 +89,7 @@ for each process.

.. below shield will either indicate the targeted version or 'tag not found'
.. since docker tags are pushed following manual builds by CI, they are not automatic and no build artifact exists
.. |docker_build_status| image:: https://img.shields.io/docker/v/pavics/weaver/5.7.0?label=tag%20status
.. |docker_build_status| image:: https://img.shields.io/docker/v/pavics/weaver/5.8.0?label=tag%20status
:alt: Docker Build Status (latest version)
:target: https://hub.docker.com/r/pavics/weaver/tags

@@ -237,12 +237,12 @@ For a prebuilt image, pull as follows:

.. code-block:: shell
docker pull pavics/weaver:5.7.0
docker pull pavics/weaver:5.8.0
For convenience, the following tags are also available:

- ``weaver:5.7.0-manager``: `Weaver` image that will run the API for WPS process and job management.
- ``weaver:5.7.0-worker``: `Weaver` image that will run the process job runner application.
- ``weaver:5.8.0-manager``: `Weaver` image that will run the API for WPS process and job management.
- ``weaver:5.8.0-worker``: `Weaver` image that will run the process job runner application.

The following links correspond to existing servers with `Weaver` configured as *EMS* or *ADES* instances respectively.

2 changes: 1 addition & 1 deletion docker/Dockerfile-base
@@ -3,7 +3,7 @@ LABEL description.short="Weaver Base"
LABEL description.long="Workflow Execution Management Service (EMS); Application, Deployment and Execution Service (ADES)"
LABEL maintainer="Francis Charette-Migneault <francis.charette-migneault@crim.ca>"
LABEL vendor="CRIM"
LABEL version="5.7.0"
LABEL version="5.8.0"

# setup paths
ENV APP_DIR=/opt/local/src/weaver
7 changes: 7 additions & 0 deletions docs/examples/collection-input-basic.json
@@ -0,0 +1,7 @@
{
"inputs": {
"image-input": {
"collection": "https://example.com/collections/sentinel-2"
}
}
}
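
As a rough illustration, the same request body can be assembled programmatically before being submitted to a `Process` execution endpoint. The ``collection_input`` helper is hypothetical and only mirrors the structure of the example file above. No particular client or endpoint is assumed.

```python
import json
from typing import Any, Dict, Optional


def collection_input(collection_url: str, options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build one input entry referencing a collection, with optional extra fields."""
    entry: Dict[str, Any] = {"collection": collection_url}
    entry.update(options or {})  # e.g. {"filter-lang": "cql2-text", ...}
    return entry


body = {
    "inputs": {
        "image-input": collection_input("https://example.com/collections/sentinel-2"),
    }
}
print(json.dumps(body, indent=2))
```

Passing the extra fields as a dictionary rather than keyword arguments keeps hyphenated names such as ``filter-lang`` usable.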
21 changes: 21 additions & 0 deletions docs/examples/collection-input-filter-cql2-json-ogc-features.json
@@ -0,0 +1,21 @@
{
"inputs": {
"features": {
"collection": "https://example.com/collections/dataset-features",
"format": "ogc-feature-collection",
"filter": {
"op": "s_intersects",
"args": [
{"property": "geometry"},
{
"type": "Polygon",
"coordinates": [ [ [30, 10], [40, 40], [20, 40], [10, 20], [30, 10] ] ]
}
]
},
"filter-crs": "https://www.opengis.net/def/crs/OGC/1.3/CRS84",
"filter-lang": "cql2-json",
"sortBy": "-id"
}
}
}
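
The ``s_intersects`` filter above can also be generated from code. The sketch below uses hypothetical helper names and assumes a single exterior ring. It follows the `GeoJSON` rule that a ``Polygon`` nests each linear ring one level inside ``coordinates`` and that a ring ends on its starting position.

```python
from typing import Any, Dict, List, Sequence

Position = Sequence[float]


def polygon(ring: List[Position]) -> Dict[str, Any]:
    """Wrap a single exterior ring as a GeoJSON Polygon, checking that it is closed."""
    if len(ring) < 4 or list(ring[0]) != list(ring[-1]):
        raise ValueError("a linear ring needs at least 4 positions and must end where it starts")
    return {"type": "Polygon", "coordinates": [[list(position) for position in ring]]}


def s_intersects(property_name: str, geometry: Dict[str, Any]) -> Dict[str, Any]:
    """Build a CQL2-JSON spatial intersection between a property and a geometry."""
    return {"op": "s_intersects", "args": [{"property": property_name}, geometry]}


cql2_filter = s_intersects(
    "geometry",
    polygon([[30, 10], [40, 40], [20, 40], [10, 20], [30, 10]]),
)
print(cql2_filter)
```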
11 changes: 11 additions & 0 deletions docs/examples/collection-input-filter-cql2-text-stac.json
@@ -0,0 +1,11 @@
{
"inputs": {
"images": {
"collection": "https://example.com/collections/sentinel-2",
"format": "stac-collection",
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"filter": "properties.eo:cloud_cover < 0.1",
"filter-lang": "cql2-text"
}
}
}
22 changes: 22 additions & 0 deletions docs/examples/jupyter_repo2cwl_python.py
@@ -0,0 +1,22 @@
import csv
import json
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # This block is only evaluated by type checkers (and jupyter-repo2cwl).
    # Therefore, it is not executed when running the notebook.
    # In other words, 'ipython2cwl' does not even need to be installed!
    from ipython2cwl.iotypes import CWLFilePathInput, CWLFilePathOutput

input_file: "CWLFilePathInput" = "data.csv"
with open(input_file, mode="r", encoding="utf-8") as f:
    csv_reader = csv.reader(f)
    data = [line for line in csv_reader if line]

headers = data[0]
values = data[1:]
items = [{k: v} for val in values for k, v in zip(headers, val)]

output_file: "CWLFilePathOutput" = "output.json"
with open(output_file, mode="w", encoding="utf-8") as f:
    json.dump(items, f)
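
For readers who want to see what the conversion in this script produces, the following sketch replays it on in-memory sample data (the ``sample`` content is made up). It also shows a possible alternative, not part of the original file, that emits one mapping per `CSV` row instead of one per cell.

```python
import csv
import io
import json

sample = "name,value\na,1\nb,2\n"  # hypothetical stand-in for 'data.csv'
data = [line for line in csv.reader(io.StringIO(sample)) if line]
headers, values = data[0], data[1:]

# Same comprehension as in the example script: one single-key mapping per cell.
per_cell = [{k: v} for val in values for k, v in zip(headers, val)]

# Alternative (an assumption, not in the original): one mapping per CSV row.
per_row = [dict(zip(headers, val)) for val in values]

print(json.dumps(per_cell))
print(json.dumps(per_row))
```

Which shape is preferable depends on what the downstream `Process` expects from ``output.json``.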
97 changes: 89 additions & 8 deletions docs/source/appendix.rst
@@ -44,18 +44,47 @@ Glossary
- :ref:`quotation`
- :ref:`conf_quotation`

Builtin Process
An immutable :term:`Process` that comes pre-packaged with `Weaver`, without need to be deployed.
This is usually a "utility" or "converter" :term:`Process` that is often reused across :term:`Workflow`
definitions.

.. seealso::
Refer to the :ref:`proc_builtin` section for more details about available processes.

CLI
| Command Line Interface
| Script that offers interactions through shell commands or Python scripts to execute any described operations.
Details of the provided `Weaver` commands are described in :ref:`cli` chapter.
Collection
A geospatial resource that may be available as one or more sub-resource distributions
that conform to one or more |ogc-api-standards|_. Additionally, |stac-collections|_ can
be included in this group.

Please refer to the :term:`OGC` official |ogc-collection|_ for more details
and complementary terminology.

CRS
Coordinate Reference System
Geospatial data encoding of the representation parameters, describing the structure to locate entities,
in terms of axis order (latitude, longitude, altitude, etc.), dimension (2D, 3D, etc.), with respect to a
specific celestial object, a position of origin, scale and orientation (i.e.: |datum-def|_).

.. seealso::
- Reference :term:`W3C`/:term:`OGC` documentation about |crs|_.
- Reference definition of |crs-def|_.

CWL
| |cwl|_
| Representation of the internal :term:`Application Package` of the :term:`Process` to provide execution
methodology of the referenced :term:`Docker` image or other supported definitions.
A |cwl|_ file can be represented both in :term:`JSON` or :term:`YAML` format, but is often represented
in :term:`JSON` in the context of `Weaver` for its easier inclusion within HTTP request contents.
See :ref:`application-package` section for further details.
Common Workflow Language
Representation of the internal :term:`Application Package` of the :term:`Process` to provide execution
methodology of the referenced :term:`Docker` image or other supported definitions.
A |cwl|_ file can be represented both in :term:`JSON` or :term:`YAML` format, but is often represented
in :term:`JSON` in the context of `Weaver` for its easier inclusion within HTTP request contents.

.. seealso::
- Official |cwl|_ documentation.
- :ref:`application-package` section for further details.

Data Source
Known locations of remote servers where an :term:`ADES` or :term:`EMS`
@@ -76,6 +105,7 @@ Glossary
types by providing additional formats that are more specifics to some data domains.

EMS
Execution Management Service
| |ems|
| See :ref:`processes` section for details.
Alternative operation modes are described in :ref:`Configuration Settings`.
@@ -89,11 +119,42 @@ Glossary
:ref:`opensearch_data_source` section.

ESGF
|esgf|_
Earth System Grid Federation
An open source effort providing a robust, distributed data and computation platform,
enabling worldwide access to large-scale scientific data.

.. seealso::
|esgf|_ official website.

ESGF-CWT
|esgf-cwt-git|_

.. seealso::
:ref:`proc_esgf_cwt` for more details about the :term:`Process` type.

Feature
An abstraction of real-world phenomena into a digital entity representation, which includes
information detailing its *extent* (i.e.: how it is placed and located in time and space).

.. seealso::
- :term:`OGC` |feature-ogc-def|_ definition.
- :term:`W3C` |feature-w3c-def|_ definition.
- :term:`W3C` |feature-w3c-desc|_ examples and extended description.

GeoJSON
| Geospatial :term:`JSON`
| A specific :term:`JSON` format representation for encoding a variety of geographic data structures,
such as ``Point``, ``LineString``, ``Polygon``, ``MultiPoint``, ``MultiLineString``, ``MultiPolygon``,
``Feature``, and ``FeatureCollection``.
.. seealso::
Refer to the official |geojson|_ specification for more details.

.. note::
Multiple extended or derived variants exist. Notably, the |ogc-api-features|_ and |stac-spec|_ define
additional ``properties`` or additional ``type`` values for particular use cases, but remain generally
interoperable and compatible.

HREF
| Hyperlink Reference
| Often shortened to simply `reference`. Represents either a locally or remotely accessible item, such as a
@@ -136,6 +197,7 @@ Glossary
such as ``&`` or ``;`` to distinguish between distinct pairs. Specific separators, and any applicable
escaping methods, depend on context, such as in URL query, HTTP header, :term:`CLI` parameter, etc.
Media-Type
Media-Types
MIME-types
| Multipurpose Internet Mail Extensions
@@ -153,7 +215,12 @@ Glossary
|OpenAPI-spec|_

OGC
|ogc|_
Open Geospatial Consortium
International standards organization for geospatial data and processing best practices
that establishes most of the :term:`API` definition implied under `Weaver`.

.. seealso::
|ogc|_

OAP
OGC API - Processes
@@ -207,6 +274,14 @@ Glossary
S3
Simple Storage Service (:term:`AWS` S3), bucket file storage.

STAC
| SpatioTemporal Asset Catalog
| Language used to describe geospatial information, using extended definitions of :term:`GeoJSON`,
and which can usually be searched using a |stac-api-spec|_ compliant with |ogc-api-features|_.
.. seealso::
Please refer to the |stac-spec|_ for more details.

TOI
| Time of Interest
| Corresponds to a date/time interval employed for :term:`OpenSearch` queries in the context
@@ -246,6 +321,12 @@ Glossary
- :ref:`vault_upload`
- :ref:`file_vault_inputs`

W3C
World Wide Web Consortium
Main international standards organization for the World Wide Web.
Since |ogc-api-standards|_ are based on HTTP and web communications, this consortium establishes the
common foundation definitions used by the :term:`API` specifications.

WKT
Well-Known Text geometry representation.
