Catalog design and architecture

David Huard edited this page Jan 29, 2021 · 2 revisions

Data catalog

Proposal

  1. Define catalogs for homogeneous collections where a clear controlled vocabulary (CV) or Data Reference Syntax (DRS) can be defined
  2. Make sure the fusion of heterogeneous collections is possible, but leave the details to end-users
  3. Define a short list of common attributes that will be shared among all collections
  4. Respect the semantics of attributes, i.e. do not hijack attributes for the purpose of finding a dataset more easily
  5. Use a provenance language (e.g. MetaClip) to describe relationships between datasets

Implementation

At the moment, the data that we want to expose is first aggregated using an NcML document, which presents users with a single view of multiple individual files served by a THREDDS Data Server (TDS). This allows us to modify attributes without changing the original files on disk. TDS has an NcML service that returns an XML document describing netCDF or NcML content. We can parse this XML to populate a database or build a catalog.
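As a rough illustration of the parsing step, the sketch below extracts global and per-variable attributes from an NcML document using only the Python standard library. The sample document, the attribute names in it, and the `parse_ncml` helper are assumptions made for this example. A real response from the TDS NcML service would be fetched over HTTP and may contain additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace used by NcML 2.2 documents.
NCML_NS = {"ncml": "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"}

# Hypothetical NcML content, standing in for a response from the TDS NcML service.
SAMPLE_NCML = """<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" location="example.nc">
  <attribute name="institution" value="Example Institute" />
  <attribute name="frequency" value="day" />
  <dimension name="time" length="365" />
  <variable name="tas" shape="time" type="float">
    <attribute name="units" value="K" />
    <attribute name="standard_name" value="air_temperature" />
  </variable>
</netcdf>
"""


def _attributes(element):
    """Return the direct <attribute> children of an element as a dict."""
    return {
        attr.get("name"): attr.get("value")
        for attr in element.findall("ncml:attribute", NCML_NS)
    }


def parse_ncml(xml_text):
    """Turn an NcML document into a flat record suitable for a DB or catalog entry."""
    root = ET.fromstring(xml_text)
    return {
        "location": root.get("location"),
        "attributes": _attributes(root),  # global attributes only
        "variables": {
            var.get("name"): _attributes(var)
            for var in root.findall("ncml:variable", NCML_NS)
        },
    }


record = parse_ncml(SAMPLE_NCML)
print(record["attributes"])
print(record["variables"])
```

Keeping global attributes separate from variable attributes leaves the original semantics intact, so a catalog built from such records can map only the agreed common attributes into shared fields.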
