CovJSON, CF-JSON and NCO-JSON #86
Comments
My understanding is that NCO JSON can represent the full netCDF-4 data model.
NCO produces JSON for classic (netCDF3) and extended (netCDF4) data. Sample input and output files are here: http://dust.ess.uci.edu/tmp/in* The in_grp* files are hierarchical (netCDF4) and the in.* files are flat/classic (netCDF3). NCO defines three options that trade increasing JSON complexity for binary reproducibility. It's all documented in the NCO manual. We welcome any discussion/feedback to improve/extend the format!
NCO's JSON options and CovJSON have very different goals and approaches. CF-JSON and NCO's JSON are very similar in goals and approaches.
I do think CovJSON DOES overlap with CF, and where it does, it would be nice to have it be compatible with the netCDF-CF JSON representations. In fact, there might be room to go the other way: under CF, there is a discussion of how to model geometries in netCDF-CF, and maybe it could be informed by CovJSON.
Thanks everyone for your comments. @ChrisBarker-NOAA - if you have any suggestions for any constructs that CovJSON could adopt/reuse from CF, please feel free to raise an issue on this site. Just to expand on @BobSimons' point:
The "core" CovJSON spec does not include specific feature types. It defines a general structure that can be used to encode a very wide range of features. The encoding of all features is structurally the same; the only thing that changes is the form of the domain. (The parameters and range objects don't need to know anything about feature types.) However, we recognise that it's useful for clients to be able to quickly detect what kind of feature is being encoded (timeseries, grid, vertical profile etc). Hence a Coverage can contain a [...] Users can create their own domain types if they want; this is part of the extensibility mechanism. Or they don't have to use domain types at all, but that makes general-purpose clients harder to write.
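To make the point about structure concrete, here is a minimal Python sketch that builds two coverages as plain dictionaries. The member names (`type`, `domain`, `domainType`, `axes`, `parameters`, `ranges`, `NdArray`) follow my reading of the CovJSON spec; the helper `make_coverage`, the parameter name `TEMP` and all sample values are hypothetical, and the `referencing` member that a real document would normally carry is omitted for brevity. Only the domain differs between the grid and the point time series.

```python
import json
from math import prod


def make_coverage(domain_type, axes, axis_order, values):
    """Build a CovJSON-style Coverage dict (hypothetical helper, not a library API)."""
    shape = [len(axes[name]["values"]) for name in axis_order]
    if prod(shape) != len(values):
        raise ValueError("number of values does not match the domain shape")
    return {
        "type": "Coverage",
        "domain": {
            "type": "Domain",
            "domainType": domain_type,  # lets clients detect the feature kind quickly
            "axes": axes,
        },
        # Parameter metadata knows nothing about the feature type.
        "parameters": {
            "TEMP": {
                "type": "Parameter",
                "observedProperty": {"label": {"en": "Sea water temperature"}},
                "unit": {"symbol": "K"},
            }
        },
        # Range values are stored flat, alongside the axis order and shape.
        "ranges": {
            "TEMP": {
                "type": "NdArray",
                "dataType": "float",
                "axisNames": axis_order,
                "shape": shape,
                "values": values,
            }
        },
    }


grid = make_coverage(
    "Grid",
    {
        "x": {"values": [0.0, 1.0, 2.0]},
        "y": {"values": [50.0, 51.0]},
        "t": {"values": ["2010-01-01T00:00:00Z"]},
    },
    ["t", "y", "x"],
    [280.0, 280.5, 281.0, 279.0, 279.5, 280.0],
)

series = make_coverage(
    "PointSeries",
    {
        "x": {"values": [1.0]},
        "y": {"values": [50.0]},
        "t": {"values": ["2010-01-01T00:00:00Z", "2010-01-02T00:00:00Z", "2010-01-03T00:00:00Z"]},
    },
    ["t"],
    [280.5, 280.7, 280.9],
)

print(json.dumps(series, indent=2))
```

The `parameters` member is identical in both results, and the `ranges` member has the same layout; a client only needs to inspect `domain` to learn what kind of feature it has been given.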
Honestly, I'm not much of a GIS guy, so I'm still a bit confused about what exactly a "coverage" is. (I think I know what features are...) But there has always been a bit of a disconnect between the data models for scientific data (and model results) that I deal with and the "standard" GIS data model. And it seems that CovJSON is trying to close this gap a bit.
So: CF was designed for scientific data and netCDF. But while CF was designed for netCDF, it does, in fact, impose a data model that can be used and adapted to other file formats or programming environments. And netCDF also defines both a specific file format and a data model. So: the CF data model can be mapped to other file formats. These can be done more or less orthogonally, so I suggest that CF-JSON and NCO-JSON be focused on mapping netCDF to JSON in a standard way. Once that is done, CF itself becomes a metadata standard that can simply be applied to JSON the same way as to netCDF.
CovJSON can then take a similar tack: for those "things" that overlap between CF and what you want CovJSON to cover, the CF standards are used, and expressed in JSON in a way compatible with CF-JSON/NCO-JSON (maybe call it nc-JSON?). Maybe we need a coverage CF spec? Note that additional specs can be "added on" to CF if they are done in a compatible way.
Also note that there is an effort afoot to add GIS-like geometry specs to CF: https://github.com/twhiteaker/netCDF-CF-simple-geometry and cf-convention/cf-conventions#115 (I've lost track of the "official" status or even where the latest discussion lives...) But I expect there is room for collaboration/alignment there, too. -CHB
There has been a discussion on the Climate and Forecast mailing list about different JSON formats for recording NetCDF data. Until this discussion I wasn't aware that there are a couple of other initiatives going on: CF-JSON and NCO-JSON.
The discussion revealed that these two initiatives are quite similar in aim to each other. They both aim to translate the NetCDF(-3) [edit/correction - NCO also supports NetCDF4] data model into JSON and apply the CF metadata conventions directly.
CovJSON does not have quite the same aim: it operates at a higher level of abstraction and does not mimic any particular existing format. Here are a few comparison points between CovJSON, CF-JSON and NCO, intended to stimulate discussion. I'm going to make a simplifying assumption that CF-JSON and NCO are very similar in respect of the points made here:
CF-JSON and NCO are likely to be more familiar to users who are already comfortable with NetCDF and the CF conventions.
Conversely, users who are unfamiliar with CF/NetCDF may find CovJSON easier to understand (at least, that's our intention...). CovJSON does not assume that the data are "born in NetCDF format".
CovJSON borrows concepts from ISO and OGC standards, and may be conceptually more familiar to folk from those communities. It's intended to provide a "bridge" between what we might loosely call the "NetCDF community" and the "GIS community".
CovJSON cannot (yet) encode all possibilities afforded by CF/NetCDF. For example, cell methods and climatological time are not yet supported in CovJSON. So if entirely "lossless" encoding of CF-NetCDF in JSON is required, CF-JSON and NCO may be more appropriate choices.
The NetCDF(-3) data model struggles to accommodate certain types of data structures. It is quite a "flat" structure, and the mechanisms required to link relevant data together in a NetCDF file can be quite hard to understand. (Coordinate reference systems, and their links to dimensions and variables are one example. Encoding geometries is another.) I assume that JSON formats based directly on NetCDF will suffer from similar issues, forcing clients to implement some of the more complex parts of the CF conventions in order to piece the information back together. By contrast, CovJSON aims to repartition the same information in a way that is (hopefully) easier for clients to deal with, using the possibilities afforded by JSON.
By virtue of the above, I would argue that non-gridded data (e.g. observations from points or moving platforms), which often require the recording of geometries, trajectories and other "composite" coordinate types, are easier to encode and understand in CovJSON than in NetCDF. (Concretely, CovJSON provides the facility for "tuple" and "polygon" axis types: https://covjson.org/spec/#axis-objects. These require some gymnastics to encode in NetCDF.)
CovJSON provides mechanisms to partition large datasets among different files (e.g. holding range objects in separate files, the tiling scheme). This is done for "web-friendliness", i.e. avoiding large monolithic files. I'm not aware that CF-JSON and NCO have this facility, although I may be wrong.
On a more minor point of implementation, CovJSON encodes data values as flat, 1-D arrays (the reason why is explained here). CF-JSON and NCO use nested arrays.
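The difference between the two array layouts in the last point can be sketched in a few lines of Python. This is an illustration only: the function names `flatten`, `unflatten` and `flat_index` are hypothetical, the nested input is assumed to be a non-empty rectangular list of lists, and the flat form is assumed to be in row-major order (last axis varying fastest), which is my understanding of how CovJSON orders range values.

```python
from math import prod


def flatten(nested):
    """Return (shape, flat_values) for a rectangular nested list, in row-major order."""
    shape = []
    level = nested
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0]
    flat = nested
    for _ in shape[1:]:
        # Remove one level of nesting per remaining axis.
        flat = [item for sub in flat for item in sub]
    return shape, flat


def unflatten(shape, flat):
    """Rebuild the nested list from a shape and a flat row-major list."""
    if prod(shape) != len(flat):
        raise ValueError("shape does not match the number of values")
    for size in reversed(shape[1:]):
        # Group values into chunks, starting with the fastest-varying axis.
        flat = [flat[i:i + size] for i in range(0, len(flat), size)]
    return flat


def flat_index(shape, indices):
    """Map a multi-dimensional index to its offset in the flat array."""
    offset = 0
    for size, i in zip(shape, indices):
        if not 0 <= i < size:
            raise IndexError("index out of range")
        offset = offset * size + i
    return offset


# Nested layout with axes (t, y, x) and shape [1, 2, 3].
nested = [[[280.0, 280.5, 281.0], [279.0, 279.5, 280.0]]]
shape, flat = flatten(nested)

# The same value reached through either layout.
t, y, x = 0, 1, 2
assert flat[flat_index(shape, (t, y, x))] == nested[t][y][x]
assert unflatten(shape, flat) == nested
```

The two layouts carry the same information as long as the flat form travels with its shape and axis order, so converting between them is mechanical; the choice mainly affects how clients index and stream the values.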
Discussion of the above points (and addition of new ones!) is most welcome. My intention is not to evangelise for CovJSON, but to highlight points of similarity and departure (philosophically and structurally) between CovJSON, CF-JSON and NCO. If we can understand these points we'll be in a better place to discuss whether we should look at merging these initiatives.