This repository has been archived by the owner on Dec 1, 2022. It is now read-only.

How to define calendar / time CRS? #3

Closed
letmaik opened this issue Jul 28, 2015 · 36 comments

Comments

@letmaik
Member

letmaik commented Jul 28, 2015

We need a way to define what a date/time instant string, or number (duration since x) is referring to. So this is about calendar systems (e.g. Gregorian), string notations (e.g. ISO8601), time standards (e.g. UTC).

@jonblower
Member

I think the main question is, do we allow both string-valued times (e.g. ISO8601) and numeric-valued times (with a temporal CRS definition), or only one of these options?

Either way, we need a calendar system to interpret the time strings (time values or temporal datum).

Assuming that the calendar system is represented as a URI, where do we record this in the JSON? And if we're using numeric-valued times then we need a place to put the CRS definition ("days since X", which appear as "units" under CF).

A minimum requirement is that time strings should always have time zone indicators. A possible stricter requirement is that they should always be in UTC (Zulu time).
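
A minimal sketch of how a client might check these two requirements, assuming the time values are RFC 3339-style strings. The names `hasTimeZone` and `isUtc` are hypothetical and not part of any spec.

```javascript
// Hypothetical helpers, not part of any spec.
// Matches a trailing "Z" or a numeric offset such as "+01:00" or "-0530".
var TZ_SUFFIX = /(Z|[+-]\d{2}:?\d{2})$/

function hasTimeZone (s) {
  // Only strings with a time part ("T") can carry a time zone indicator.
  return s.indexOf('T') !== -1 && TZ_SUFFIX.test(s)
}

function isUtc (s) {
  // The stricter requirement: the indicator must be "Z" (Zulu time).
  return s.indexOf('T') !== -1 && /Z$/.test(s)
}

console.log(hasTimeZone('2015-07-28T12:00:00Z'))      // true
console.log(hasTimeZone('2015-07-28T12:00:00+01:00')) // true
console.log(hasTimeZone('2015-07-28T12:00:00'))       // false
console.log(isUtc('2015-07-28T12:00:00+01:00'))       // false
```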

@letmaik
Member Author

letmaik commented Jul 28, 2015

For simplicity's sake I'm somewhat against numeric-valued times. In terms of space usage I don't think we save much in the CBOR case compared to the whole document size, and I'm pretty sure there are no JavaScript libraries that handle such temporal CRS definitions, except for the common UNIX time.

Maybe let's take some more complicated examples, excluding the standard time instant case (just a normal date-time).

Example 1: Climatological time TODO check CF conventions

@letmaik
Member Author

letmaik commented Nov 16, 2015

First idea for a time string:

"referencing": [{
    "axes": ["t"],
    "time": {
        "id": "http://www.w3.org/2001/XMLSchema-datatypes#dateTime",
        "timeScale": {
            "id": "http://www.opengis.net/def/trs/BIPM/0/UTC"
        },
        "calendar": {
            "id": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
        }
    }
}]

I don't like it yet.

For time crs:

WKT says "In this International Standard calendar dates and times are restricted to the Gregorian calendar, the 24-hour clock and UTC as defined in ISO 8601:2004. Only the ISO 8601 extended format (separators between date units and between sexagesimal time units) is permitted; truncation and extension is not catered for. Any precision is allowed. Other date formats such as geological eras or calendars other than Gregorian may be stated through a free format quoted text string."

Literal WKT translation for time CRSs:

"referencing": [{
      "axes": ["t"],
      "crs": {
        "type": "TimeCRS",
        "datum": {
          "origin": "1980-01-01T00:00:00.0Z"
        },
        "cs": {
          "type": "TimeCS",
          "axes": [{
            "unit": {
              "symbol": "second"
            }
          }]
        }
      }
}]

It's pretty verbose and I can't imagine people would like it compared to something like CF's "seconds since 1980-01-01T00:00:00.0Z". Maybe a compromise in between is better, since I don't want parsing to be necessary.
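
For comparison, this is roughly the parsing step a client would need for the CF-style string. It is a minimal sketch only: `parseCfTimeUnits` is a hypothetical name, and the regular expression assumes the simple "<unit> since <origin>" shape rather than the full CF grammar.

```javascript
// Hypothetical helper: splits a CF-style "<unit> since <origin>" string
// into the two fields a structured definition would carry explicitly.
function parseCfTimeUnits (units) {
  var m = /^\s*(\S+)\s+since\s+(.+?)\s*$/.exec(units)
  if (!m) {
    throw new Error('Not a "<unit> since <origin>" string: ' + units)
  }
  return { unit: m[1], origin: m[2] }
}

var parsed = parseCfTimeUnits('seconds since 1980-01-01T00:00:00.0Z')
console.log(parsed.unit)   // "seconds"
console.log(parsed.origin) // "1980-01-01T00:00:00.0Z"
```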

@jonblower
Member

Since most of the WKT translation is boilerplate, how about:

"referencing" : [{
    "axes": ["t"],
    "origin": "1980-01-01T00:00:00.0Z",
    "unit": {
        "symbol": "second"
    },
    "calendar": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
}]

I think this has the information we need in realistic situations. The origin is an ISO8601-like string that is essentially a microsyntax for a time instant. The fields of this instant (years, months, days, etc) must be interpreted in the given calendar system. (Hence we can use the same microsyntax for identifying instants in other calendars, like Julian or 360-day. Strictly, ISO8601 is Gregorian-only.)

If we allow string-valued time axes we don't need the origin or unit but we do need the calendar.
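
A minimal sketch of how a client could turn a numeric axis value into a date from this compact form, assuming a Gregorian calendar and UTC. The helper `toDate` and the unit table are hypothetical, and the origin is written without the fractional second to stay within the date-time format that JavaScript's `Date.parse` is guaranteed to accept.

```javascript
// Milliseconds per unit. Only fixed-length units are listed; the symbols
// are assumptions for illustration, not a defined vocabulary.
var MS_PER_UNIT = {
  second: 1000,
  minute: 60 * 1000,
  hour: 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000
}

var GREGORIAN = 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'

// Hypothetical helper: converts a numeric axis value to a JavaScript Date.
// Leap seconds are ignored, as they are by Date itself.
function toDate (ref, value) {
  if (ref.calendar !== GREGORIAN) {
    throw new Error('Unsupported calendar: ' + ref.calendar)
  }
  var msPerUnit = MS_PER_UNIT[ref.unit.symbol]
  if (msPerUnit === undefined) {
    throw new Error('Unsupported unit: ' + ref.unit.symbol)
  }
  return new Date(Date.parse(ref.origin) + value * msPerUnit)
}

var ref = {
  axes: ['t'],
  origin: '1980-01-01T00:00:00Z',
  unit: { symbol: 'second' },
  calendar: GREGORIAN
}
console.log(toDate(ref, 86400).toISOString()) // "1980-01-02T00:00:00.000Z"
```

Anything other than Gregorian is rejected here because `Date` has no notion of other calendars; a real client would need a calendar-aware library for those.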

@letmaik
Member Author

letmaik commented Nov 16, 2015

I see you skipped the time standard. So we assume UTC by default which makes sense. Would it be correct to have an explicit time standard at the same level as origin, unit, calendar? Meaning, could I use e.g. UT1 together with a Gregorian calendar? If so, we're fine and it is easily extensible if we should need it at some point.

@jonblower
Member

I guess we could have the time standard in there, but I don't remember ever seeing any data that wasn't UTC. I must admit I don't really understand all the issues involved. Chris Little (Met Office) would be the person to talk to.

@letmaik
Member Author

letmaik commented Nov 16, 2015

Ok, I know we're Earth-focused but I'd like to keep it flexible. The CDF format has a special time data type which was created to get rid of leap-second problems. Again, I know it's not relevant for our immediate use cases, but still! :) It says:

CDF_TIME_TT2000, defined as an 8-byte signed integer with a fixed Time_Base=J2000 (Julian date 2451545.0 TT or 2000 January 1, 12h TT), Resolution=nanoseconds, Time_Scale=Terrestrial Time (TT), Units=nanoseconds, Reference_Position=rotating Earth Geoid. Given a current list of leap seconds, conversion between TT and UTC is straightforward (TT = TAI + 32.184s; TT = UTC + deltaAT + 32.184s, where deltaAT is the sum of the leap seconds since 1960; for example, for 2009, deltaAT = 34s).

I think that would be equal to:

"referencing" : [{
    "axes": ["t"],
    "origin": "2000-01-01T12:00:00Z",
    "timeScale": "https://en.wikipedia.org/wiki/Terrestrial_Time",
    "unit": {
        "symbol": "nanosecond"
    },
    "calendar": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
}]

(Note in the quote "2000 January 1, 12h" refers to a Gregorian date, not Julian, so this is correct)

Funny thing, some time standards don't know time zones, like TT above, making a thing like "Z" or a timezone offset invalid and the allowed syntax for "origin" would be slightly different.

@letmaik
Member Author

letmaik commented Nov 16, 2015

Although, UTC doesn't know time zones either, it's just an encoding of a local time with an offset to the time scale... so maybe let's ignore that. The origin above would still be "origin": "2000-01-01T12:00:00Z" with the Z. Everything fine...
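
To make the TT case concrete, here is a minimal sketch of the conversion from the CDF quote (TT = UTC + deltaAT + 32.184s). The function name `tt2000ToDate` is hypothetical, and deltaAT is passed in by the caller on the assumption that it comes from a leap-second table valid for the instant in question; nothing here looks it up.

```javascript
// Fixed offset between TT and TAI, in milliseconds (TT = TAI + 32.184 s).
var TT_MINUS_TAI_MS = 32184

// Hypothetical helper: converts a CDF_TIME_TT2000-style value (nanoseconds
// since 2000-01-01T12:00:00 TT, given as a BigInt) to a JavaScript Date.
// deltaAT (TAI - UTC in whole seconds) must be supplied by the caller.
function tt2000ToDate (ttNanos, deltaAT) {
  // Read the origin's calendar fields as if they were UTC, then correct.
  var originMs = Date.UTC(2000, 0, 1, 12, 0, 0)
  var elapsedMs = Number(ttNanos / 1000000n) // drops the sub-millisecond part
  return new Date(originMs + elapsedMs - (deltaAT * 1000 + TT_MINUS_TAI_MS))
}

// At the TT2000 origin itself, deltaAT was 32 s.
console.log(tt2000ToDate(0n, 32).toISOString()) // "2000-01-01T11:58:55.816Z"
```

The result shows why the time scale matters for "origin": the same calendar fields read in TT and in UTC differ by about 64 seconds.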

@letmaik
Member Author

letmaik commented Nov 16, 2015

I think we should put the time referencing within a field similar to "crs", then it is possible to reuse it and define it independently, which is important since I need to reuse referencing in the domain but also the parameters.

What about "trs"? As in temporal reference system. Would this adequately cover both cases of a temporal CRS and just a calendar with string encoding? That's how it's named in ISO 19108:2002 but of course I can't check it since it's not free...

@jonblower
Member

I think "trs" is fine for now. "crs" might also do, since a trs is just a subtype of crs. It may be that "calendar with string encoding" is stretching the ISO definition of a trs but I don't think that's too important for now.

@letmaik
Member Author

letmaik commented Nov 16, 2015

Just found https://github.com/52North/PostTIME/wiki/Temporal-Reference-Systems which confirms that. Hm, I think the correct hierarchy is like that:

CRS = abstract thing for referencing coordinates
SRS (spatial RS) = CRS with datum and a spatial CS (coordinate system)
TRS (temporal RS) = CRS with calendar and either time CS or string encoding

So we could use crs for both or use srs and trs. I'd say crs for both and distinguishing it by the type:

"referencing" : [{
    "axes": ["t"],
    "crs": {
      "type": "TemporalCRS",
      "origin": "1980-01-01T00:00:00.0Z",
      "unit": {
        "symbol": "second"
      },
      "calendar": "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
    }
}]

This is also more in line with WKT.

@jonblower
Member

Looks good!

@letmaik
Member Author

letmaik commented Nov 16, 2015

Hm, or maybe not. I was just looking at http://docs.geotools.org/stable/javadocs/org/opengis/referencing/ReferenceSystem.html and basically crs always means there are "coordinates" in some coordinate system. So rather:

RS = abstract thing for referencing things
CRS = abstract thing for referencing coordinates in a coordinate system
SRS (spatial RS) = abstract thing for referencing spatial things, but often synonymous with spatial CRS
spatial CRS = SRS and CRS with datum and a spatial CS (coordinate system)
TRS (temporal RS) = abstract thing for referencing time things
Calendar = TRS which is a calendar
Temporal CRS = TRS and CRS with calendar and time CS

I guess the spatial equivalent of a Calendar (so not a CRS but something else) could be a Geocoding SRS, which allows place names as referenceable things and references them to a position in the world.

SO, that means...

We should use "srs" and "trs". And both can be a CRS but don't have to. Of course we could also just use "rs" then but I think this is getting a little too wild and there is actually some value in being able to say that this is about time or spatial without looking further inside.

@letmaik
Member Author

letmaik commented Nov 16, 2015

And that also means we shouldn't use "coordinates" as field name in the domain axes, but instead as you suggested earlier "values", since these may be categorical things, or other non-coordinate things.

@letmaik
Member Author

letmaik commented Nov 16, 2015

What I'm still trying to find is a TRS subtype that combines a calendar, a time standard and a string notation. Maybe... StringTRS, which would have as subtype ISO8601InstantTRS (or rather xsd:dateTime!) or something like that.

@jonblower
Member

My instinct would be to say that we use "Temporal CRS" for a proper units+datum CRS with numeric values and "Temporal RS" (i.e. the supertype) for string-valued axes, where we only need the calendar system and time standard. It's not wrong, just not very specific.

@letmaik
Member Author

letmaik commented Nov 16, 2015

Ok, I can live with that.

@letmaik
Member Author

letmaik commented Nov 17, 2015

I'm implementing it into the spec now, I think it would be better to use "calendar": "Gregorian" instead of the full URL since it will be so common. We can still associate it to the full URL via the JSON-LD context with "@type": "@vocab".
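
A rough sketch of what such a context could look like, written as a JavaScript object. The IRI given to the "calendar" property is a placeholder assumption; only the Gregorian URI is taken from the examples in this thread. The `expandCalendar` function is a hypothetical stand-in for the term lookup a JSON-LD processor performs on "@vocab"-typed values, not a JSON-LD implementation.

```javascript
// Sketch of a JSON-LD context. The "@id" of "calendar" is a placeholder.
var context = {
  calendar: { '@id': 'http://example.org/def#calendar', '@type': '@vocab' },
  Gregorian: 'http://www.opengis.net/def/uom/ISO-8601/0/Gregorian'
}

// Hypothetical stand-in: resolves a short calendar name via the context.
// It only covers the term lookup; unknown values are returned unchanged.
function expandCalendar (value, ctx) {
  return typeof ctx[value] === 'string' ? ctx[value] : value
}

console.log(expandCalendar('Gregorian', context))
// "http://www.opengis.net/def/uom/ISO-8601/0/Gregorian"
```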

@jonblower
Member

Good idea.

@letmaik
Member Author

letmaik commented Nov 17, 2015

OK I'm more or less done with the trs object definition. The only open point is:

  • If the temporal RS object has the type "TemporalRS" then the referenced values must be strings conforming to the syntax of XXX

If it's a CRS, then for the "origin" field I require an RFC 3339 date-time string (YYYY-MM-DDTHH:MM:SS[.F]Z), but should the same be required for the referenced values? This probably touches on the issue of uncertainty if we also allowed YYYY-MM-DD or similar things.

@jonblower
Member

I think that the origin should be assumed to be at least of second precision and sub-second precision could be given if necessary. But time coordinates could be of varying precisions, and I think we should specify that the precision is implied by the length of the string (e.g. YYYY, YYYY-MM, YYYY-MM-DD etc). Alternatively/additionally there could be a “precision” field on the time axis that specifies the level of precision that clients should assume.

(The use of dekads - 10-day periods - is reasonably common, and sometimes data are presented as averages over the dekad. This level of precision isn’t easy to express as truncated ISO-like strings. So there may be an argument for “precision=P1D” or something like that.)

@letmaik
Member Author

letmaik commented Nov 17, 2015

(The use of dekads - 10-day periods - is reasonably common, and sometimes data are presented as averages over the dekad. This level of precision isn’t easy to express as truncated ISO-like strings. So there may be an argument for “precision=P1D” or something like that.)

Hm, if the data is an average over a dekad then the bounds would already cover this use case, there is no precision issue here I think. This is equal to WaterML2's rainfall averages over the last period (a day for example).

What would be an example where a dekad is used and the data is not an average?

@jonblower
Member

Well, things are not always simple averages. Data are often simply "representative" of a time period - sometimes we simply don't have more information or precision than that. A use case might be a composite satellite image that composes all the images from that day - the resulting image is not an average, but it would not make any sense to claim that the image was anchored to a millisecond instant. It is just "that day's data".

More generally, I was thinking that if we use the length of the ISO8601-like string as our indicator of precision, that only gives us precisions of 1 year, 1 month, 1 day, etc. We might want to be more flexible than that.

@letmaik
Member Author

letmaik commented Nov 17, 2015

How about forcing a full date-time and having a field for precision as you suggested already? This would mean that the coordinate must be exactly in the middle of the precision interval. If you had a week precision, then that would mean for this week the coordinate would be "2015-11-19T12:00:00Z" (I think) with precision="7 days". So, the advantage is that it's more flexible, disadvantage is that it's more complex for year,month,day etc. precision in terms of data preparation compared to truncation. I see some issues with this though... people will just put in 2010-01-01T00:00:00Z with precision=1year and expect the equivalent of ISO's "2010" whereas it would be completely different and would refer to the year between the middle of 2009 and the middle of 2010. Do you think this would happen?
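
A minimal sketch of the "coordinate in the middle of the precision interval" reading, assuming the precision is given as a number of days. The helper `centredInterval` is a hypothetical name used only to show the arithmetic.

```javascript
var MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: the interval implied by a full date-time that sits
// in the middle of a precision window given in days.
function centredInterval (isoString, precisionDays) {
  var centre = Date.parse(isoString)
  var half = precisionDays * MS_PER_DAY / 2
  return {
    start: new Date(centre - half).toISOString(),
    end: new Date(centre + half).toISOString()
  }
}

var week = centredInterval('2015-11-19T12:00:00Z', 7)
console.log(week.start) // "2015-11-16T00:00:00.000Z"
console.log(week.end)   // "2015-11-23T00:00:00.000Z"

// The pitfall: a year-long window centred on 1 January 2010 straddles two years.
var year = centredInterval('2010-01-01T00:00:00Z', 365)
console.log(year.start) // "2009-07-02T12:00:00.000Z"
console.log(year.end)   // "2010-07-02T12:00:00.000Z"
```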

Fact is that ISO8601 defines that truncation means reduced precision and not "filling up with zeros". So we could use that for the simple year,month,day etc cases. We would not say the date is in an arbitrary ISO8601 syntax but instead a subset of it:
YYYY[-MM[-DD[THH[:mm[:ss[.F]]](Z|(+|-)HH:mm)]]]
Then it would be easy for the client to determine the precision (common time libraries don't extract the precision info):

var x = "2000-01-01T01Z"
var p = "unknown"
if (x.length === 4) {
  p = "year"
} else if (x.length === 7) {
  p = "month"
} else if (x.length === 10) {
  p = "day"
} else {
  if (x[x.length-1] === "Z") {
    if (x.length === 14) {
      p = "hour"
    } else if (x.length === 17) {
      p = "minute"
    } else if (x.length === 20) {
      p = "second"
    } else {
      p = "subsecond"
    }
  } else {
    if (x.length === 19) {
      p = "hour"
    } else if (x.length === 22) {
      p = "minute"
    } else if (x.length === 25) {
      p = "second"
    } else {
      p = "subsecond"
    }
  }
}
console.log(p)

So maybe an "easy" truncation syntax but also the complex one for more advanced cases. Obviously then when the "precision" field is given, the time string would be filled up with zeros in case something like second fraction is missing.
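
By contrast, the truncation reading ("reduced precision", not "filling up with zeros") can be sketched as below. The helper `impliedInterval` is hypothetical and only handles the date-only forms; note also that `Date.UTC` maps years 0 to 99 to 1900 to 1999, so such years are outside this sketch.

```javascript
// Hypothetical helper: the half-open interval [start, end) implied by a
// truncated date string (YYYY, YYYY-MM or YYYY-MM-DD only).
function impliedInterval (s) {
  var m = /^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(s)
  if (!m) {
    throw new Error('Unsupported date string: ' + s)
  }
  var year = Number(m[1])
  var month = m[2] === undefined ? 0 : Number(m[2]) - 1 // zero-based month
  var day = m[3] === undefined ? 1 : Number(m[3])
  var end
  if (m[3] !== undefined) {
    end = Date.UTC(year, month, day + 1) // day precision
  } else if (m[2] !== undefined) {
    end = Date.UTC(year, month + 1, 1)   // month precision
  } else {
    end = Date.UTC(year + 1, 0, 1)       // year precision
  }
  return {
    start: new Date(Date.UTC(year, month, day)).toISOString(),
    end: new Date(end).toISOString()
  }
}

console.log(impliedInterval('2010').start)    // "2010-01-01T00:00:00.000Z"
console.log(impliedInterval('2010').end)      // "2011-01-01T00:00:00.000Z"
console.log(impliedInterval('2010-02').end)   // "2010-03-01T00:00:00.000Z"
```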

@jonblower
Member

So maybe an "easy" truncation syntax but also the complex one for more advanced cases.

Yes, I agree with this.

By the way, I’m not sure we can always assume that the year is 4 characters long. Some cases (e.g. BC dates, dates far in the future) may violate this, even if ISO8601 does not allow them.

@letmaik
Member Author

letmaik commented Nov 18, 2015

Ok, stepping into dangerous territory. I guess this could be for some paleoclimate timeseries. Most datetime libraries couldn't parse dates such as "20000" since 9999 is the ISO8601 maximum. So I think we should make this extended date notation a special case that simple clients can easily distinguish from ISO8601 dates. The question is how. And of course this would also apply to the CRS case with "origin", making it way more complex. An example axis may be going from "-12300" to "-3800" in 100 year steps, and I guess you would not use the minus sign but instead define a BC paleoclimate calendar. Let's forbid minus signs for now.

So, a practical way to make this work is to find a way to easily detect ISO8601-compatible dates. A simple client would check "if calendar=gregorian and dates=iso-string -> happy, calculate with dates; else -> just display dates/coordinates as is".

Checking for ISO compatible dates is simple (you would have to check all axis values like that):

var d = "2000" // "20000", "2000-02" ...
var isISO8601 = (d.length === 4 || d[4] === '-')

It gets trickier when a temporal CRS is used which has an ISO8601 "origin" but extends at some point above 9999. Thinking about it, the same applies to the strings as well. I thought earlier it would be enough to check the first axis value but this is only enough to extract the precision info. So... not sure.
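
For the numeric case, a check over all axis values might look like the following sketch. It assumes a "seconds since origin" axis with a Gregorian calendar and UTC; `staysWithinIsoYears` is a hypothetical name.

```javascript
// Hypothetical helper: true if every numeric value on a "seconds since
// origin" axis maps to a year within 0..9999, i.e. could be written as a
// plain 4-digit-year ISO8601 date.
function staysWithinIsoYears (origin, secondsValues) {
  var originMs = Date.parse(origin)
  return secondsValues.every(function (v) {
    // Out-of-range dates yield NaN, which fails both comparisons.
    var year = new Date(originMs + v * 1000).getUTCFullYear()
    return year >= 0 && year <= 9999
  })
}

var origin = '1980-01-01T00:00:00Z'
var secondsPerYear = 365.2425 * 24 * 60 * 60 // mean Gregorian year
console.log(staysWithinIsoYears(origin, [0, 100 * secondsPerYear]))  // true
console.log(staysWithinIsoYears(origin, [0, 9000 * secondsPerYear])) // false
```

Checking every value rather than just the first one is what makes this safe for axes that start inside the ISO range and later leave it.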

@letmaik
Member Author

letmaik commented Dec 22, 2015

Got an idea... how about: We only allow date strings if the dates are ISO8601. For anything else (like non 4-digit years, or non-Gregorian dates) a temporal CRS has to be used. This would simplify it a lot for clients I think. They don't have to think about custom parsing and we don't have to think so much about how to name the ISO notation to differentiate it from something else (where the "else" is not commonly defined). Also, this makes it easier to compare non-ISO date values since they are just numbers and the client has the choice to understand the temporal CRS or treat it as numbers as well, and just append an axis label or something.

@jonblower
Member

This could be a good idea, although there still needs to be a date-time string for the temporal datum (i.e. the "X" in "days since X"). So there would still have to be a syntax for serialising non-Gregorian dates, even if it is "compartmentalised" only to the datum.

@letmaik
Member Author

letmaik commented Jan 20, 2016

I think this issue is slowing down progress too much. Do we really need to settle on temporal CRS now? We could support just

{
  "type": "TemporalRS",
  "calendar": "Gregorian"
}

for the first release, which is mostly underspecified but which clients can handle on a best-efforts basis (trying to parse axis value strings as dates if calendar = gregorian; otherwise leave as-is and treat as categorical).
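
A minimal sketch of that best-efforts handling on the client side. The function `interpretTimeAxis` and the "kind" labels are hypothetical, and the second example's calendar name is only a stand-in for "something other than Gregorian". Parsing relies on `new Date(string)`, which is only well defined for ISO-style strings.

```javascript
// Hypothetical client-side helper: parse axis values as dates when the
// reference system is a Gregorian TemporalRS, otherwise return them
// unchanged so they can be treated as categorical labels.
function interpretTimeAxis (rs, values) {
  var isGregorian = rs.type === 'TemporalRS' && rs.calendar === 'Gregorian'
  if (!isGregorian) {
    return { kind: 'categorical', values: values }
  }
  var dates = values.map(function (v) { return new Date(v) })
  var allValid = dates.every(function (d) { return !isNaN(d.getTime()) })
  return allValid
    ? { kind: 'dates', values: dates }
    : { kind: 'categorical', values: values }
}

var rs = { type: 'TemporalRS', calendar: 'Gregorian' }
var axis = ['2015-01-01T00:00:00Z', '2015-02-01T00:00:00Z']
console.log(interpretTimeAxis(rs, axis).kind) // "dates"

var other = { type: 'TemporalRS', calendar: '360_day' }
console.log(interpretTimeAxis(other, ['2015-02-30']).kind) // "categorical"
```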

@jonblower
Member

I think this is OK for the first release. We don't have a use case in MELODIES (yet) for data with weird climate TRSs (like 360-day calendars).

@letmaik
Member Author

letmaik commented Jan 25, 2016

For reference, the current temporal CRS design that was removed from the spec:

{
  "type": "TemporalCRS",
  "calendar": "Gregorian",
  "origin": "1980-01-01T00:00:00Z",
  "unit": {
    "symbol": "s"
  }
}

letmaik added a commit that referenced this issue Jan 25, 2016
letmaik added a commit that referenced this issue Jan 25, 2016
@letmaik letmaik added this to the future milestone Jan 25, 2016
@letmaik
Member Author

letmaik commented Jan 25, 2016

I adapted the relevant section, looks quite good to me and keeps it open for future extensions. In doing that I discovered also that ISO8601 defines an extended year (>4 digits) notation (with year precision only in that case) which is e.g. +100000, or -10000. So since it requires a + or - it is also easy to detect if this is used. I defined a subset of ISO8601 notations in the spec which should cover most cases. Did I miss anything?

@letmaik
Member Author

letmaik commented Jan 25, 2016

Correction: ISO extended years are not restricted to year precision. Wikipedia is a bit wrong there, I checked the standard. So we could include the month as well, not sure if climatological data needs that precision though.
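
A minimal sketch of how a client might tell the two year notations apart, assuming extended years always carry a sign and have more than 4 digits, optionally followed by a month. The function `yearNotation` and its return labels are hypothetical.

```javascript
// Hypothetical helper: classifies a date string by its year notation.
// A leading "+" or "-" marks the ISO8601 extended (more than 4 digit) form.
function yearNotation (s) {
  if (/^[+-]\d{5,}(-\d{2})?$/.test(s)) {
    return 'extended'
  }
  if (/^\d{4}($|-)/.test(s)) {
    return 'plain'
  }
  return 'unknown'
}

console.log(yearNotation('+100000'))   // "extended"
console.log(yearNotation('-10000-06')) // "extended"
console.log(yearNotation('2000-02'))   // "plain"
console.log(yearNotation('20000'))     // "unknown"
```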

@jonblower
Member

Looks good to me

@letmaik
Member Author

letmaik commented Feb 18, 2022

Closing this since the original issue has been resolved: time can be represented using ISO8601 notation, with both the calendar and the time scale specified using unique identifiers. The design is compatible with future iterations that support temporal CRSs based on a numeric axis, but that is a separate issue and can be left for the next version. Guidance from OGC, who defined URIs for such CRSs, will likely be helpful.

@letmaik letmaik closed this as completed Feb 18, 2022
@jonblower
Member

I agree with closing this, but just to note that #93 is an interesting topic that could evolve how we handle time
