Hi,

TL;DR: How can I add accessors for resampling/groupby/rolling?

I'm using xarray to handle netCDF data from a cloud radar. Many of the associated variables are usually shown in decibel (dB, i.e. log-space), while the original data is in linear space. As such, our reader (a wrapper around open_dataset with some preprocessing and attribute standardisation, since it handles the differences between the netCDF files of each manufacturer) by default converts the appropriate variables from linear- to log-space. So far so good.

Now, we often require some statistics, which should be calculated in linear space. For this purpose, I used accessors, replacing mean/std/... and adding skewness/kurtosis from scipy with versions that check for an existing units attribute ("U/units") and whether its value starts with dB. (Admittedly, this isn't great and pint units would be better, but it's doing the job for now, even though I'm not sure whether what we would like is supported, based on https://pint.readthedocs.io/en/stable/user/log_units.html.)

For illustration:

    import xarray as xr
    import numpy as np

    # extenddoc is a helper of ours that is not shown here

    def _dbmean(self, **kwargs):
        # kwargs are passed to the "normal" mean
        if isinstance(self, xr.Dataset):
            return self.apply(_dbmean, keep_attrs=True, **kwargs)
        else:
            # relies on a matching '_mean' DataArray accessor (not shown)
            if 'units' in self.attrs and self.attrs['units'].startswith('dB'):
                # parenthesise the power: attribute access binds tighter than **,
                # so 10**(self/10)._mean() would average in log-space
                return 10*np.log10((10**(self/10))._mean(**kwargs))
            return self._mean(**kwargs)

    @xr.register_dataset_accessor('mean')
    def mean_dataset_accessor(dataset, **kwargs):
        def mean(**kwargs):
            return _dbmean(dataset, **kwargs)
        doc = xr.core._aggregations.DatasetAggregations.mean.__doc__
        mean.__doc__ = extenddoc(doc)
        return mean

    @xr.register_dataset_accessor('_mean')
    def _mean_dataset_accessor(dataset, **kwargs):
        def _mean(**kwargs):
            return xr.core._aggregations.DatasetAggregations.mean(dataset, **kwargs)
        doc = xr.core._aggregations.DatasetAggregations.mean.__doc__
        _mean.__doc__ = extenddoc(doc)
        return _mean

The above is, in a way, simply for convenience, as we could always use .reduce with the appropriate function; it also reduces the need to import _dbmean whenever we deal with those data.

However, the same handling of linear- and log-space would be nice for rolling/grouping/resampling. Any input is appreciated.
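To make the rolling/groupby/resample request concrete, here is a minimal sketch of the desired behaviour written as a plain helper rather than an accessor. The names `is_db` and `linear_agg`, the `beam` coordinate and the sample values are hypothetical and only serve the illustration; the sketch assumes the dB check on the `units` attribute described above.

```python
import numpy as np
import pandas as pd
import xarray as xr


def is_db(da):
    """True if the 'units' attribute starts with 'dB' (same check as above)."""
    return str(da.attrs.get("units", "")).startswith("dB")


def linear_agg(da, wrap, agg="mean", **kwargs):
    """Run an aggregation of a rolling/groupby/resample object in linear space.

    `wrap` is a callable that builds the windowed or grouped object,
    e.g. ``lambda x: x.rolling(time=2)``. Hypothetical helper, not xarray API.
    """
    if not is_db(da):
        return getattr(wrap(da), agg)(**kwargs)
    linear = 10 ** (da / 10)                      # dB -> linear
    result = 10 * np.log10(getattr(wrap(linear), agg)(**kwargs))  # back to dB
    result.attrs = dict(da.attrs)                 # keep the original units
    return result


# Made-up sample data: six daily values in dBZ with a grouping label.
time = pd.date_range("2024-01-01", periods=6, freq="D")
z = xr.DataArray(
    [0.0, 10.0, 20.0, 30.0, 20.0, 10.0],
    dims="time",
    coords={"time": time, "beam": ("time", ["a", "a", "b", "b", "c", "c"])},
    attrs={"units": "dBZ"},
)

rolled = linear_agg(z, lambda x: x.rolling(time=2))
resampled = linear_agg(z, lambda x: x.resample(time="3D"))
grouped = linear_agg(z, lambda x: x.groupby("beam"))
```

Passing the wrapping step as a callable keeps one helper usable for all three cases, at the cost of a less convenient call than `ds.rolling(...).mean()`.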
Replies: 1 comment 3 replies
Does pint do the right thing when calling If not, I'd look into writing an accessor that handled the plotting nicely for you. Or perhaps there's a way to have your data in linear-space as pint arrays, and have matplotlib render the data in log-units?
I'm not an expert here but it seems a lot easier to keep the data in linear-space, and have all your computations be correct, and then transform as necessary to get the nice plot.
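As a rough sketch of that idea (the helper name `to_db`, the units string and the sample values are made up for illustration, not taken from your reader): keep the stored data linear, compute everything on it with the normal xarray methods, and convert only the final result for display.

```python
import numpy as np
import xarray as xr


def to_db(da, units="dB"):
    """Return a decibel copy of linear-space data, intended for display only."""
    out = 10 * np.log10(da)
    out.attrs = {**da.attrs, "units": units}
    return out


# Linear-space sample values (hypothetical units).
ze = xr.DataArray([1.0, 10.0, 100.0], dims="time", attrs={"units": "mm6 m-3"})

# Statistics use the ordinary xarray methods in linear space;
# the conversion to dB happens last.
mean_db = to_db(ze.mean(), units="dBZ")

# For a plot, convert at the last moment, e.g. to_db(ze, units="dBZ").plot()
```

With this layout rolling, groupby and resample need no special handling, since they always see linear data.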