Fix regrid example #305

Merged: 3 commits merged into COSIMA:main on Oct 24, 2023

Conversation

@anton-seaice (Collaborator) commented Oct 23, 2023

Closes #300

After the re-grid, the new index dimensions (x/y) are renamed to longitude and latitude. I added a line to explicitly delete the old longitude and latitude coordinates first to avoid there being a name conflict.

The fix is backward compatible: it works on both conda/analysis and analysis-unstable.
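For illustration, a minimal, self-contained sketch of the coordinate clean-up described above, using toy data and hypothetical names rather than the notebook's exact code:

```python
import numpy as np
import xarray as xr

# Toy stand-in for the regridded DataArray: the index dimensions are still
# x/y, but stale 1-D longitude/latitude coordinates survive from before the
# regrid (all names here are hypothetical).
ssh_regridded = xr.DataArray(
    np.zeros((2, 3)),
    dims=("y", "x"),
    coords={"longitude": ("x", [0.0, 1.0, 2.0]), "latitude": ("y", [-1.0, 1.0])},
)

# Delete the old coordinates first, then rename the index dimensions;
# renaming without the drop is what caused the name conflict.
ssh_regridded = ssh_regridded.drop_vars(["longitude", "latitude"])
ssh_regridded = ssh_regridded.rename({"x": "longitude", "y": "latitude"})
```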


review-notebook-app bot commented Oct 23, 2023 (conversation on ReviewNB):

navidcy commented on 2023-10-23T14:27:30Z
----------------------------------------------------------------

Can we deal with these warnings? They are trying to warn us about something and we're ignoring them. @dougiesquire?


anton-seaice commented on 2023-10-23T21:48:11Z
----------------------------------------------------------------

I don't think they are relevant for our circumstances.

  • We could just load the whole da into memory, it's small, but this makes the notebook harder/more confusing to modify later.
  • Or, if we leave it as is, the chunks seem fine - one chunk per file. I think the bottleneck in doing an operation on this da would be opening all the files, so one chunk per file seems fine (1 chunk per core would presumably be better, but we don't know how many cores the user has).

That said, I am definitely a dask amateur :)

dougiesquire commented on 2023-10-23T22:33:03Z
----------------------------------------------------------------

I think we want to understand what's going on here. The warnings are telling us that the cookbook is opening the data with dask chunks that split the chunking on disk, which is definitely not something we want to do. The default chunking of the cookbook should be the chunking on disk, so to me this looks like it could be a bug in either the cookbook or in xarray.

However, I can't reproduce the warnings. I don't get them with either analysis3-23.04 or analysis3-23.01. @anton-seaice, @navidcy, can you confirm that you do?

review-notebook-app bot commented Oct 23, 2023 (conversation on ReviewNB):

navidcy commented on 2023-10-23T14:27:31Z
----------------------------------------------------------------

Line #3.    ssh_1_regridded = tidy_coords(ssh_1_regridded)

This is just 1-2 operations, so let's not define a new function (e.g. tidy_coords); let's do them inline, one after the other.

Also, let's avoid numpy indexing (e.g. [:, 0]) and use xarray instead.
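For illustration, a small sketch of the difference between the two indexing styles, using a made-up 2-D coordinate rather than the notebook's variables:

```python
import numpy as np
import xarray as xr

# Hypothetical 2-D longitude coordinate on a (y, x) grid.
lon2d = xr.DataArray(np.tile([10.0, 20.0, 30.0], (2, 1)), dims=("y", "x"))

# numpy-style indexing drops the labels and dimension names:
lon_np = lon2d.values[0, :]    # plain ndarray

# the xarray equivalent keeps a labelled DataArray with its "x" dimension:
lon_xr = lon2d.isel(y=0)
```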


@navidcy added the 🛸 updating (An existing notebook needs to be updated) label on Oct 23, 2023

@anton-seaice (Collaborator, Author)

Ah thanks!

> However, I can't reproduce the warnings. I don't get them with either analysis3-23.04 or analysis3-23.01. @anton-seaice, @navidcy, can you confirm that you do?

They occur in analysis3-23.07

@dougiesquire (Collaborator)

> They occur in analysis3-23.07

Ah whoops - didn't notice 23.07 - ta

@anton-seaice (Collaborator, Author)

Adding a chunks={'time':'auto'} keyword argument to getvar removes the warning, which led me to look at this line in the cookbook's getvar:

xr_kwargs = {"chunks": _parse_chunks(ncfiles[0].NCVar)}

but I can't see anything obviously wrong there.
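For reference, the workaround mentioned above looks roughly like this; the experiment and variable names are placeholders rather than the notebook's actual ones, and the chunks keyword is simply passed through to xarray:

```python
import cosima_cookbook as cc

session = cc.database.create_session()

# Passing chunks through getvar overrides the cookbook's default chunking
# (derived from the variable's NetCDF _ChunkSizes) and silences the warning.
ssh = cc.querying.getvar(
    expt="01deg_jra55v13_ryf9091",   # hypothetical experiment name
    variable="sea_level",            # hypothetical variable name
    session=session,
    chunks={"time": "auto"},
)
```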

@dougiesquire (Collaborator) commented Oct 24, 2023

I think this is a bug in xarray. There's something funny going on with the calculation of the preferred_chunks attribute for NetCDF files. The implementation of preferred_chunks for NetCDF files is a new feature in v2023.09.0, which is why this warning has only started appearing recently. I'll keep digging, but for now I think we'll just have to accept (or ignore) this warning.

See update here: #305 (comment)

@navidcy (Collaborator) commented Oct 24, 2023

OK!

@dougiesquire (Collaborator) commented Oct 24, 2023

The issue comes about because NetCDF UNLIMITED dimensions can have ChunkSizes that are larger than the dimension size (as is the case for the time dimension in the data in this notebook and, it would seem, much of the COSIMA data), and the recently-added preferred_chunks implementation doesn't account for this. I'll open an issue with xarray.
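For anyone wanting to check this on their own files, a rough sketch of how the on-disk chunking can be compared with the dimension size using netCDF4 (file and variable names are placeholders):

```python
import netCDF4

# Compare a variable's on-disk chunk sizes with the (UNLIMITED) time
# dimension's current size.
with netCDF4.Dataset("ocean.nc") as nc:
    time = nc.dimensions["time"]
    print(time.isunlimited(), len(time))          # e.g. True, 12
    print(nc.variables["sea_level"].chunking())   # e.g. [512, 300, 360]: chunk > dim size
```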

See update here: #305 (comment)

@dougiesquire (Collaborator) commented Oct 24, 2023

Specifying chunks="auto" (or chunks={"time": "auto"}) as mentioned by @anton-seaice will avoid these warnings and is probably good practice in most cases anyway (note, however, that you can't do "auto" chunking with object dtype).

@anton-seaice (Collaborator, Author)

> Specifying chunks="auto" (or chunks={"time": "auto"}) as mentioned by @anton-seaice will avoid these warnings and is probably good practice in most cases anyway (note, however, that you can't do "auto" chunking with object dtype).

Maybe chunks='auto' should be the default in the cookbook, rather than loading the chunksizes from the file?

In this workbook, they re-chunk before doing processing, so probably no change needed for this one?

@dougiesquire (Collaborator) commented Oct 24, 2023

> Maybe chunks='auto' should be the default in the cookbook

I thought about this for Intake-ESM, but unfortunately object dtype data is too common for this to be a good default.

> In this workbook, they re-chunk before doing processing, so probably no change needed for this one?

I didn't get that far through the notebook. If that's the case, it's best to open the data with chunks as close to the final chunks as possible - see here

@anton-seaice (Collaborator, Author)

> > Maybe chunks='auto' should be the default in the cookbook
>
> I thought about this for Intake-ESM, but unfortunately object dtype data is too common for this to be a good default.
>
> > In this workbook, they re-chunk before doing processing, so probably no change needed for this one?
>
> I didn't get that far through the notebook. If that's the case, it's best to open the data with chunks as close to the final chunks as possible - see here

Thanks - I can't make the chunks='auto' change due to a cookbook issue, but will do chunks={"time": "auto"}

@navidcy (Collaborator) commented Oct 24, 2023

> > > Maybe chunks='auto' should be the default in the cookbook
> >
> > I thought about this for Intake-ESM, but unfortunately object dtype data is too common for this to be a good default.
> >
> > > In this workbook, they re-chunk before doing processing, so probably no change needed for this one?
> >
> > I didn't get that far through the notebook. If that's the case, it's best to open the data with chunks as close to the final chunks as possible - see here
>
> Thanks - I can't make the chunks='auto' change due to a cookbook issue, but will do chunks={"time": "auto"}

cc @angus-g, @micaeljtoliveira

@dougiesquire (Collaborator)

Sorry, I'm actually wrong about this being an issue in xarray. I didn't notice at first, but the average_* variables in the data have NetCDF _ChunkSizes = 512 (along time). The cosima-cookbook tries to open these data with the NetCDF chunking of the variable requested (in this case {'time': 1, 'yt_ocean': 300, 'xt_ocean': 360}), which does divide the chunks for the average_* variables. Of course, these variables aren't actually returned in this case, but the warning is still thrown.

The fix is to get rid of the logic in the cosima-cookbook that sets the default chunks to the variable's NetCDF chunks. This should be replaced with chunks={}, which opens each variable with its own NetCDF chunks.
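For illustration, a hedged sketch of the difference; the file name is a placeholder and the explicit chunks mirror the ones quoted above:

```python
import xarray as xr

# chunks={} tells dask to use each variable's own on-disk (NetCDF) chunking,
# so no variable's storage chunks get split.
ds_native = xr.open_dataset("ocean.nc", chunks={})

# Forcing one variable's chunking onto the whole dataset, as the cookbook's
# current default effectively does, can split the storage chunks of other
# variables (e.g. the average_* ones) and trigger the warning.
ds_forced = xr.open_dataset(
    "ocean.nc",
    chunks={"time": 1, "yt_ocean": 300, "xt_ocean": 360},
)
```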

review-notebook-app bot commented Oct 24, 2023 (conversation on ReviewNB):
View / edit / reply to this conversation on ReviewNB

anton-seaice commented on 2023-10-24T02:48:47Z
----------------------------------------------------------------

@dougiesquire Does this new description make sense?


dougiesquire commented on 2023-10-24T02:58:09Z
----------------------------------------------------------------

Looks good.


@anton-seaice merged commit c9af3d9 into COSIMA:main on Oct 24, 2023
2 checks passed
@access-hive-bot

This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/xarray-warnings-while-loading-data-using-cosima-cookbook/2169/4

Labels: 🛸 updating (An existing notebook needs to be updated)
Development

Successfully merging this pull request may close these issues.

Regridding.ipynb fails in conda/analysis3-23.07
4 participants