Add new dev2/resave.py with sharding example #3

Merged: 9 commits merged on Jul 11, 2024


will-moore
Member

@joshmoore
Member

💯, @will-moore. You ok to just merge this when ready?

@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome2024-ngff-challenge/97363/17

@will-moore
Member Author

Currently trying to convert a tiny 3-image plate that I generated with omero-cli-zarr and it's failing with:

(zarr_v3) Williams-MacBook-Pro:dev2 wmoore$ python resave.py /Users/wmoore/Desktop/ZARR/data/plates/51.zarr/A/1/0.zarr plate_51_well.zarr
Traceback (most recent call last):
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/resave.py", line 120, in <module>
    convert_image(read_root, ns.input_path, ns.output_path)
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/resave.py", line 103, in convert_image
    convert_array(
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/resave.py", line 75, in convert_array
    }).result()
ValueError: FAILED_PRECONDITION: Error opening "zarr3" driver: Mismatch in "codecs": Cannot merge zarr codec constraints [] and [{"configuration":{"clevel":5,"cname":"lz4"},"name":"blosc"}]: Mismatch in number of bytes -> bytes codecs (0 vs 1) [source locations='tensorstore/driver/zarr3/codec/codec_chain_spec.cc:422\ntensorstore/driver/zarr3/codec/codec_chain_spec.cc:468\ntensorstore/driver/zarr3/metadata.cc:527\ntensorstore/driver/zarr3/metadata.cc:527\ntensorstore/driver/zarr3/driver.cc:584\ntensorstore/driver/kvs_backed_chunk_driver.cc:1262\ntensorstore/driver/driver.cc:112'] [tensorstore_spec='{\"context\":{\"cache_pool\":{},\"data_copy_concurrency\":{},\"file_io_concurrency\":{},\"file_io_sync\":true},\"create\":true,\"driver\":\"zarr3\",\"dtype\":\"uint8\",\"kvstore\":{\"driver\":\"file\",\"path\":\"plate_51_well.zarr/0/\"},\"metadata\":{\"chunk_grid\":{\"configuration\":{\"chunk_shape\":[2,1024,1024]},\"name\":\"regular\"},\"chunk_key_encoding\":{\"name\":\"default\"},\"codecs\":[{\"configuration\":{\"clevel\":5,\"cname\":\"lz4\"},\"name\":\"blosc\"}],\"data_type\":\"uint8\",\"dimension_names\":[\"c\",\"y\",\"x\"],\"node_type\":\"array\",\"shape\":[3,1024,1344]},\"transform\":{\"input_exclusive_max\":[[3],[1024],[1344]],\"input_inclusive_min\":[0,0,0],\"input_labels\":[\"c\",\"y\",\"x\"]}}']
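
The error reports a mismatch between an empty codec list and one that holds only blosc, which is a bytes -> bytes codec. In the Zarr v3 model a codec chain needs exactly one array -> bytes codec (such as `bytes` or `sharding_indexed`), with compressors like blosc placed after it. As a rough sketch only, and not necessarily the change made in this PR, sharded array metadata for this image could be assembled as below. The function name `sharded_metadata` and the inner chunk shape `[1, 256, 256]` are assumptions for illustration; the shape, dtype, dimension names, and blosc settings are copied from the spec in the error above. The result is a plain dict and is not validated against tensorstore here.

```python
def sharded_metadata(shape, shard_shape, inner_chunk_shape, dimension_names,
                     dtype="uint8"):
    """Build Zarr v3 array metadata whose codec chain uses sharding.

    Hypothetical helper: the outer chunk grid holds the shard shape, and the
    sharding codec holds the inner (read) chunk shape.
    """
    if not (len(shape) == len(shard_shape) == len(inner_chunk_shape)):
        raise ValueError("shape, shard_shape and inner_chunk_shape must have equal rank")
    for shard, inner in zip(shard_shape, inner_chunk_shape):
        # Each shard must hold a whole number of inner chunks along every axis.
        if shard % inner != 0:
            raise ValueError(f"inner chunk {inner} does not divide shard {shard}")

    # Inner chain: one array -> bytes codec first, then the compressor.
    inner_codecs = [
        {"name": "bytes", "configuration": {"endian": "little"}},
        {"name": "blosc", "configuration": {"cname": "lz4", "clevel": 5}},
    ]
    # The shard index is stored uncompressed with a checksum.
    index_codecs = [
        {"name": "bytes", "configuration": {"endian": "little"}},
        {"name": "crc32c"},
    ]
    return {
        "node_type": "array",
        "shape": list(shape),
        "data_type": dtype,
        "dimension_names": list(dimension_names),
        "chunk_grid": {
            "name": "regular",
            "configuration": {"chunk_shape": list(shard_shape)},
        },
        "chunk_key_encoding": {"name": "default"},
        "codecs": [
            {
                "name": "sharding_indexed",
                "configuration": {
                    "chunk_shape": list(inner_chunk_shape),
                    "codecs": inner_codecs,
                    "index_codecs": index_codecs,
                },
            }
        ],
    }


# Values taken from the failing spec, plus an assumed inner chunk shape.
metadata = sharded_metadata(
    shape=[3, 1024, 1344],
    shard_shape=[2, 1024, 1024],
    inner_chunk_shape=[1, 256, 256],
    dimension_names=["c", "y", "x"],
)
```

Keeping blosc inside the sharding codec's inner `codecs` list means each inner chunk is compressed on its own, so a reader can still fetch a single inner chunk from a shard.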

joshmoore and others added 3 commits July 10, 2024 12:36
Example:
```
./resave.py zarr/v0.4/idr0062A/6001240.zarr challenge/dev2/6001240.zarr/ \
    --input-overwrite \
    --input-bucket=idr \
    --input-endpoint=https://uk1s3.embassy.ebi.ac.uk \
    --input-anon \
    --output-bucket=EXAMPLE \
    --output-endpoint=https://MYHOST \
    --output-overwrite
```
@will-moore
Member Author

@joshmoore I'm trying to read from remote but write locally. With my last commit above, I now get:

$ python resave.py zarr/v0.4/idr0062A/6001240.zarr 6001240_from_remote.zarr --input-overwrite --input-bucket=idr --input-endpoint=https://uk1s3.embassy.ebi.ac.uk --input-anon

Traceback (most recent call last):
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/resave.py", line 167, in <module>
    read_root = zarr.open_group(store=STORES[0], zarr_format=2)
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/api/synchronous.py", line 175, in open_group
    sync(
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/sync.py", line 92, in sync
    raise return_result
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/sync.py", line 51, in _runner
    return await coro
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/api/asynchronous.py", line 523, in open_group
    return await AsyncGroup.open(store_path, zarr_format=zarr_format)
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/group.py", line 152, in open
    zgroup_bytes, zattrs_bytes = await asyncio.gather(
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/store/core.py", line 35, in get
    return await self.store.get(self.path, prototype=prototype, byte_range=byte_range)
  File "/Users/wmoore/Desktop/ZARR_PYTHON/zarr-python/src/zarr/store/remote.py", line 103, in get
    await (
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 1128, in _cat_file
    return await _error_wrapper(_call_and_read, retries=self.retries)
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 145, in _error_wrapper
    raise err
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 113, in _error_wrapper
    return await func(*args, **kwargs)
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 1115, in _call_and_read
    resp = await self._call_s3(
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 358, in _call_s3
    await self.set_session()
  File "/Users/wmoore/opt/anaconda3/envs/zarr_v3/lib/python3.10/site-packages/s3fs/core.py", line 519, in set_session
    self.session = aiobotocore.session.AioSession(**self.kwargs)
TypeError: AioSession.__init__() got an unexpected keyword argument '//'
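
The last frame shows s3fs unpacking `self.kwargs` into `AioSession(...)`, so the message implies that a key literally named `'//'` ended up in that dict. The snippet below is a minimal sketch of that Python mechanism only. It uses a stand-in class, `FakeSession`, and a helper, `unexpected_keys`, both hypothetical and unrelated to the real aiobotocore or s3fs code, and it does not diagnose where the stray key came from.

```python
class FakeSession:
    """Hypothetical stand-in for a class with a fixed keyword signature."""

    def __init__(self, session_vars=None, profile=None):
        self.session_vars = session_vars
        self.profile = profile


def try_create(kwargs):
    """Return the TypeError message raised by unpacking kwargs, or None."""
    try:
        FakeSession(**kwargs)
    except TypeError as err:
        return str(err)
    return None


def unexpected_keys(kwargs, allowed):
    """List the keys that the target signature would reject."""
    return sorted(key for key in kwargs if key not in allowed)


# A dict key does not have to be a valid identifier, so '//' survives
# until the moment the dict is unpacked into a call.
bad_kwargs = {"profile": None, "//": "uk1s3.embassy.ebi.ac.uk"}
message = try_create(bad_kwargs)
stray = unexpected_keys(bad_kwargs, allowed={"session_vars", "profile"})
```

Printing the kwargs dict just before the session is created, and filtering it the way `unexpected_keys` does, is one way to find which layer inserted the odd key.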

@joshmoore
Member

@will-moore: can you try updating from the zarr-python v3 branch? I'm not seeing this locally.

@will-moore
Member Author

I updated to the latest v3 branch:

cd ZARR_PYTHON/zarr-python/
git fetch origin
git checkout origin/v3
pip freeze | grep zarr
-e git+ssh://git@github.com/ome/ome-zarr-py.git@d5b37acd6b7bb246e173b24e48183b7df59e8d61#egg=ome_zarr
-e git+ssh://git@github.com/zarr-developers/zarr-python.git@33b158974a55f1818f27dcc9a3bd2135c51450ff#egg=zarr

but still see the same error.

Also tried...

$ pip freeze | grep s3fs
s3fs==2024.6.0
$ pip install -U s3fs
Successfully installed fsspec-2024.6.1 s3fs-2024.6.1

But still seeing the same result.
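
The same version checks can be done from inside the active environment, which avoids confusion when `pip` and `python` resolve to different environments. This is a small sketch using the standard library; the helper name `installed_versions` is hypothetical, and the package list is just the set of packages mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["zarr", "s3fs", "fsspec", "aiobotocore", "tensorstore"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```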

@joshmoore
Member

Hmmm.... and with a fresh conda/mamba environment?

channels:
  - conda-forge
dependencies:
  - 'numpy<2'
  - tensorstore # loads dependencies
  - zarr  # loads dependencies
  - pip
  - pip:
      - "--editable=git+https://github.com/will-moore/ome-zarr-py.git@zarr_v3_support#egg=ome-zarr"
      - "--editable=git+https://github.com/zarr-developers/zarr-python.git@v3#egg=zarr"
      - 'tensorstore>=0.1.63'

@will-moore
Member Author

environment.yml
name: zarr_python_v3
channels:
  - conda-forge
dependencies:
  - 'numpy<2'
  - tensorstore # loads dependencies
  - zarr  # loads dependencies
  - pip
  - pip:
      - "--editable=git+https://github.com/will-moore/ome-zarr-py.git@zarr_v3_support#egg=ome-zarr"
      - "--editable=git+https://github.com/zarr-developers/zarr-python.git@v3#egg=zarr"
      - 'tensorstore>=0.1.63'
conda env create -f environment.yml
...
Successfully installed MarkupSafe-2.1.5 aiobotocore-2.13.1 aiohttp-3.9.5 aioitertools-0.11.0 aiosignal-1.3.1 attrs-23.2.0 botocore-1.34.131 certifi-2024.7.4 charset-normalizer-3.3.2 click-8.1.7 cloudpickle-3.0.0 crc32c-2.4.1 dask-2024.7.0 distributed-2024.7.0 donfig-0.8.1.post1 frozenlist-1.4.1 fsspec-2024.6.1 idna-3.7 imageio-2.34.2 iniconfig-2.0.0 jinja2-3.1.4 jmespath-1.0.1 lazy-loader-0.4 locket-1.0.0 multidict-6.0.5 networkx-3.3 ome-zarr-0.9.1.dev0 packaging-24.1 partd-1.4.2 pillow-10.4.0 pluggy-1.5.0 psutil-6.0.0 pytest-8.2.2 python-dateutil-2.9.0.post0 pyyaml-6.0.1 requests-2.32.3 s3fs-2024.6.1 scikit-image-0.24.0 scipy-1.14.0 six-1.16.0 sortedcontainers-2.4.0 tblib-3.0.0 tensorstore-0.1.63 tifffile-2024.7.2 toolz-0.12.1 tornado-6.4.1 typing-extensions-4.12.2 urllib3-2.2.2 wrapt-1.16.0 yarl-1.9.4 zarr-3.0.0a1.dev29+g33b1589 zict-3.0.0 zstandard-0.22.0

$ conda activate zarr_python_v3

$ python resave.py zarr/v0.4/idr0062A/6001240.zarr 6001240_from_remote.zarr --input-overwrite --input-bucket=idr --input-endpoint=https://uk1s3.embassy.ebi.ac.uk --input-anon
Traceback (most recent call last):
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/resave.py", line 167, in <module>
    read_root = zarr.open_group(store=STORES[0], zarr_format=2)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/api/synchronous.py", line 175, in open_group
    sync(
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/sync.py", line 92, in sync
    raise return_result
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/sync.py", line 51, in _runner
    return await coro
           ^^^^^^^^^^
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/api/asynchronous.py", line 523, in open_group
    return await AsyncGroup.open(store_path, zarr_format=zarr_format)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/group.py", line 152, in open
    zgroup_bytes, zattrs_bytes = await asyncio.gather(
                                 ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/store/core.py", line 35, in get
    return await self.store.get(self.path, prototype=prototype, byte_range=byte_range)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/Desktop/NGFF/ome2024-ngff-challenge/dev2/src/zarr/src/zarr/store/remote.py", line 103, in get
    await (
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 1128, in _cat_file
    return await _error_wrapper(_call_and_read, retries=self.retries)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 145, in _error_wrapper
    raise err
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 113, in _error_wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 1115, in _call_and_read
    resp = await self._call_s3(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 358, in _call_s3
    await self.set_session()
  File "/Users/wmoore/opt/anaconda3/envs/zarr_python_v3/lib/python3.12/site-packages/s3fs/core.py", line 519, in set_session
    self.session = aiobotocore.session.AioSession(**self.kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: AioSession.__init__() got an unexpected keyword argument '//'

@joshmoore joshmoore merged commit 3baae54 into ome:main Jul 11, 2024
@joshmoore
Member

joshmoore commented Jul 16, 2024

Note: currently the wells need `yq -iP '.attributes.ome.version="0.5"' zarr.json -o json` updates to pass validation.

idr0001-2551.zarr $ find */* -maxdepth 1 -name zarr.json -exec grep version {} /dev/null \;
# hand-edited
A/1/zarr.json:      "version": "0.5",
A/2/zarr.json:      "version": "0.5",
A/3/zarr.json:      "version": "0.5",

idr0001-2551.zarr $ find */* -maxdepth 1 -name zarr.json -exec yq -iP '.attributes.ome.version="0.5"' {} -o json \;
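
For anyone without `yq`, a rough Python equivalent of the update above is sketched here. The function name `set_ome_version` is hypothetical, and the glob `*/*/zarr.json` is an assumption that wells sit two levels below the plate root (for example `A/1/zarr.json`), as in the listing above. It rewrites files in place, so run it on a copy first.

```python
import json
from pathlib import Path


def set_ome_version(plate_root, version="0.5", pattern="*/*/zarr.json"):
    """Set attributes.ome.version in each matching zarr.json.

    Returns the list of files that were actually changed.
    """
    updated = []
    for path in sorted(Path(plate_root).glob(pattern)):
        meta = json.loads(path.read_text())
        # Create the nested dicts if they are missing, like the yq assignment does.
        ome = meta.setdefault("attributes", {}).setdefault("ome", {})
        if ome.get("version") != version:
            ome["version"] = version
            path.write_text(json.dumps(meta, indent=2) + "\n")
            updated.append(path)
    return updated
```

Calling `set_ome_version("idr0001-2551.zarr")` would touch only the well-level files; files that already carry the requested version are left untouched.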

@joshmoore joshmoore mentioned this pull request Jul 16, 2024
1 task