Eliminate DoubleJobRegistry #863

Open
soxofaan opened this issue Sep 12, 2024 · 2 comments

Comments

@soxofaan
Member

DoubleJobRegistry was introduced as a temporary solution during the gradual migration from the ZooKeeper-based job registry to the Elasticsearch-based job registry. Now that we have fully eliminated the ZooKeeper-based job registry (right?), there is no need to keep DoubleJobRegistry as an additional layer of complexity/indirection.
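
For illustration only, here is a minimal sketch of the kind of wrapper being discussed. It assumes a "double" registry simply fans each call out to whichever backends are enabled. Every name in it (`InMemoryRegistry`, `DoubleRegistrySketch`, the `create_job`/`get_job` signatures) is hypothetical and not taken from the actual code base.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryRegistry:
    """Hypothetical stand-in for a real backend (ZooKeeper or Elasticsearch)."""

    def __init__(self, name: str):
        self.name = name
        self._jobs: Dict[Tuple[str, str], dict] = {}

    def create_job(self, job_id: str, user_id: str) -> dict:
        job = {"job_id": job_id, "user_id": user_id, "status": "created"}
        self._jobs[(user_id, job_id)] = job
        return job

    def get_job(self, job_id: str, user_id: str) -> dict:
        return self._jobs[(user_id, job_id)]


class DoubleRegistrySketch:
    """Illustrative wrapper (assumption): forwards calls to the enabled backends."""

    def __init__(
        self,
        legacy: Optional[InMemoryRegistry],
        primary: Optional[InMemoryRegistry],
    ):
        # Skip backends that are disabled (None) in the configuration.
        self._backends: List[InMemoryRegistry] = [
            b for b in (legacy, primary) if b is not None
        ]
        if not self._backends:
            raise ValueError("at least one backend must be enabled")

    def create_job(self, job_id: str, user_id: str) -> dict:
        # Write to every enabled backend; report the last backend's result.
        results = [b.create_job(job_id, user_id) for b in self._backends]
        return results[-1]

    def get_job(self, job_id: str, user_id: str) -> dict:
        # Read from the last enabled backend only.
        return self._backends[-1].get_job(job_id, user_id)


# With the legacy backend disabled, the wrapper is a pure pass-through.
primary = InMemoryRegistry("primary")
double = DoubleRegistrySketch(legacy=None, primary=primary)
double.create_job("j-1", "some-user")
```

Under these assumptions, once the legacy backend is always `None` the wrapper adds nothing that calling `primary` directly would not do, which is the argument for removing it.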

related to

@soxofaan
Member Author

oh wait, we're apparently still using ZK on Terrascope deploy

@soxofaan
Member Author

oh wait, we're apparently still using ZK on Terrascope deploy

just pushed a config update to also disable ZK registry on Terrascope

bossie added a commit that referenced this issue Sep 16, 2024
ZkJobRegistry was recently disabled on Terrascope but async_task directly relies on it.
Errors re: missing jobs are not considered fatal, by design.

{"message": "job not found; assuming user deleted it in the meanwhile", "levelname": "WARNING", "name": "__main__", "created": 1726475564.4724538, "filename": "async_task.py", "lineno": 283, "process": 12, "exc_info": "Traceback (most recent call last):\n  File \"/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/job_registry.py\", line 396, in _read\n    data, stat = self._zk.get(path)\n  File \"/opt/venv/lib64/python3.8/site-packages/kazoo/client.py\", line 1165, in get\n    return self.get_async(path, watch=watch).get()\n  File \"/opt/venv/lib64/python3.8/site-packages/kazoo/handlers/utils.py\", line 75, in get\n    raise self._exception\nkazoo.exceptions.NoNodeError: /openeo/integrationtests/jobs/ongoing/f689e77d-f188-40ca-b12b-3e278f0ad68f/j-2409161ca9c248e5986a11f20e61b26a\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/job_registry.py\", line 403, in _read\n    data, stat = self._zk.get(path)\n  File \"/opt/venv/lib64/python3.8/site-packages/kazoo/client.py\", line 1165, in get\n    return self.get_async(path, watch=watch).get()\n  File \"/opt/venv/lib64/python3.8/site-packages/kazoo/handlers/utils.py\", line 75, in get\n    raise self._exception\nkazoo.exceptions.NoNodeError: /openeo/integrationtests/jobs/done/f689e77d-f188-40ca-b12b-3e278f0ad68f/j-2409161ca9c248e5986a11f20e61b26a\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/async_task.py\", line 247, in main\n    job_info = registry.get_job(batch_job_id, user_id)\n  File \"/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/job_registry.py\", line 246, in get_job\n    job_info, _ = self._read(\n  File \"/opt/venv/lib64/python3.8/site-packages/openeogeotrellis/job_registry.py\", line 406, in _read\n    raise JobNotFoundException(job_id) from e\nopeneo_driver.errors.JobNotFoundException: The batch job j-2409161ca9c248e5986a11f20e61b26a does not exist.", "job_id": "j-2409161ca9c248e5986a11f20e61b26a"}
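
The traceback above suggests a lookup that tries the "ongoing" path first, falls back to "done", and only then raises `JobNotFoundException`, which the caller logs as a warning instead of failing. A rough sketch of that pattern follows, assuming a plain dict in place of ZooKeeper. The exception classes are local stand-ins for `kazoo.exceptions.NoNodeError` and `openeo_driver.errors.JobNotFoundException`, and the helper names `read_node`, `get_job` and `handle_job` are hypothetical, not the real implementation.

```python
import logging

logger = logging.getLogger("async_task_sketch")


class NoNodeError(Exception):
    """Local stand-in for kazoo.exceptions.NoNodeError."""


class JobNotFoundException(Exception):
    """Local stand-in for openeo_driver.errors.JobNotFoundException."""


def read_node(store: dict, path: str) -> dict:
    # A dict plays the role of the ZooKeeper tree in this sketch.
    try:
        return store[path]
    except KeyError:
        raise NoNodeError(path) from None


def get_job(store: dict, root: str, user_id: str, job_id: str) -> dict:
    # Try the "ongoing" location first, then fall back to "done".
    try:
        return read_node(store, f"{root}/ongoing/{user_id}/{job_id}")
    except NoNodeError:
        try:
            return read_node(store, f"{root}/done/{user_id}/{job_id}")
        except NoNodeError as e:
            # Chain the low-level error so the full cause stays visible.
            raise JobNotFoundException(job_id) from e


def handle_job(store: dict, root: str, user_id: str, job_id: str) -> bool:
    # A missing job is treated as non-fatal: log a warning and carry on.
    try:
        get_job(store, root, user_id, job_id)
    except JobNotFoundException:
        logger.warning(
            "job not found; assuming user deleted it in the meanwhile",
            exc_info=True,
        )
        return False
    return True


store = {"/jobs/done/u/j-1": {"status": "finished"}}
found = handle_job(store, "/jobs", "u", "j-1")
missing = handle_job(store, "/jobs", "u", "j-unknown")
```

In this sketch `found` is `True` and `missing` is `False`. With the ZooKeeper registry disabled, every lookup would take the second path, which would explain why the warning shows up for jobs that still exist elsewhere.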

#863