Method for overriding cached/checkpointed results #3534

Open
drewoldag opened this issue Jul 18, 2024 · 3 comments

@drewoldag

Is your feature request related to a problem? Please describe.
This is not necessarily a feature request, only a question of whether this feature exists but is difficult to find in the documentation. I am curious whether there is a way to override or dynamically bypass the use of cached or checkpointed results at runtime.

For instance, suppose a large workflow has been run on an input data set, and a bug is then found that affects a small portion of the task graph for a subset of the input data. Does there exist an API that would allow me to decide on the fly not to use the cached/checkpointed results?

In my current project, I can imagine this happening frequently: the vast majority of the task graph is correct, but one branch for one part of the data contains a bug or needs to be rerun. It would be nice to be able to do that programmatically instead of creating an ad hoc task graph just for that portion of the data.

Describe the solution you'd like
Information about whether such an API exists, and where in the documentation to read about it.

Describe alternatives you've considered
Currently just making use of caching and checkpointing.

@benclifford
Collaborator

There isn't an exposed API for this, but I've talked about "different kinds of checkpointing" with @WardLT and I think it is definitely interesting to get practical experience about what that really means...

In the codebase, there are two interesting places for checkpointing:

When a task completes, the 10 lines starting here are the code that stores the result inside the checkpointing system:

self.memoizer.update_memo(task_record, future)

and then on a subsequent run, to decide if a result has been stored already and can be reused, the DFK makes this call followed by an if statement:

memo_fu = self.memoizer.check_memo(task_record)

So if you wanted to try out your ideas, these two places would be where to start editing your own policies directly into your own version of Parsl. I don't know what that would mean for you - for example, perhaps you need to store more information in each checkpoint to make your decisions later on...
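To make the two hook points concrete, here is a minimal, self-contained sketch of the idea. It is not Parsl's code: `ToyMemoizer`, `run_task`, the `should_reuse` policy argument, and the simplified task record (a tuple of function name and arguments) are all hypothetical, and only the method names `check_memo` and `update_memo` are taken from the lines quoted above. The assumption is that a policy consulted inside `check_memo` would be enough to force selected tasks to rerun.

```python
from concurrent.futures import Future
from typing import Any, Callable, Dict, Optional, Tuple

# Simplified, hypothetical task record: (function name, positional args).
# Real Parsl task records carry far more information than this.
TaskRecord = Tuple[str, Tuple[Any, ...]]


class ToyMemoizer:
    """Toy stand-in for a memoizer with a pluggable reuse policy."""

    def __init__(self, should_reuse: Callable[[TaskRecord], bool] = lambda rec: True):
        self._table: Dict[TaskRecord, Future] = {}
        self._should_reuse = should_reuse  # hypothetical policy hook

    def check_memo(self, task_record: TaskRecord) -> Optional[Future]:
        # Returning None means "no usable stored result, run the task".
        if not self._should_reuse(task_record):
            return None
        return self._table.get(task_record)

    def update_memo(self, task_record: TaskRecord, future: Future) -> None:
        # Store (or overwrite) the result for this task record.
        self._table[task_record] = future


def run_task(memoizer: ToyMemoizer, func: Callable[..., Any], *args: Any) -> Any:
    """Mimics the 'check_memo call followed by an if statement' pattern."""
    task_record = (func.__name__, args)
    memo_fu = memoizer.check_memo(task_record)
    if memo_fu is not None:
        return memo_fu.result()  # reuse the stored result
    future: Future = Future()
    future.set_result(func(*args))  # actually execute the task
    memoizer.update_memo(task_record, future)
    return future.result()


calls = []  # records every real execution of square()


def square(x: int) -> int:
    calls.append(x)
    return x * x


# Example policy: never reuse stored results for calls whose arguments include 7.
memoizer = ToyMemoizer(should_reuse=lambda rec: 7 not in rec[1])
for x in (3, 7, 3, 7):
    run_task(memoizer, square, x)

print(calls)  # [3, 7, 7]: the second square(3) was served from the memo table
```

In this sketch the policy only sees the function name and arguments. A policy that needs more context (for example, which code version produced a result) would require storing that extra information alongside each entry in `update_memo`.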

@drewoldag
Author

Alright, I see what you mean. I would prefer not to have to maintain my own version of Parsl, but I believe I understand your point. Thanks for indicating the relevant parts of the code - I very much appreciate it!

@benclifford
Collaborator

@drewoldag I did a bit of rearranging based on previous thoughts I've had. Have a look at PR #3535 and especially this test case, which shows ignoring checkpoints for functions with an argument of 7:

https://github.com/Parsl/parsl/pull/3535/files#diff-9ed62adab093c1407c03b122791050ce53e134fda39a3bca347680347c73e4f9
