Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TaskVineExecutor: write function data to temp directory #3592

Merged
merged 7 commits into from
Sep 6, 2024

Conversation

colinthomas-z80
Copy link
Contributor

Description

This moves the working directory for TaskVine executor function data to a directory in /tmp

These files were previously written adjacent to the logging in the working directory. They may be written in excess when running a large number of tasks.

Type of change

Choose which options apply, and delete the ones which do not apply.

  • New feature

@benclifford
Copy link
Collaborator

I ran in my dev container and I got this surprise.

>       tmp_dir = os.path.join('/tmp/', f'{self.label}-{os.getlogin()}')
E       OSError: [Errno 6] No such device or address

@benclifford
Copy link
Collaborator

did some strace, and it's looking in this file which contains a decimalised unsigned 32 bit integer representation of signed 32-bit -1:

$ cat /proc/self/loginuid 
4294967295

and then failing after that.

The documentation for Python's os.getlogin says

https://docs.python.org/3/library/os.html#os.getlogin

For most purposes, it is more useful to use getpass.getuser()

and in my dev environment that is able to figure out who I am:

>>> import getpass
>>> getpass.getuser()
'benc'

@@ -215,7 +216,8 @@ def __create_data_and_logging_dirs(self):

# Create directories for data and results
log_dir = os.path.join(run_dir, self.label)
self._function_data_dir = os.path.join(run_dir, self.label, "function_data")
tmp_dir = os.path.join('/tmp/', f'{self.label}-{os.getlogin()}')
self._function_data_dir = os.path.join(tmp_dir, datetime.now().strftime('%Y%m%d%H%M%S%f'), "function_data")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more a reference for readers of #1122 that review comment for @colinthomas-z80 - here's another instance of using a timestamp as a unique ID that #1122 might want to address

@benclifford
Copy link
Collaborator

Github Actions testing shows that same OS error - it's not just my wacky dev environment.

@colinthomas-z80
Copy link
Contributor Author

I do recognize the potential issues using datetime. In this instance it is specific to milliseconds so a conflict is unlikely. My rationale was to follow a logging convention in taskvine that will show the logs in a directory listing by order of creation, so it is easier to locate the most recent run. Though these files are rarely useful for a user to inspect themselves. I can change to uuid if you think it conforms better.

@benclifford
Copy link
Collaborator

I do recognize the potential issues using datetime.

I don't think there's anything to be too concerned about here... mostly I was tagging it for the eventual implementer of #1122 to come fix it up.

@khk-globus
Copy link
Collaborator

I do recognize the potential issues using datetime.

I don't think there's anything to be too concerned about here... mostly I was tagging it for the eventual implementer of #1122 to come fix it up.

FWIW, this is precisely the purview of the tempfile module [docs.python.org]. Consider:

import tempfile
...
def ...:
    ...
    now_str = datetime.now().strftime(...)
    prefix = f"{self.label}-{getpass.getuser()}-{now_str}-"
    tmp_dir = tempfile.TemporaryDirectory(prefix=prefix, delete=False)

At which point tmp_dir will have the property of sortability you've stated was your goal and will also be guaranteed unique (that's tempfile's guarantee).

I'm not saying it's the best solution in this precise scenario, but I do observe that temporary files are not as immediately trivial as many assume at first glance — there's a reasion mktemp (command) and Python's tempfile module exist.

@colinthomas-z80
Copy link
Contributor Author

I like that option. I was also hoping to delete the directory at the end and it takes care of that. Thanks!

@benclifford
Copy link
Collaborator

I just merged this into my kubernetes test PR #3482 after getting distracted making task vine work in there, to give me some more assurance about how this works when workers do not have a shared filesystem with the submit side. It works ok.

@benclifford benclifford merged commit dc7911f into Parsl:master Sep 6, 2024
8 checks passed
yadudoc pushed a commit to yadudoc/parsl that referenced this pull request Sep 19, 2024
This moves the working directory for TaskVine executor function data to a directory in /tmp

These files were previously written adjacent to the logging in the working directory. They may be written in excess when running a large number of tasks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants