Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Ruff rule family "N" and "S" #3892

Merged
merged 73 commits into from
Aug 7, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Jun 23, 2024

Summary

Follow up #3871:

  • Enable Ruff rule family "N", fix violation of NPY002 (the use of legacy np.random)

    NumPy recommends using a dedicated Generator instance rather than the random variate generation methods exposed directly on the random module, as the new Generator is both faster and has better statistical properties.

  • Enable Ruff rule family "S", for security related checks.
  • Replace standard lib random with np.random

if ext.upper() in ["Z", "GZ"]:
with subprocess.Popen(["gunzip", dest]) as process:
if ext.upper() in {"Z", "GZ"}:
with subprocess.Popen(["/usr/bin/gunzip", dest]) as process:
Copy link
Contributor Author

@DanielYang59 DanielYang59 Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 26ccb85 need careful confirmation and discussion.

Ruff rule S607 prevents "partial executable path (relative path)" which might be hijacked by modifying the PATH.

However I'm concerned about the consistency of absolute paths for these binaries across different OS, as I'm not 100% sure they would reside in the same position.

Also applying this fix avoids the PATH being searched altogether (losing the flexibility to put binary at any dir in PATH), perhaps we want to ignore this rule instead?

@DanielYang59 DanielYang59 marked this pull request as draft August 5, 2024 15:04
@DanielYang59 DanielYang59 marked this pull request as ready for review August 5, 2024 15:27
"PBE64Base.yaml": "480c41c2448cb25706181de268090618e282b264",
"VASPIncarBase.yaml": "19762515f8deefb970f2968fca48a0d67f7964d4",
"vdW_parameters.yaml": "04bb09bb563d159565bcceac6a11e8bdf0152b79",
"MVLGWSet.yaml": "eba4564a18b99494a08ab6fdbe5364e7212b5992c7a9ef109001ce314a5b33db",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinging @mkhorton to let you know @DanielYang59 switched to a new hash algo for the VASP input sets (presumably recommended by some ruff rule?) and updated the known_hashes accordingly

Copy link
Contributor Author

@DanielYang59 DanielYang59 Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes #3892 (comment)

The primary vulnerability of SHA-1 is its collision resistance, which means that it is possible to find two different messages that produce the same hash value.

tests/test_cli.py Outdated Show resolved Hide resolved
urlretrieve(bader_url, "bader.tar.gz")
subprocess.call(["tar", "-zxf", "bader.tar.gz"])
urlretrieve(bader_url, "bader.tar.gz") # noqa: S310
subprocess.call(["/usr/bin/tar", "-zxf", "bader.tar.gz"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we can hard-code /usr/bin/tar here. leaving it at tar will search everything on the user's path, not just 1 directory

Copy link
Contributor Author

@DanielYang59 DanielYang59 Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was quite concerned about these changes as well (the absolute path consistency across different OS), #3892 (comment), as security is not a core concern for pymatgen (probably?). Explicitly using the absolute path could avoid PATH hijacking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janosh I'm still pretty concerned about commit 26ccb85 which includes using the absolute path for the following:

  • /usr/bin/gunzip
  • /usr/bin/gzip
  • /usr/bin/make
  • /usr/bin/tar
  • /bin/cp

There are pretty basic packages but owing to the huge diversities of Linux distro, I'm unsure if there would be any exception. According to ChatGPT 4o:

the Filesystem Hierarchy Standard (FHS) provides guidelines and standards for directory structures and file placements in Unix-like operating systems, including Linux. The FHS is maintained by the Linux Foundation.

Here are some relevant points from the FHS regarding the locations of these binaries:

/bin: This directory contains essential command binaries that are required for single-user mode and for all users (e.g., cp, ls, cat).

/usr/bin: This directory contains the majority of user commands. Binaries that are not essential for system boot or repair but are intended for use by all users are placed here. This includes binaries like gzip, gunzip, make, and tar.

It should be quite safe IMO.


fname = f"packmol_{dct['name']}.xyz"
mapping[fname] = mol.to(fmt="xyz")
if " " in str(fname):
# NOTE - double quotes are deliberately used inside the f-string here, do not change
Copy link
Contributor Author

@DanielYang59 DanielYang59 Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rkingsbury Can I get some confirmation on this change please, thanks:)

  1. I didn't see double quotes being used here, so I assume it's safe to remove this note. (The other note at line 172 is valid)
  2. I believe it's safe to remove the black tag fmt: on/off here? As I'm not seeing pre-commit formatting this file here (running black directly would indeed wrap some lines, but we're not replying on black being the formatter).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was probably carelessly refactored by me at some point as !r results in single quotes. thanks for pointing this out. let's go back to what i think was the original state

f'structure "{fname}"\n'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time I added this, packmol was VERY pecuiliar about things like single vs. double quotes in the input file. I believe I opened an issue about it at the time, and they fixed the behavior to be a little less finicky. In principle as long as the packmol unit tests pass I think we should be OK, but it might be wise to see if someone can do some additional manual testing as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the additional input and confirmation. And ping me at anytime if anything doesn't look right, thanks!

Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @DanielYang59! 👍

@janosh janosh enabled auto-merge (squash) August 7, 2024 17:38
@janosh janosh merged commit fa8d596 into materialsproject:master Aug 7, 2024
33 checks passed
@DanielYang59 DanielYang59 deleted the enable-ruff-s-npy branch August 8, 2024 03:08
@janosh janosh added the linting Linting and quality assurance label Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linting Linting and quality assurance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants