Overview page modules #532

Merged
merged 43 commits into from
Aug 14, 2023
Changes from 32 commits
5c34c3e
init script
milachae Aug 4, 2023
baafb65
added workflow linter + tests
milachae Aug 4, 2023
adb09ce
add requirements + fix tests
milachae Aug 4, 2023
4a0c398
simple module functions
milachae Aug 4, 2023
9f4d822
added filter option
milachae Aug 4, 2023
c69364b
working data fetching
milachae Aug 4, 2023
76798ad
display list of packages
Aug 7, 2023
afc6153
added data pickle test file
milachae Aug 7, 2023
90d7e55
adding comments
milachae Aug 7, 2023
722c1bc
adding markdown generator tests
milachae Aug 7, 2023
5b48d7c
improve MarkDown generator
milachae Aug 7, 2023
ca4927e
add dummy tests for module
milachae Aug 7, 2023
2921067
changed to 1 file
milachae Aug 7, 2023
8655c41
cleanup
milachae Aug 7, 2023
9aa9684
fix tests
milachae Aug 7, 2023
041a47d
Update workflow test
milachae Aug 8, 2023
b05396e
Remove files made by tests
milachae Aug 8, 2023
bf63315
adding comments
milachae Aug 8, 2023
31c75ae
adding tests
milachae Aug 8, 2023
24f9fd6
Apply typo suggestions
milachae Aug 9, 2023
8594510
update tests
milachae Aug 9, 2023
64015e6
remove commented lines
milachae Aug 9, 2023
4583515
remove commented line + indent fix
milachae Aug 9, 2023
dfa4342
fixing typos
milachae Aug 9, 2023
ebfe731
update tests
milachae Aug 9, 2023
f121a74
remove .pickle files from tests
milachae Aug 9, 2023
ffc46c4
update script/module_list/README.md
milachae Aug 9, 2023
58d300d
add swap tests
milachae Aug 10, 2023
20108ee
update README.md: 1st version testing
milachae Aug 10, 2023
cb34d74
Update README.md: version 2
milachae Aug 10, 2023
cb8885b
add license header
milachae Aug 10, 2023
09e197c
Merge branch 'main' into module_list
milachae Aug 10, 2023
5c606b5
Apply suggestions from code review
milachae Aug 11, 2023
f294575
update script workflow
milachae Aug 11, 2023
3850a21
apply code review suggestions
milachae Aug 11, 2023
5e33613
rename to module_overview
milachae Aug 11, 2023
d80202c
update test script
milachae Aug 11, 2023
e5b4536
fix argument error in lmod_mock script
milachae Aug 11, 2023
26b46d8
add code review suggestions
milachae Aug 11, 2023
6801b49
add prints
milachae Aug 11, 2023
e3af110
code revies suggestions
milachae Aug 14, 2023
8da6b1e
Apply suggestions from code review
milachae Aug 14, 2023
3bc8337
print message mentioning filename of module overview MarkDown file
boegel Aug 14, 2023
39 changes: 39 additions & 0 deletions .github/workflows/script_module_list.yml
@@ -0,0 +1,39 @@
name: Run Linter and tests
on: [push, pull_request]
# Declare default permissions as read only.
permissions: read-all
jobs:

  flake8-lint:
    runs-on: ubuntu-20.04
    name: Lint
    steps:
      - name: Check out source repository
        uses: actions/checkout@v3
      - name: Set up Python environment
        uses: actions/setup-python@v4
        with:
          python-version: "3.6"
      - name: flake8 Lint
        uses: py-actions/flake8@v2
        with:
          max-line-length: "120"
          path: "scripts/module_list"

  pytest-tests:
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          cd scripts/module_list
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Test with pytest
        run: |
          cd scripts/module_list
          ./test.sh
3 changes: 3 additions & 0 deletions scripts/README.md
@@ -0,0 +1,3 @@
Scripts that can be used to automatically generate markdown files can be found here.

* [`module_list`](module_list): script to generate an overview of available environment modules;
79 changes: 79 additions & 0 deletions scripts/module_list/README.md
@@ -0,0 +1,79 @@
# Module list
A script that generates a list of all available Lmod modules in markdown.
It also indicates on which cluster each package is available.

## Requirements
- Required Python packages are listed in the `requirements.txt` file.
- Lmod must be available, and `$LMOD_CMD` must specify the path to the Lmod binary.

## Usage
You can run the script with the following command:

```shell
python module_overview.py
```

## Testing
You can run the tests by running the `test.sh` script.
```shell
./test.sh
```

The tests make use of a mocked `$LMOD_CMD` script, which you can find [here](tests/data/lmod_mock.sh).

### Write tests
If you want to write additional tests and use the script effectively, follow these guidelines:

1. **Setting up Mocked Script Path:**

Before each test, ensure that you set the path to the mocked script.
This can be done within the `setup_class` function.
```python
path = os.path.dirname(os.path.realpath(__file__))

@classmethod
def setup_class(cls):
    os.environ["LMOD_CMD"] = cls.path + "/data/lmod_mock.sh"
```

2. **Handling `module avail cluster/` Output:**

The output of the command `module avail cluster/` can be put in a `.txt` file.
Set the path to this file in the `MOCK_FILE_AVAIL_CLUSTER` variable.
```python
os.environ["MOCK_FILE_AVAIL_CLUSTER"] = path + "/data/data_avail_cluster_simple.txt"
```

3. **Utilizing the Swap Command:**

To use the swap command, assign the path to the swap files to the `MOCK_FILE_SWAP` variable.
Ensure that the filename contains the placeholder `CLUSTER`,
which will later be replaced with the actual cluster name when performing the swap.

```python
os.environ["MOCK_FILE_SWAP"] = path + "/data/data_swap_CLUSTER.txt"
```
When swapping to, for example, the `cluster/dialga` cluster,
the `data_swap_dialga.txt` file is used as output for the swap command.
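
The placeholder substitution described above can be illustrated with a small Python sketch. The mock itself is a shell script that is not shown in this diff, so the helper name `swap_file_for` and the exact substitution rule (take the part after the last `/` of the module name) are assumptions made for illustration only.

```python
import os


def swap_file_for(cluster_module, template=None):
    """Hypothetical helper: resolve the mock swap file for a module such as 'cluster/dialga'."""
    # Fall back to the environment variable used by the tests when no template is given.
    if template is None:
        template = os.environ["MOCK_FILE_SWAP"]
    # 'cluster/dialga' -> 'dialga'
    cluster_name = cluster_module.split("/")[-1]
    # Replace the CLUSTER placeholder in the filename with the cluster name.
    return template.replace("CLUSTER", cluster_name)


print(swap_file_for("cluster/dialga", "tests/data/data_swap_CLUSTER.txt"))
```

With the template `tests/data/data_swap_CLUSTER.txt`, swapping to `cluster/dialga` would resolve to `tests/data/data_swap_dialga.txt`, which matches the naming used in the README.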

### Example
An example of a possible `setup_class` function is given below.
```python
import os

@classmethod
def setup_class(cls):
    os.environ["TESTS_PATH"] = cls.path
    os.environ["LMOD_CMD"] = cls.path + "/data/lmod_mock.sh"
    os.environ["MOCK_FILE_AVAIL_CLUSTER"] = cls.path + "/data/data_avail_cluster_simple.txt"
    os.environ["MOCK_FILE_SWAP"] = cls.path + "/data/data_swap_CLUSTER.txt"
```

This does multiple things:
1. Set the path of the tests folder in `$TESTS_PATH`.
2. Set the path to the `lmod_mock.sh` script in the environment variable `$LMOD_CMD`.
3. Set the output file for the `module avail cluster/` command in the `MOCK_FILE_AVAIL_CLUSTER` variable.
   The actual output can be found in the `data/data_avail_cluster_simple.txt` file.
4. Set the swap output files in the `MOCK_FILE_SWAP` variable.
   Files with swap output are named following the pattern `data/data_swap_CLUSTER.txt`.
   For example, `data/data_swap_dialga.txt` could be a possible file.
201 changes: 201 additions & 0 deletions scripts/module_list/module_overview.py
@@ -0,0 +1,201 @@
#
# Copyright 2009-2023 Ghent University
#
# This file is part of vsc_user_docs,
# originally created by the HPC team of Ghent University (http://ugent.be/hpc/en),
# with support of Ghent University (http://ugent.be/hpc),
# the Flemish Supercomputer Centre (VSC) (https://www.vscentrum.be),
# the Flemish Research Foundation (FWO) (http://www.fwo.be/en)
# and the Department of Economy, Science and Innovation (EWI) (http://www.ewi-vlaanderen.be/en).
#
# https://github.com/hpcugent/vsc_user_docs
#
# vsc_user_docs is licensed under a
# Creative Commons Attribution-ShareAlike 4.0 International License.
#
# You should have received a copy of the license along with this
# work. If not, see <http://creativecommons.org/licenses/by-sa/4.0/>.
#
"""
Python script to convert all available modules in lmod to a markdown list.

@author: Michiel Lachaert (Ghent University)
"""

import numpy as np
import os
import subprocess
from mdutils.mdutils import MdUtils
from typing import Union, Tuple


# --------------------------------------------------------------------------------------------------------
# Functions to run "module" commands
# --------------------------------------------------------------------------------------------------------

def module(*args, filter_fn=lambda x: x) -> np.ndarray:
    """
    Function to run "module" commands.

    @param args: Extra arguments for the module command.
    @param filter_fn: Filter function on the output.
    @return: Array with the output of the module command.
    """
    lmod = os.getenv('LMOD_CMD')

Review comment (Member):

No need to implement in this PR, but we should add some error handling here.
What if $LMOD_CMD is not set (or points to a non-existing path)?
What if the exit code of the Lmod command is non-zero?

    proc = subprocess.run(
        [lmod, "python", "--terse"] + list(args),
        encoding="utf-8",
        stderr=subprocess.PIPE,
        stdout=subprocess.PIPE
    )
    exec(proc.stdout)
    return filter_fn(np.array(proc.stderr.split()))
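
The error handling asked for in the review comment on `$LMOD_CMD` could look roughly like the sketch below. It is not part of this PR: the `LmodError` exception type and the `run_lmod` helper are hypothetical names, and treating any non-zero exit code as a failure is an assumption about how the Lmod command behaves.

```python
import os
import subprocess


class LmodError(RuntimeError):
    """Hypothetical exception type for failures when running Lmod (not part of the PR)."""


def run_lmod(*args):
    """Run $LMOD_CMD with the given arguments and return (stdout, stderr)."""
    lmod = os.getenv("LMOD_CMD")
    # Case 1: the environment variable is missing or empty.
    if not lmod:
        raise LmodError("$LMOD_CMD is not set")
    # Case 2: the environment variable points to a non-existing path.
    if not os.path.isfile(lmod):
        raise LmodError("$LMOD_CMD points to a non-existing path: %s" % lmod)

    proc = subprocess.run(
        [lmod, "python", "--terse"] + list(args),
        encoding="utf-8",
        stderr=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Case 3: the command ran but reported a failure (assumed: non-zero exit code).
    if proc.returncode != 0:
        raise LmodError("Lmod command failed with exit code %d: %s" % (proc.returncode, proc.stderr))
    return proc.stdout, proc.stderr
```

A caller such as `module()` could then catch `LmodError` and print a clear message instead of failing with a traceback from `subprocess`.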


def module_avail(name: str = "", filter_fn=lambda x: x) -> np.ndarray:
    """
    Function to run "module avail" commands.

    @param name: Possible module name.
    @param filter_fn: Filter on the output.
    @return: List of all available modules of name, or all if name is not given.
    """
    return module("avail", name, filter_fn=filter_fn)


def module_swap(name: str) -> None:
    """
    Function to run "module swap" commands.

    @param name: Name of module you want to swap to.
    """
    module("swap", name)


# --------------------------------------------------------------------------------------------------------
# Fetch data
# --------------------------------------------------------------------------------------------------------

def filter_fn_gent_cluster(data: np.ndarray) -> np.ndarray:
    """
    Filter function for output of "module avail" commands on HPC-UGent infrastructure.

    Filters out lines ending with ':' (which are paths to module files),
    and lines starting with 'env/' or 'cluster/default', which are not actually software modules.
    @param data: Output
    @return: Filtered output
    """
    return data[~np.char.endswith(data, ":") &
                ~np.char.startswith(data, "env/") &
                ~np.char.startswith(data, "cluster/default")

Review comment (Member):

This is good enough for now, but a better way to filter these would be to:

  • for each module, keep track of which location it was found in, so iterate over output of module avail line by line, and build up a dict with path as key, list of modules as value;
  • filter based on /etc/modulefiles/vsc (which covers env/* and cluster/*) to discriminate between software module and cluster modules;

                ]
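
The path-based filtering suggested in the review comment above could be sketched as follows. This is only an illustration in plain Python, not code from the PR: the function names `group_modules_by_path` and `software_modules` are hypothetical, and the input is assumed to be the terse `module avail` output split into lines, with each location given as a line ending in `:`.

```python
def group_modules_by_path(lines):
    """Map each module location (a line ending with ':') to the modules listed under it."""
    modules_by_path = {}
    current_path = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A new location starts; strip the trailing ':' to get the path.
            current_path = line[:-1]
            modules_by_path.setdefault(current_path, [])
        elif current_path is not None:
            modules_by_path[current_path].append(line)
    return modules_by_path


def software_modules(modules_by_path, excluded_path="/etc/modulefiles/vsc"):
    """Return modules from every location except the one holding env/* and cluster/* modules."""
    return [mod for path, mods in modules_by_path.items() if path != excluded_path for mod in mods]


sample = [
    "/apps/modules/dialga/all:",
    "cfd/1.0",
    "science/5.3.0",
    "/etc/modulefiles/vsc:",
    "cluster/dialga",
    "env/slurm/dialga",
]
grouped = group_modules_by_path(sample)
print(software_modules(grouped))
```

Compared with prefix matching on `env/` and `cluster/`, this keeps the location of each module, so the distinction between software modules and cluster modules depends on where a module was found rather than on its name.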


def filter_fn_gent_modules(data: np.ndarray) -> np.ndarray:
    """
    Filter function for the output of all modules.
    @param data: Output
    @return: Filtered output
    """
    return data[~np.char.endswith(data, ":") &
                ~np.char.startswith(data, "env/") &
                ~np.char.startswith(data, "cluster/")
                ]


def clusters_ugent() -> np.ndarray:
    """
    Returns all the cluster names of the HPC at UGent.
    @return: cluster names
    """

    return module_avail(name="cluster/", filter_fn=filter_fn_gent_cluster)


def modules_ugent() -> dict:
    """
    Returns all the module names that are installed on the HPC at UGent.
    They are grouped by cluster.
    @return: Dictionary with all the modules per cluster
    """

    data = {}
    for cluster in clusters_ugent():
        module_swap(cluster)
        data[cluster] = module_avail(filter_fn=filter_fn_gent_modules)
    return data


# --------------------------------------------------------------------------------------------------------
# Util functions
# --------------------------------------------------------------------------------------------------------

def simplify_modules(data: Union[dict, list, np.ndarray]) -> Union[dict, list, np.ndarray]:
    """
    Simplify list of modules by removing versions and duplicates.

    @param data: List of modules
    @return: List of programs.
    """

    if isinstance(data, dict):
        simplified_data = {}
        for cluster in data:
            simplified_data[cluster] = np.unique([entry.split("/")[0] for entry in data[cluster]])
    else:
        simplified_data = np.unique([entry.split("/")[0] for entry in data])

    return simplified_data


# --------------------------------------------------------------------------------------------------------
# Generate markdown
# --------------------------------------------------------------------------------------------------------

def generate_table_data(data: dict) -> Tuple[np.ndarray, int, int]:
    """
    Generate data that can be used to construct a MarkDown table.

    @param data: Available data
    @return: Returns tuple (Table data, #col, #row)
    """
    data = simplify_modules(data)
    all_modules = simplify_modules(np.concatenate(list(data.values())))

    final = np.array([" "])
    final = np.append(final, list(data.keys()))

    for package in all_modules:
        final = np.append(final, package)

        for cluster in data:
            final = np.append(final, "X" if package in data[cluster] else " ")

    return final, len(data.keys()) + 1, len(all_modules) + 1
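
To make the flat, row-major layout returned by `generate_table_data` easier to picture, here is a minimal sketch in plain Python (no NumPy). The function name `table_cells` and the sample input are made up for illustration, and the input is assumed to be already simplified to package names per cluster.

```python
def table_cells(modules_per_cluster):
    """Build a flat, row-major list of table cells: a header row, then one row per package."""
    clusters = list(modules_per_cluster)
    packages = sorted({pkg for mods in modules_per_cluster.values() for pkg in mods})

    # Header row: an empty corner cell followed by the cluster names.
    cells = [" "] + clusters
    for package in packages:
        # First cell of each row is the package name, then one mark per cluster.
        cells.append(package)
        cells.extend("X" if package in modules_per_cluster[cluster] else " " for cluster in clusters)

    return cells, len(clusters) + 1, len(packages) + 1


cells, columns, rows = table_cells({"dialga": ["cfd", "science"], "pikachu": ["science"]})
print(cells, columns, rows)
```

The number of cells always equals `columns * rows`, which is what a table builder taking a flat list of cell texts (as `new_table` is used in this PR) needs to lay the table out.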


def generate_module_table(data: dict, md_file: MdUtils) -> None:
    """
    Generate the general table of the overview.

    @param data: Dict with all the data. Keys are the cluster names.
    @param md_file: MdUtils object.
    """
    structured, col, row = generate_table_data(data)
    md_file.new_table(columns=col, rows=row, text=list(structured), text_align='center')


def generate_general_overview() -> None:
    """
    Generate the general overview in a markdown file.
    It generates a list of all the available software and indicates on which cluster it is available.
    """
    md_file = MdUtils(file_name='module_overview.md', title='Overview of available modules per cluster')
    data = modules_ugent()
    generate_module_table(data, md_file)
    md_file.create_md_file()


if __name__ == '__main__':
    # Generate the overview
    generate_general_overview()
5 changes: 5 additions & 0 deletions scripts/module_list/requirements.txt
@@ -0,0 +1,5 @@
flake8
pytest
mdutils
numpy
setuptools
1 change: 1 addition & 0 deletions scripts/module_list/test.sh
@@ -0,0 +1 @@
PYTHONPATH=$PWD:$PYTHONPATH pytest -v -s
4 changes: 4 additions & 0 deletions scripts/module_list/tests/data/data_avail_cluster_simple.txt
@@ -0,0 +1,4 @@
/etc/modulefiles/vsc:
cluster/dialga
cluster/pikachu
cluster/default
27 changes: 27 additions & 0 deletions scripts/module_list/tests/data/data_avail_simple_dialga.txt
@@ -0,0 +1,27 @@
/apps/modules/dialga/all:
cfd/1.0
cfd/2.0
cfd/24
cfd/5.0
cfd/2.0afqsdf
Markov/hidden-1.0.5
Markov/hidden-1.0.10
Markov/
science/
science/5.3.0
science/5.3.0
science/5.3.0
science/7.2.0
/etc/modulefiles/vsc:
cluster/
cluster/dialga
cluster/pikachu
env/slurm/
env/slurm/dialga
env/slurm/pikachu
env/software/
env/software/dialga
env/software/pikachu
env/vsc/
env/vsc/dialga
env/vsc/pikachu
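
As an illustration of what the filtering and simplification steps in `module_overview.py` are expected to do with test data like the file above, the sketch below reproduces the same rules in plain Python. The helper names `filter_modules` and `simplify` are hypothetical stand-ins for the NumPy-based functions in the PR, and the sample is a shortened copy of the data file.

```python
def filter_modules(entries):
    """Drop location lines (ending with ':') and the env/ and cluster/ entries."""
    return [
        entry for entry in entries
        if not entry.endswith(":") and not entry.startswith(("env/", "cluster/"))
    ]


def simplify(entries):
    """Keep only the part before the first '/', without duplicates, sorted."""
    return sorted({entry.split("/")[0] for entry in entries})


sample = [
    "/apps/modules/dialga/all:",
    "cfd/1.0",
    "cfd/2.0",
    "Markov/hidden-1.0.5",
    "Markov/",
    "science/",
    "science/5.3.0",
    "/etc/modulefiles/vsc:",
    "cluster/dialga",
    "env/slurm/dialga",
]

print(simplify(filter_modules(sample)))  # ['Markov', 'cfd', 'science']
```

Note that the sort is by code point, so names starting with an uppercase letter come before lowercase ones.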