hpcugent · boegel · Aug 14, 2023 · Aug 4, 2023 · Aug 4, 2023 · Aug 4, 2023
diff --git a/.github/workflows/script_module_list.yml b/.github/workflows/script_module_list.yml
@@ -0,0 +1,39 @@
+name: Run Linter and tests
+on: [push, pull_request]
+# Declare default permissions as read only.
+permissions: read-all
+jobs:
+
+  flake8-lint:
+    runs-on: ubuntu-20.04
+    name: Lint
+    steps:
+      - name: Check out source repository
+        uses: actions/checkout@v3
+      - name: Set up Python environment
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.6"
+      - name: flake8 Lint
+        uses: py-actions/flake8@v2
+        with:
+          max-line-length: "120"
+          path: "scripts/module_list"
+
+  pytest-tests:
+    runs-on: ubuntu-20.04
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+      - name: Install dependencies
+        run: |
+          cd scripts/module_list
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+      - name: Test with pytest
+        run: |
+          cd scripts/module_list
+          ./test.sh
diff --git a/scripts/README.md b/scripts/README.md
@@ -0,0 +1,3 @@
+Scripts that can be used to automatically generate Markdown files can be found here.
+
+* [`module_list`](module_list): script to generate an overview of available environment modules.
diff --git a/scripts/module_list/README.md b/scripts/module_list/README.md
@@ -0,0 +1,79 @@
+# Module list
+A script that generates a list of all available Lmod modules in Markdown.
+It also indicates on which cluster each package is available.
+
+## Requirements
+- Required Python packages are listed in the `requirements.txt` file.
+- Lmod must be available, and `$LMOD_CMD` must specify the path to the Lmod binary.
+
+## Usage
+You can run the script with the following command:
+
+```shell
+python module_overview.py
+```
+
+## Testing
+You can run the tests by running the `test.sh` script.
+```shell
+./test.sh
+```
+
+The tests use a mocked `$LMOD_CMD` script, which you can find [here](tests/data/lmod_mock.sh).
+
+### Write tests
+If you want to write additional tests and use the script effectively, follow these guidelines:
+
+1. **Setting up Mocked Script Path:**
+
+   Before the tests run, ensure that you set the path to the mocked script.
+   This can be done within the `setup_class` method.
+   ```python
+   path = os.path.dirname(os.path.realpath(__file__))
+
+   @classmethod
+   def setup_class(cls):
+       os.environ["LMOD_CMD"] = cls.path + "/data/lmod_mock.sh"
+   ```
+
+2. **Handling `module avail cluster/` Output:**
+
+   The output of the mocked `module avail cluster/` command can be put in a `.txt` file.
+   Set the path to this file in the `MOCK_FILE_AVAIL_CLUSTER` environment variable.
+   ```python
+   os.environ["MOCK_FILE_AVAIL_CLUSTER"] = cls.path + "/data/data_avail_cluster_simple.txt"
+   ```
+
+3. **Using the Swap Command:**
+
+   To use the swap command, assign the path to the swap files to the `MOCK_FILE_SWAP` environment variable.
+   Ensure that the filename contains the placeholder `CLUSTER`,
+   which is replaced with the actual cluster name when the swap is performed.
+
+   ```python
+   os.environ["MOCK_FILE_SWAP"] = cls.path + "/data/data_swap_CLUSTER.txt"
+   ```
+   For example, when swapping to `cluster/dialga`,
+   the mock uses the `data_swap_dialga.txt` file as output for the swap command.
+
+### Example 
+An example of a possible `setup_class` function is given below.
+```python
+import os
+
+@classmethod
+def setup_class(cls):
+    os.environ["TESTS_PATH"] = cls.path
+    os.environ["LMOD_CMD"] = cls.path + "/data/lmod_mock.sh"
+    os.environ["MOCK_FILE_AVAIL_CLUSTER"] = cls.path + "/data/data_avail_cluster_simple.txt"
+    os.environ["MOCK_FILE_SWAP"] = cls.path + "/data/data_swap_CLUSTER.txt"
+```
+
+This does multiple things:
+1. Sets the path of the tests folder in `$TESTS_PATH`.
+2. Sets the path to the `lmod_mock.sh` script in the environment variable `$LMOD_CMD`.
+3. Sets the output file for the `module avail cluster/` command in the `MOCK_FILE_AVAIL_CLUSTER` variable.
+   The actual output can be found in the `data/data_avail_cluster_simple.txt` file.
+4. Sets the swap output files in the `MOCK_FILE_SWAP` variable.
+   Files with swap output follow the naming pattern `data/data_swap_CLUSTER.txt`.
+   For example, `data/data_swap_dialga.txt` could be a possible file.
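
Note on the `CLUSTER` placeholder described in the README above: the real mock is the shell script `tests/data/lmod_mock.sh`, which is not part of the lines shown here, so its exact logic is unknown. The Python sketch below only illustrates the naming convention. The helper name `resolve_swap_file` and its error handling are assumptions made for illustration, not code from this PR.

```python
import os


def resolve_swap_file(module_name: str, template: str) -> str:
    """Hypothetical helper: map a module such as 'cluster/dialga' to its swap data file.

    `template` plays the role of $MOCK_FILE_SWAP and must contain the literal
    placeholder CLUSTER in its file name.
    """
    directory, filename = os.path.split(template)
    if "CLUSTER" not in filename:
        raise ValueError("template file name must contain the CLUSTER placeholder")
    # 'cluster/dialga' -> 'dialga'; a bare name such as 'dialga' is kept as-is.
    cluster = module_name.rsplit("/", 1)[-1]
    return os.path.join(directory, filename.replace("CLUSTER", cluster))


swap_file = resolve_swap_file("cluster/dialga", "tests/data/data_swap_CLUSTER.txt")
print(os.path.basename(swap_file))
```

Only the file name is rewritten, so a directory that happens to contain the word `CLUSTER` is left untouched.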
diff --git a/scripts/module_list/module_overview.py b/scripts/module_list/module_overview.py
@@ -0,0 +1,201 @@
+#
+# Copyright 2009-2023 Ghent University
+#
+# This file is part of vsc_user_docs,
+# originally created by the HPC team of Ghent University (http://ugent.be/hpc/en),
+# with support of Ghent University (http://ugent.be/hpc),
+# the Flemish Supercomputer Centre (VSC) (https://www.vscentrum.be),
+# the Flemish Research Foundation (FWO) (http://www.fwo.be/en)
+# and the Department of Economy, Science and Innovation (EWI) (http://www.ewi-vlaanderen.be/en).
+#
+# https://github.com/hpcugent/vsc_user_docs
+#
+# vsc_user_docs is licensed under a
+# Creative Commons Attribution-ShareAlike 4.0 International License.
+#
+# You should have received a copy of the license along with this
+# work. If not, see <http://creativecommons.org/licenses/by-sa/4.0/>.
+#
+"""
+Python script to convert all available modules in lmod to a markdown list.
+
+@author: Michiel Lachaert (Ghent University)
+"""
+
+import numpy as np
+import os
+import subprocess
+from mdutils.mdutils import MdUtils
+from typing import Union, Tuple
+
+
+# --------------------------------------------------------------------------------------------------------
+# Functions to run "module" commands
+# --------------------------------------------------------------------------------------------------------
+
+def module(*args, filter_fn=lambda x: x) -> np.ndarray:
+    """
+    Function to run "module" commands.
+
+    @param args: Extra arguments for the module command.
+    @param filter_fn: Filter function on the output.
+    @return: Array with the output of the module command.
+    """
+    lmod = os.getenv('LMOD_CMD')
+    proc = subprocess.run(
+        [lmod, "python", "--terse"] + list(args),
+        encoding="utf-8",
+        stderr=subprocess.PIPE,
+        stdout=subprocess.PIPE
+    )
+    exec(proc.stdout)  # Lmod's "python" mode prints Python code on stdout that updates os.environ
+    return filter_fn(np.array(proc.stderr.split()))
+
+
+def module_avail(name: str = "", filter_fn=lambda x: x) -> np.ndarray:
+    """
+    Function to run "module avail" commands.
+
+    @param name: Possible module name.
+    @param filter_fn: Filter on the output.
+    @return: Array of all available modules matching name, or all modules if name is not given.
+    """
+    return module("avail", name, filter_fn=filter_fn)
+
+
+def module_swap(name: str) -> None:
+    """
+    Function to run "module swap" commands.
+
+    @param name: Name of module you want to swap to.
+    """
+    module("swap", name)
+
+
+# --------------------------------------------------------------------------------------------------------
+# Fetch data
+# --------------------------------------------------------------------------------------------------------
+
+def filter_fn_gent_cluster(data: np.ndarray) -> np.ndarray:
+    """
+    Filter function for output of "module avail" commands on HPC-UGent infrastructure.
+
+    Filters out lines ending with ':' (which are paths to module files)
+    and lines starting with 'env/' or 'cluster/default', which are not actually software modules.
+    @param data: Output
+    @return: Filtered output
+    """
+    return data[~np.char.endswith(data, ":") &
+                ~np.char.startswith(data, "env/") &
+                ~np.char.startswith(data, "cluster/default")
+                ]
+
+
+def filter_fn_gent_modules(data: np.ndarray) -> np.ndarray:
+    """
+    Filter function for the output of all modules.
+    @param data: Output
+    @return: Filtered output
+    """
+    return data[~np.char.endswith(data, ":") &
+                ~np.char.startswith(data, "env/") &
+                ~np.char.startswith(data, "cluster/")
+                ]
+
+
+def clusters_ugent() -> np.ndarray:
+    """
+    Returns all the cluster names of the HPC at UGent.
+    @return: cluster names
+    """
+
+    return module_avail(name="cluster/", filter_fn=filter_fn_gent_cluster)
+
+
+def modules_ugent() -> dict:
+    """
+    Returns all the module names that are installed on the HPC at UGent.
+    They are grouped by cluster.
+    @return: Dictionary with all the modules per cluster
+    """
+
+    data = {}
+    for cluster in clusters_ugent():
+        module_swap(cluster)
+        data[cluster] = module_avail(filter_fn=filter_fn_gent_modules)
+    return data
+
+
+# --------------------------------------------------------------------------------------------------------
+# Util functions
+# --------------------------------------------------------------------------------------------------------
+
+def simplify_modules(data: Union[dict, list, np.ndarray]) -> Union[dict, list, np.ndarray]:
+    """
+    Simplify list of modules by removing versions and duplicates.
+
+    @param data: List of modules
+    @return: List of programs.
+    """
+
+    if isinstance(data, dict):
+        simplified_data = {}
+        for cluster in data:
+            simplified_data[cluster] = np.unique([entry.split("/")[0] for entry in data[cluster]])
+    else:
+        simplified_data = np.unique([entry.split("/")[0] for entry in data])
+
+    return simplified_data
+
+
+# --------------------------------------------------------------------------------------------------------
+# Generate markdown
+# --------------------------------------------------------------------------------------------------------
+
+def generate_table_data(data: dict) -> Tuple[np.ndarray, int, int]:
+    """
+    Generate data that can be used to construct a MarkDown table.
+
+    @param data: Available data
+    @return: Returns tuple (Table data, #col, #row)
+    """
+    data = simplify_modules(data)
+    all_modules = simplify_modules(np.concatenate(list(data.values())))
+
+    final = np.array([" "])
+    final = np.append(final, list(data.keys()))
+
+    for package in all_modules:
+        final = np.append(final, package)
+
+        for cluster in data:
+            final = np.append(final, "X" if package in data[cluster] else " ")
+
+    return final, len(data.keys()) + 1, len(all_modules) + 1
+
+
+def generate_module_table(data: dict, md_file: MdUtils) -> None:
+    """
+    Generate the general table of the overview.
+
+    @param data: Dict with all the data. Keys are the cluster names.
+    @param md_file: MdUtils object.
+    """
+    structured, col, row = generate_table_data(data)
+    md_file.new_table(columns=col, rows=row, text=list(structured), text_align='center')
+
+
+def generate_general_overview() -> None:
+    """
+    Generate the general overview in a markdown file.
+    It generates a list of all the available software and indicates on which cluster it is available.
+    """
+    md_file = MdUtils(file_name='module_overview.md', title='Overview of available modules per cluster')
+    data = modules_ugent()
+    generate_module_table(data, md_file)
+    md_file.create_md_file()
+
+
+if __name__ == '__main__':
+    # Generate the overview
+    generate_general_overview()
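
Note on `generate_table_data`: it returns the table as one flat sequence of cells, listed row by row, together with the column and row counts that are then passed to `new_table`. The sketch below reproduces that layout with plain lists so the shape is easy to inspect. The function name `flat_table` and the sample input are assumptions for illustration, and it expects package names that are already simplified (no versions).

```python
from typing import Dict, List, Tuple


def flat_table(data: Dict[str, List[str]]) -> Tuple[List[str], int, int]:
    """Return (cells, columns, rows), with cells listed row by row."""
    clusters = list(data)
    packages = sorted({pkg for pkgs in data.values() for pkg in pkgs})
    # Header row: an empty corner cell, then one column per cluster.
    cells = [" "] + clusters
    for package in packages:
        # One row per package: its name, then an "X" for each cluster that has it.
        cells.append(package)
        cells.extend("X" if package in data[cluster] else " " for cluster in clusters)
    return cells, len(clusters) + 1, len(packages) + 1


# Illustrative input, not real cluster data.
cells, columns, rows = flat_table({"dialga": ["cfd", "science"], "pikachu": ["cfd"]})
for start in range(0, len(cells), columns):
    print(cells[start:start + columns])
```

The number of cells always equals `columns * rows`, which is a cheap invariant to assert in a test.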
diff --git a/scripts/module_list/requirements.txt b/scripts/module_list/requirements.txt
@@ -0,0 +1,5 @@
+flake8
+pytest
+mdutils
+numpy
+setuptools
diff --git a/scripts/module_list/test.sh b/scripts/module_list/test.sh
@@ -0,0 +1 @@
+PYTHONPATH=$PWD:$PYTHONPATH pytest -v -s
diff --git a/scripts/module_list/tests/data/data_avail_cluster_simple.txt b/scripts/module_list/tests/data/data_avail_cluster_simple.txt
@@ -0,0 +1,4 @@
+/etc/modulefiles/vsc:
+cluster/dialga
+cluster/pikachu
+cluster/default
diff --git a/scripts/module_list/tests/data/data_avail_simple_dialga.txt b/scripts/module_list/tests/data/data_avail_simple_dialga.txt
@@ -0,0 +1,27 @@
+/apps/modules/dialga/all:
+cfd/1.0
+cfd/2.0
+cfd/24
+cfd/5.0
+cfd/2.0afqsdf
+Markov/hidden-1.0.5
+Markov/hidden-1.0.10
+Markov/
+science/
+science/5.3.0
+science/5.3.0
+science/5.3.0
+science/7.2.0
+/etc/modulefiles/vsc:
+cluster/
+cluster/dialga
+cluster/pikachu
+env/slurm/
+env/slurm/dialga
+env/slurm/pikachu
+env/software/
+env/software/dialga
+env/software/pikachu
+env/vsc/
+env/vsc/dialga
+env/vsc/pikachu
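
Note on this fixture: it deliberately mixes path headers (lines ending in `:`), version-less entries such as `Markov/`, duplicates, and `cluster/` and `env/` entries. The sketch below applies the same rules as `filter_fn_gent_modules` and `simplify_modules` to an abbreviated excerpt, using plain lists instead of NumPy arrays. The function names `filter_modules` and `simplify` are assumptions for illustration.

```python
# Abbreviated excerpt of data_avail_simple_dialga.txt.
FIXTURE = """\
/apps/modules/dialga/all:
cfd/1.0
cfd/2.0afqsdf
Markov/hidden-1.0.5
Markov/
science/
science/5.3.0
science/5.3.0
/etc/modulefiles/vsc:
cluster/dialga
env/slurm/dialga
"""


def filter_modules(lines):
    """Drop path headers and the cluster/ and env/ pseudo-modules."""
    return [line for line in lines
            if not line.endswith(":") and not line.startswith(("env/", "cluster/"))]


def simplify(modules):
    """Strip versions and duplicates, returning sorted package names."""
    return sorted({entry.split("/")[0] for entry in modules})


names = simplify(filter_modules(FIXTURE.split()))
print(names)  # ['Markov', 'cfd', 'science']
```

The ordering is by code point, so capitalised names such as `Markov` come before lowercase ones.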