TudatPy structure #159

alfonsoSR · 2024-08-08T11:52:44Z

Issues addressed by this PR

Critical

Import time: Compiling tudatpy into a single library (kernel) results into long import times regardless of how small the part of tudatpy being loaded is.
Docstrings: Docstrings are currently defined in YAML files and added to functions with a package called multidoc. Separating the docstrings from the source code is not standard, makes it difficult to know if a function is documented or if its documentation is up to date and complicates the development process to the point that docstrings are not included in the current conda distribution (or are included but do not work).
Syntax higlighting: The inability to inspect the kernel currently prevents proper syntax highlighting and autocompletion and results into linters throwing warnings at correct code.
Autogeneration of __init__.py files: The autogeneration of __init__.py is problematic for modules including both C++ and Python functionality, adds complexity to the build process and makes it difficult for new developers to understand how to contribute code to the project.
Organization of source code: It would be nice to have a more intuitive organization of the source code used for the exposition (e.g. having expose_X.cpp under tudatpy/X rather than tudatpy/kernel/X as the former also contains the __init__.py file and, potentially, python-native functionality). The presence of _import_all_kernel_modules.py files and the fact that they are autogenerated also makes it difficult to understand where to contribute code or where to find the source code of the function you're using.
CMake: The CMakeLists.txt are very intimidating, mostly due to the use of custom tools like YACMA and the lack of maintenance, which often results into patches. Replacing custom tools with standard ones and trying to reduce the size of the CMake files would make the modification of the build more approachable.

Non-critical

Project layout: Tudatpy currently uses a flat layout (i.e. tudatpy/__init__.py). Migrating to a src layout (i.e. tudatpy/src/tudatpy/__init__.py) would allow for the creation of a stub-only package at no cost and significantly simplify a potential unification of tudat and tudatpy into a standalone repository (e.g. adding tudat, spice, sofa... to tudatpy/src). Both flat and src layouts are standard and accepted, but I consider the latter more suitable for this kind of project.

Solutions

Import time

I modified the build process to generate one shared library per submodule (i.e. per expose_X.cpp file). This results into import times being proportional to the size of the imported modules (e.g. import time will still be relatively high for a program loading numerical_simulation, but not for one loading math or astro, since those modules are much smaller), which could be reduced by splitting big submodules per functionality (e.g. numerical_simulation -> dynamics & estimation or estimation_setup.observation being splitted into submodules) or optimizing the compilation process.

Organization of source code

I moved each expose_X.cpp script to its submodule's directory (as suggested above) and modified the build process such that the associated kernel is saved in the same directory. This allows for the replacement of the autogenerated __init__.py and _import_all_kernel_modules.py files with a standard __init__.py scripts to be maintained by developers. Thus, a hybrid submodule (one including exposed and python-native functionality) X would have the following structure:

tudatpy/src/tudatpy/X
    |_ __init__.py
    |_ expose_X.cpp
    |_ expose_X.so  (after compilation)
    |_ foo.py

and the __init__.py file would look as follows:

from .expose_X import *
from .foo import <public-functionality-from-foo.py>    # Manually specified by users

Under the assumption that all the functionality defined in expose_X.cpp should belong to the API, the star import from expose_X prevents developers from having to update the __init__.py files when they expose a new function, effectively replacing autogeneration, while the manual imports from the python scripts allow them to keep "private" functions and helpers out of tudatpy's API. The presence of the shared libraries in the source directories is just a personal preference, but moving them into a lib or kernel directory would not be a problem.

NOTE: A positive side effect of having split the kernel is that the header files associated with the exposition scripts are no longer needed.

Docstrings

For each exposed function, I read the content of the docstring from the YAML file, gave it an adequate format and pasted it together with the source code that exposes the function. This eliminates the need for multidoc and the results into docstrings being included by default in all tudatpy distributions. As an example, this is the code snippet used to expose spice.load_kernel():

m.def("load_kernel", &tudat::spice_interface::loadSpiceKernelInTudat,
          py::arg("kernel_file"),
          R"doc(Loads a Spice kernel into the pool.

          This function loads a Spice kernel into the kernel pool, from which it can be used
          by the various internal spice routines. Matters regarding the manner in which Spice
          handles different kernels containing the same information can be found in the spice
          required reading documentation, kernel section. Wrapper for the
          `furnsh_c <https://naif.jpl.nasa.gov/pub/naif/toolkit_docs/C/cspice/furnsh_c.html>`_ function.

          :param file_path: Path to the spice kernel to be loaded.
          )doc");

Syntax highlighting

Taking #154 as reference, I included automatic stub generation as part of the build process. Stubs are generated using pybind11-stubgen following the compilation of tudatpy, and stored in a stub-only package with the same structure as the source directory, as described in Packaging type information.

The package is automatically installed together with tudatpy and provides syntax highlighters with information about the available functionality per submodule, function and class signatures, docstrings... This allows for regular autocompletion and greatly simplifies the access to typing and usage information.

Some notes on stub generation

The reason why I opted for pybind11-stubgen instead of mypy's stubgen is that the latter is not compatible with the generation of stub-only packages, is bad at inferring the signature of exposed functions and includes undesired information in the docstrings.

While significantly better than stubgen, pybind11-stubgen includes undesired from __future__ imports and is not particularly good at generating stubs for __init__.py files. Thus, while the vast majority of the stub generation process relies on pybind11-stubgen, I wrote some post-processing functions and a parser to generate __init__.pyi stubs, remove undesired imports and ensure proper indentation within docstrings.

CMake

I replaced YACMA with standard alternatives based on pybind11, raised CMake's minimum required version to 3.15 (which seems like the equivalent of C++11 or Python 3.0) and tried to reorganize the code and add comments to make it less intimidating and easier to understand.

I also noticed that Tudatpy's cmake files relied on the existence of environment variables that are only defined in Tudat's CMakeLists.txt, so I modified the cmake setup to ensure that, as long as tudat is installed, it is straightforward to build tudatpy as a standalone library.

Missing

The only thing missing from the PR with respect to the current version of Tudatpy is the generation of the API website, which I haven't still adapted to the new placement of the docstrings. This prevents us from merging into main but should not be a problem if we first merge into develop as all the API documentation is already accessible via autocompletion.

I think it is a good idea to first merge this and get feedback and undetected issues from users while I work on the API generation, which can be then added as a feature before merging into main.

- Splitted tudatpy kernel into submodules - Replaced YACMA with standard method to find python (CMake) - Introduced standard src structure for future integration of stubs

…l_simulation

- Moved sbdb and horizons interface from tudatpy/data to numerical simulation to avoid circular imports - Fixed typing issues in python scripts: data, util - A lot of functionality still relies on tudatpy/io, which is currently deprecated, so I moved it to tudatpy/data.

- Defined a CMake-based installation configuration based on the old version of tudatpy as an attempt to fix installation issues on azure. - Replaced stub-only package with in-tree stubs, which seem to work better with current code editors.

alfonsoSR added 5 commits August 6, 2024 13:22

Use pybind

1da7740

tmp

0678a84

Split kernel into submodules

e4aa905

- Splitted tudatpy kernel into submodules - Replaced YACMA with standard method to find python (CMake) - Introduced standard src structure for future integration of stubs

Remove YACMA

564e9af

Remove old tudatpy src directory

ed82c55

alfonsoSR changed the title ~~Restructure TudatPy~~ TudatPy structure Aug 8, 2024

alfonsoSR added 24 commits August 8, 2024 14:09

Restore spice __init__.py

1cbc98a

WIP: Standalone cmake + stubs

73efa21

WIP: stubs

0d06f41

WIP: Stubs

c71eec4

Integrated docstrings + stub generation. Circular imports in numerica…

e426405

…l_simulation

Solve circular imports

8255002

Include dir for pybind11

8b620a9

WIP: Fix issues with python scripts

de11783

Add build commands to cmake

aefffd1

Add tudat include dir

ff6d7bf

Show tudat environment

1565a49

System include pybind11

39f2114

Add include statements to add_extension macro

3726d06

Solar activity + NRLMSISE00

b9d2219

Modified CMake configuration to resemble original

223cca3

Delete old versions of cmake config

3df1ca8

Remove bodies, cli and apps

7826d2a

Stub generation

3b5d6fc

- Moved sbdb and horizons interface from tudatpy/data to numerical simulation to avoid circular imports - Fixed typing issues in python scripts: data, util - A lot of functionality still relies on tudatpy/io, which is currently deprecated, so I moved it to tudatpy/data.

Remove dependency on docstrings.h

53742ef

expose_environment

a9aea72

Star imports and fix in expose_environment

ab1ec7b

Back to working expose_environment

6f7dd8e

Fix compatibility issues with latests tudat version

494e8f3

Support for smart stub generation

a5a21ac

alfonsoSR added 19 commits September 7, 2024 19:05

Attempt pip-based azure build

bb21992

Attempt build.py-based build on azure

4bc24ec

Fix 3.9/10 compatibility issues + Add install script

341f01b

Demo cmake-based installation

a245959

Stub installation

06ccf09

Stubs

8131303

Stub generation - 2

9bf6096

More issues with stubs

99cb522

Clean CMakeLists.txt

ec1a5a9

Separate build from stub generation

fc45343

Fix: invalid relative import

df1d468

Attempt pip-based conda build

1d4015a

Test

961397f

Disable automatic stub generation on build and add stubs to git repo

56dc764

Add install configuration to cmake

5b87e7d

CMake install configuration & inplace stubs

4f5cedc

- Defined a CMake-based installation configuration based on the old version of tudatpy as an attempt to fix installation issues on azure. - Replaced stub-only package with in-tree stubs, which seem to work better with current code editors.

YACMA-based installation

3e9643f

Fix issues with tests and python include dir (attempt) - No stubs

a3aa65b

Update installation and add stubs

6eeb0b4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TudatPy structure #159

TudatPy structure #159

alfonsoSR commented Aug 8, 2024 •

edited

Loading

TudatPy structure #159

Are you sure you want to change the base?

TudatPy structure #159

Conversation

alfonsoSR commented Aug 8, 2024 • edited Loading

Issues addressed by this PR

Critical

Non-critical

Solutions

Import time

Organization of source code

Docstrings

Syntax highlighting

Some notes on stub generation

CMake

Missing

alfonsoSR commented Aug 8, 2024 •

edited

Loading