Enforce assumptions in System Mode Manager initialisation #13

chgio · 2024-01-24T18:00:12Z

#4 and #11 introduced a relatively simple bug in the System Mode Manager: initial subsystem states are only assumed, not enforced.

As this bug became apparent through a discrepancy with the payload camera, for cloudview-1 (Release) this was hot-fixed by adding manual mode override through Serial in guorbit/camera-relay (cloudview-1 Release): src/main.cpp, lines 57-80 @ 3731638 (comment).

This "hack" adopts the approach of ensuring the assumptions are correct, which did work but is suboptimal in terms of robustness, especially in unmanned initialisation scenarios. Instead -- or, better yet, in addition -- it would be auspicable to programmatically enforce the assumptions in the initial transition of the System Mode Manager.
My idea is to substitute the assignments in the initialisation line with calls to the same code that requests the TC-based "arbitrary" subsystem mode changes, but requesting the currently assumed initial mode instead.

It should be possible to achieve this elegantly enough through SDL Joins or SDL Procedures: internal refactorings should be strongly preferred, to minimise the architectural impact and memory footprint of these changes.
(TASTE SDL Tutorial)

We should also think about how to handle failures: the most basic approach would be to wrap this in a loop to try again until success, but Rule 2 of NASA/JPL's The Power of 10 prescribes that indefinite loops should be avoided. This could potentially be an exception, as operational functionalities cannot be safely reached unless initialisation is successful; otherwise, a definite loop could leave an increasing time between successive tries to hopefully allow for completion of whatever process is causing the failure.
A perhaps better approach could be to get rid of the assumptions entirely, assigning the actual state contained in the subsystem's response instead of the requested assumption.

The text was updated successfully, but these errors were encountered:

chgio · 2024-02-06T23:58:28Z

Solid work on 3cb6756 to 0e6cd71 -- very elegant control flow injection 😄

As we'd discussed previously already, nice refactoring of the initial assumptions and initialisation of the report variables with the assumed data!
Still, we would be much better off following @ZonaDashti's suggestion and just querying subsystems about their actual initial modes to take it from there, which would get rid of the assumptions altogether.
I've set up #14 about that -- it's more work, and definitely beyond the scope of this initial bugfix, but probably the only and definitely the most elegant way to really fix this bug without having it come back to haunt us again! Plus, the extension to #15 would be useful in general for operations.

Nice implementation of the retry loop too! I'm only a little worried about the isolation of the initial state in the System Mode Manager's statechart view -- though analysing the transitions does reveal that it can only end up in Wait_change. I guess the statechart generator is getting a little caught up in the goto loops, so it might be worth pinging Maxime about this.
At any rate, the "unassuming mode readouts" of #14 should relieve this section of a lot of its criticality, as we can try automatically a few times and then in case of continued failure just put the burden of fixing this on the ground operator.

In the Mission Mode Change TC Handler, very interesting introduction of a separate Wait_initial state as opposed to the earlier idea of just initialising the component into Wait_report. Perhaps this could be used to accept both telecommands and report relay requests, to let a human into the loop to fix initialisation issues -- however, we should think about this carefully to avoid the risk of bricking anything instead!
Again, the "unassuming mode readouts" of #14 should remove the need for the dedicated mission_initial and system_mode_initial and keep our data model nice and clean.

Also, one thing that I forgot to mention: I noticed you only changed the test_subsystem_mode_change_validation implementation of the system_mode_manager -- I thought I remembered more code in that component!
This is an alternative implementation I made, and deliberately stripped of its logic to validate the requested subsystem mode changes, to test that its duplication distributed into each subsystem for redundancy is correct.
However, I realise this was a mistake on my part: multiple implementations are confusing and a smelly way to do this, as any change to the logic would need to be duplicated manually in both instances; instead, we should use the SDL Alternative (compilation option) symbol and do separate test runs with the alternative flag set and unset.
Apologies for the pain, please just move your changes to the default implementation and test that one; I'll try to take care of the Alternative unchecking branches (#16) -- unless, of course, you are particularly adamant about doing it yourself 😄

chgio · 2024-02-07T00:16:09Z

⬆️ @xszymonzur pls write down anything you do differently to the Capella model so I can backpropagate it ⬆️

(don't let me catch up ;)

chgio added the bug Something isn't working label Jan 24, 2024

chgio added this to the System-Wide Mode Change Capability (implementation) milestone Jan 24, 2024

chgio assigned xszymonzur Jan 24, 2024

chgio referenced this issue in guorbit/camera-relay Jan 24, 2024

initial commit: serial-overrideable PWM control

3731638

This was referenced Feb 6, 2024

Implementation of Invariant Subsystem Mode Status Requesting #14

Open

Refactor presence/absence of requested subsystem mode change validation logic in System Mode Manager using SDL Alternatives instead of multiple un-DRY implementations #16

Open

chgio mentioned this issue Feb 7, 2024

System Mode Management changes stemming from initialisation inconsistency bugfix guorbit/obc-model#62

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enforce assumptions in System Mode Manager initialisation #13

Enforce assumptions in System Mode Manager initialisation #13

chgio commented Jan 24, 2024 •

edited

Loading

chgio commented Feb 6, 2024

chgio commented Feb 7, 2024

Enforce assumptions in System Mode Manager initialisation #13

Enforce assumptions in System Mode Manager initialisation #13

Comments

chgio commented Jan 24, 2024 • edited Loading

chgio commented Feb 6, 2024

chgio commented Feb 7, 2024

chgio commented Jan 24, 2024 •

edited

Loading