Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Fix Minari tests #2419

Merged
merged 5 commits into from
Sep 17, 2024
Merged

[CI] Fix Minari tests #2419

merged 5 commits into from
Sep 17, 2024

Conversation

younik
Copy link
Contributor

@younik younik commented Sep 4, 2024

Description

With Minari 0.5.0 release, we regrouped dependencies to install only the required ones depending on user usage.
Moreover, we restructured the datasets to be hierarchical. This caused the torch/rl tests to fail; this PR should fix them.

I changed the tests to rely on Minari for getting the last version of the datasets.

Also, Minari 0.5.0 adds support for Arrow datasets, I am happy to integrate it in torch/rl.

Motivation and Context

Fixes failing CI due to new version of Minari.

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

  • I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

Copy link

pytorch-bot bot commented Sep 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2419

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 14 Unrelated Failures

As of commit cde51e9 with merge base 0326c41 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 4, 2024
@vmoens vmoens added Environments Adds or modifies an environment wrapper Data Data-related PR, will launch data-related jobs labels Sep 4, 2024
Copy link
Contributor

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for owning this!

@vmoens
Copy link
Contributor

vmoens commented Sep 4, 2024

We now have multiple. FAILED test/test_libs.py::TestMinari::test_load[D4RL/door/human-v2-True] - ModuleNotFoundError: No module named 'gymnasium_robotics' in the CI.

The linter (pre-commit) is also unhappy.

rl/CONTRIBUTING.md

Lines 34 to 52 in 0326c41

## Formatting your code
**Type annotation**
TorchRL is not strongly-typed, i.e. we do not enforce type hints, neither do we check that the ones that are present are valid. We rely on type hints purely for documentary purposes. Although this might change in the future, there is currently no need for this to be enforced at the moment.
**Linting**
Before your PR is ready, you'll probably want your code to be checked. This can be done easily by installing
```
pip install pre-commit
```
and running
```
pre-commit run --all-files
```
from within the torchrl cloned directory.
You can also install [pre-commit hooks](https://pre-commit.com/) (using `pre-commit install`
). You can disable the check by appending `-n` to your commit command: `git commit -m <commit message> -n`

@younik
Copy link
Contributor Author

younik commented Sep 4, 2024

We now have multiple. FAILED test/test_libs.py::TestMinari::test_load[D4RL/door/human-v2-True] - ModuleNotFoundError: No module named 'gymnasium_robotics' in the CI.

The linter (pre-commit) is also unhappy.

rl/CONTRIBUTING.md

Lines 34 to 52 in 0326c41

## Formatting your code
**Type annotation**
TorchRL is not strongly-typed, i.e. we do not enforce type hints, neither do we check that the ones that are present are valid. We rely on type hints purely for documentary purposes. Although this might change in the future, there is currently no need for this to be enforced at the moment.
**Linting**
Before your PR is ready, you'll probably want your code to be checked. This can be done easily by installing
```
pip install pre-commit
```
and running
```
pre-commit run --all-files
```
from within the torchrl cloned directory.
You can also install [pre-commit hooks](https://pre-commit.com/) (using `pre-commit install`
). You can disable the check by appending `-n` to your commit command: `git commit -m <commit message> -n`

Ups, my env was contaminated.
The dependency error is due to a line incorrectly unreachable in 0.4.3. However, throwing a dependency error during download is not the intended behavior, and I will fix it in Minari. I will release it this week.

@younik
Copy link
Contributor Author

younik commented Sep 6, 2024

After further analysis, the issue was with remote datasets lacking spaces, causing Minari to fall back to env_spec. This has been fixed. Could you please re-run the Minari tests in CI? @vmoens

@vmoens
Copy link
Contributor

vmoens commented Sep 10, 2024

There seems to be another error in the CI now:
For one dataset, we use a transform that expects a nested ("observation", "observation") entry but "observation" is already a leaf (so there can't be any nesting). Do you know what is happening?

@vmoens vmoens self-requested a review September 11, 2024 08:56
Comment on lines -2826 to -2853
# We rely on sorting the keys as v0 < v1 but if the version is greater than 9 this won't work
total_keys = sorted(minari.list_remote_datasets())
assert not any(
key[-2:] == "10" for key in total_keys
), "You should adapt the Minari test scripts as some dataset have a version >= 10 and sorting will fail."
total_keys_splits = [key.split("-") for key in total_keys]
total_keys = sorted(
minari.list_remote_datasets(latest_version=True, compatible_minari_version=True)
)
indices = torch.randperm(len(total_keys))[:20]
keys = [total_keys[idx] for idx in indices]
keys = [
key
for key in keys
if "=0.4" in minari.list_remote_datasets()[key]["minari_version"]
]

def _replace_with_max(key):
key_split = key.split("-")
same_entries = (
torch.tensor(
[total_key[:-1] == key_split[:-1] for total_key in total_keys_splits]
)
.nonzero()
.squeeze()
.tolist()
)
last_same_entry = same_entries[-1]
return total_keys[last_same_entry]

keys = [_replace_with_max(key) for key in keys]
Copy link
Contributor Author

@younik younik Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believed what changed is the order of _MINARI_DATASETS given the above modification.
The test work with kitchen or antmaze datasets as the observation space is a dictionary, while it doesn't work for pen datasets.
I can change the code to test only a kitchen dataset or to work with the pen one.

@younik
Copy link
Contributor Author

younik commented Sep 15, 2024

I changed the preprocessing test to only use a dataset with Dict observation space.
I have also updated the kitchen datasets that have wrong info structure.

@vmoens vmoens merged commit 224d637 into pytorch:main Sep 17, 2024
51 of 76 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Data Data-related PR, will launch data-related jobs Environments Adds or modifies an environment wrapper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants