[Feature] Some improvements to VecNorm #2251

Merged
merged 4 commits into from
Jun 26, 2024

@vmoens commented Jun 26, 2024

No description provided.


pytorch-bot bot commented Jun 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2251

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure

As of commit 9bee908 with merge base 849b3de (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label Jun 26, 2024

github-actions bot commented Jun 26, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: 1. Worsened: 5.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1120s | 58.8067ms | 17.0049 Ops/s | 17.7724 Ops/s | -4.32% |
| test_sync | 38.9659ms | 32.0057ms | 31.2444 Ops/s | 32.5389 Ops/s | -3.98% |
| test_async | 52.8188ms | 28.9141ms | 34.5852 Ops/s | 35.9573 Ops/s | -3.82% |
| test_simple | 0.3787s | 0.3771s | 2.6518 Ops/s | 2.5462 Ops/s | +4.15% |
| test_transformed | 0.5311s | 0.5279s | 1.8943 Ops/s | 1.8862 Ops/s | +0.43% |
| test_serial | 1.3209s | 1.2665s | 0.7896 Ops/s | 0.7950 Ops/s | -0.68% |
| test_parallel | 1.1287s | 1.0734s | 0.9316 Ops/s | 0.9340 Ops/s | -0.25% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1303ms | 22.7322μs | 43.9905 KOps/s | 44.0741 KOps/s | -0.19% |
| test_step_mdp_speed[True-True-True-True-False] | 58.4190μs | 13.3247μs | 75.0488 KOps/s | 74.4349 KOps/s | +0.82% |
| test_step_mdp_speed[True-True-True-False-True] | 42.4290μs | 13.3582μs | 74.8602 KOps/s | 75.5393 KOps/s | -0.90% |
| test_step_mdp_speed[True-True-True-False-False] | 48.3000μs | 7.8563μs | 127.2859 KOps/s | 126.1126 KOps/s | +0.93% |
| test_step_mdp_speed[True-True-False-True-True] | 52.1680μs | 24.0447μs | 41.5893 KOps/s | 41.6433 KOps/s | -0.13% |
| test_step_mdp_speed[True-True-False-True-False] | 0.1134ms | 14.5915μs | 68.5331 KOps/s | 67.6099 KOps/s | +1.37% |
| test_step_mdp_speed[True-True-False-False-True] | 42.3900μs | 14.6326μs | 68.3407 KOps/s | 68.3668 KOps/s | -0.04% |
| test_step_mdp_speed[True-True-False-False-False] | 49.3630μs | 9.0754μs | 110.1882 KOps/s | 108.1660 KOps/s | +1.87% |
| test_step_mdp_speed[True-False-True-True-True] | 67.3450μs | 25.6618μs | 38.9684 KOps/s | 39.1910 KOps/s | -0.57% |
| test_step_mdp_speed[True-False-True-True-False] | 43.6710μs | 16.0162μs | 62.4366 KOps/s | 61.5507 KOps/s | +1.44% |
| test_step_mdp_speed[True-False-True-False-True] | 51.4260μs | 14.6894μs | 68.0761 KOps/s | 69.3131 KOps/s | -1.78% |
| test_step_mdp_speed[True-False-True-False-False] | 32.3300μs | 9.0737μs | 110.2082 KOps/s | 108.2948 KOps/s | +1.77% |
| test_step_mdp_speed[True-False-False-True-True] | 69.9800μs | 26.8049μs | 37.3066 KOps/s | 37.1380 KOps/s | +0.45% |
| test_step_mdp_speed[True-False-False-True-False] | 58.1180μs | 17.3641μs | 57.5901 KOps/s | 57.4781 KOps/s | +0.19% |
| test_step_mdp_speed[True-False-False-False-True] | 80.7610μs | 15.7129μs | 63.6418 KOps/s | 63.8093 KOps/s | -0.26% |
| test_step_mdp_speed[True-False-False-False-False] | 48.9310μs | 10.3287μs | 96.8175 KOps/s | 95.5833 KOps/s | +1.29% |
| test_step_mdp_speed[False-True-True-True-True] | 64.9200μs | 25.2105μs | 39.6660 KOps/s | 38.8388 KOps/s | +2.13% |
| test_step_mdp_speed[False-True-True-True-False] | 57.7770μs | 15.8830μs | 62.9604 KOps/s | 61.5874 KOps/s | +2.23% |
| test_step_mdp_speed[False-True-True-False-True] | 47.7780μs | 16.9263μs | 59.0797 KOps/s | 58.6000 KOps/s | +0.82% |
| test_step_mdp_speed[False-True-True-False-False] | 0.1175ms | 10.7839μs | 92.7310 KOps/s | 95.8231 KOps/s | -3.23% |
| test_step_mdp_speed[False-True-False-True-True] | 65.9230μs | 26.7296μs | 37.4116 KOps/s | 37.1734 KOps/s | +0.64% |
| test_step_mdp_speed[False-True-False-True-False] | 0.1480ms | 17.1921μs | 58.1662 KOps/s | 57.2669 KOps/s | +1.57% |
| test_step_mdp_speed[False-True-False-False-True] | 42.7000μs | 17.9108μs | 55.8322 KOps/s | 55.2278 KOps/s | +1.09% |
| test_step_mdp_speed[False-True-False-False-False] | 34.7750μs | 11.5362μs | 86.6836 KOps/s | 85.2738 KOps/s | +1.65% |
| test_step_mdp_speed[False-False-True-True-True] | 61.3840μs | 28.0042μs | 35.7089 KOps/s | 35.4414 KOps/s | +0.75% |
| test_step_mdp_speed[False-False-True-True-False] | 40.3450μs | 18.5831μs | 53.8122 KOps/s | 52.8194 KOps/s | +1.88% |
| test_step_mdp_speed[False-False-True-False-True] | 0.1182ms | 18.1937μs | 54.9640 KOps/s | 54.9911 KOps/s | -0.05% |
| test_step_mdp_speed[False-False-True-False-False] | 33.4530μs | 11.4649μs | 87.2226 KOps/s | 86.4074 KOps/s | +0.94% |
| test_step_mdp_speed[False-False-False-True-True] | 43.4010μs | 29.2292μs | 34.2124 KOps/s | 33.7243 KOps/s | +1.45% |
| test_step_mdp_speed[False-False-False-True-False] | 42.1990μs | 19.7075μs | 50.7422 KOps/s | 49.8028 KOps/s | +1.89% |
| test_step_mdp_speed[False-False-False-False-True] | 57.9080μs | 19.1146μs | 52.3159 KOps/s | 52.6311 KOps/s | -0.60% |
| test_step_mdp_speed[False-False-False-False-False] | 38.5510μs | 12.5737μs | 79.5308 KOps/s | 77.1638 KOps/s | +3.07% |
| test_values[generalized_advantage_estimate-True-True] | 10.1120ms | 9.7682ms | 102.3728 Ops/s | 103.8422 Ops/s | -1.41% |
| test_values[vec_generalized_advantage_estimate-True-True] | 41.6651ms | 35.6415ms | 28.0571 Ops/s | 28.4838 Ops/s | -1.50% |
| test_values[td0_return_estimate-False-False] | 0.2314ms | 0.1709ms | 5.8519 KOps/s | 5.6656 KOps/s | +3.29% |
| test_values[td1_return_estimate-False-False] | 27.0346ms | 24.7680ms | 40.3747 Ops/s | 41.9804 Ops/s | -3.82% |
| test_values[vec_td1_return_estimate-False-False] | 36.7785ms | 35.3196ms | 28.3129 Ops/s | 28.1994 Ops/s | +0.40% |
| test_values[td_lambda_return_estimate-True-False] | 39.9827ms | 35.7638ms | 27.9612 Ops/s | 28.9603 Ops/s | -3.45% |
| test_values[vec_td_lambda_return_estimate-True-False] | 48.6897ms | 35.7196ms | 27.9959 Ops/s | 26.9488 Ops/s | +3.89% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.6637ms | 8.5420ms | 117.0691 Ops/s | 116.8738 Ops/s | +0.17% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3496ms | 2.0243ms | 494.0028 Ops/s | 509.7247 Ops/s | -3.08% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5796ms | 0.3587ms | 2.7875 KOps/s | 2.7802 KOps/s | +0.26% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 48.3004ms | 46.0724ms | 21.7049 Ops/s | 21.7111 Ops/s | -0.03% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.6063ms | 3.0786ms | 324.8178 Ops/s | 327.1839 Ops/s | -0.72% |
| test_dqn_speed | 7.3879ms | 1.3666ms | 731.7495 Ops/s | 738.6028 Ops/s | -0.93% |
| test_ddpg_speed | 3.2812ms | 2.8766ms | 347.6371 Ops/s | 345.5728 Ops/s | +0.60% |
| test_sac_speed | 10.9087ms | 8.8326ms | 113.2168 Ops/s | 114.6935 Ops/s | -1.29% |
| test_redq_speed | 15.7168ms | 13.9510ms | 71.6795 Ops/s | 71.4770 Ops/s | +0.28% |
| test_redq_deprec_speed | 16.2585ms | 14.2158ms | 70.3441 Ops/s | 71.5016 Ops/s | -1.62% |
| test_td3_speed | 18.9967ms | 8.7115ms | 114.7906 Ops/s | 115.7300 Ops/s | -0.81% |
| test_cql_speed | 39.2783ms | 37.2024ms | 26.8800 Ops/s | 26.7910 Ops/s | +0.33% |
| test_a2c_speed | 8.8927ms | 7.6862ms | 130.1032 Ops/s | 130.8478 Ops/s | -0.57% |
| test_ppo_speed | 9.4799ms | 8.0999ms | 123.4581 Ops/s | 126.3851 Ops/s | -2.32% |
| test_reinforce_speed | 8.0452ms | 6.9550ms | 143.7809 Ops/s | 149.3898 Ops/s | -3.75% |
| test_iql_speed | 35.4875ms | 33.5426ms | 29.8128 Ops/s | 28.9250 Ops/s | +3.07% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.0411ms | 3.7456ms | 266.9804 Ops/s | 271.9881 Ops/s | -1.84% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8418ms | 0.5100ms | 1.9609 KOps/s | 2.0269 KOps/s | -3.26% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.5308ms | 0.4825ms | 2.0725 KOps/s | 2.1161 KOps/s | -2.06% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.8892ms | 3.7228ms | 268.6127 Ops/s | 278.9247 Ops/s | -3.70% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8261ms | 0.4935ms | 2.0262 KOps/s | 2.0370 KOps/s | -0.53% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7796ms | 0.4742ms | 2.1089 KOps/s | 2.1094 KOps/s | -0.03% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.9391ms | 1.7488ms | 571.8121 Ops/s | 578.7414 Ops/s | -1.20% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.3766ms | 1.6387ms | 610.2317 Ops/s | 606.5469 Ops/s | +0.61% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.0964ms | 3.8769ms | 257.9373 Ops/s | 269.4383 Ops/s | -4.27% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7792ms | 0.6388ms | 1.5654 KOps/s | 1.5777 KOps/s | -0.78% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.8741ms | 0.6150ms | 1.6261 KOps/s | 1.6487 KOps/s | -1.37% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.5325ms | 3.7567ms | 266.1896 Ops/s | 274.3381 Ops/s | -2.97% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8631ms | 0.5046ms | 1.9819 KOps/s | 2.0183 KOps/s | -1.81% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.5555ms | 0.4797ms | 2.0846 KOps/s | 2.1016 KOps/s | -0.81% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.3787ms | 3.9036ms | 256.1718 Ops/s | 277.3589 Ops/s | **-7.64%** |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2443ms | 0.5026ms | 1.9898 KOps/s | 2.0313 KOps/s | -2.05% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7594ms | 0.4815ms | 2.0766 KOps/s | 2.1277 KOps/s | -2.40% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.6628ms | 3.9843ms | 250.9824 Ops/s | 268.5990 Ops/s | **-6.56%** |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1856ms | 0.6656ms | 1.5024 KOps/s | 1.5686 KOps/s | -4.22% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.7409ms | 0.6303ms | 1.5866 KOps/s | 1.6428 KOps/s | -3.42% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1382s | 6.4605ms | 154.7861 Ops/s | 171.9265 Ops/s | **-9.97%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1282s | 15.3491ms | 65.1503 Ops/s | 78.1469 Ops/s | **-16.63%** |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.7937ms | 1.0839ms | 922.5626 Ops/s | 941.8543 Ops/s | -2.05% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1219s | 6.2170ms | 160.8498 Ops/s | 123.8497 Ops/s | **+29.87%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 15.3204ms | 12.8153ms | 78.0317 Ops/s | 79.0875 Ops/s | -1.33% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.6038ms | 1.0740ms | 931.0645 Ops/s | 932.5495 Ops/s | -0.16% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1153s | 6.1979ms | 161.3452 Ops/s | 162.1375 Ops/s | -0.49% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1356s | 15.7414ms | 63.5269 Ops/s | 76.4847 Ops/s | **-16.94%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.8171ms | 1.2376ms | 808.0295 Ops/s | 802.1448 Ops/s | +0.73% |
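
For readers checking the numbers, the sketch below shows how the Ops and Change columns appear to relate to the other columns. It assumes, based only on the figures in the table, that Ops is the reciprocal of Mean and that Change is the relative difference against the repo HEAD throughput; the function names are hypothetical and are not part of the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed relation: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the test_single row of the CPU results.
mean_s = 58.8067e-3   # Mean column, converted from ms to seconds
ops_pr = 17.0049      # Ops column
ops_head = 17.7724    # Ops on Repo HEAD column

print(f"ops from mean: {ops_per_second(mean_s):.4f} Ops/s")   # ops from mean: 17.0049 Ops/s
print(f"change: {relative_change(ops_pr, ops_head):+.2f}%")   # change: -4.32%
```

Both printed values match the test_single row, which supports the assumed relations for at least that row.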


github-actions bot commented Jun 26, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: 2. Worsened: 7.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1770s | 0.1241s | 8.0611 Ops/s | 8.5671 Ops/s | **-5.91%** |
| test_sync | 0.1053s | 0.1031s | 9.7012 Ops/s | 9.7299 Ops/s | -0.30% |
| test_async | 0.1993s | 98.6260ms | 10.1393 Ops/s | 9.8019 Ops/s | +3.44% |
| test_single_pixels | 0.1289s | 0.1279s | 7.8190 Ops/s | 7.9318 Ops/s | -1.42% |
| test_sync_pixels | 85.4997ms | 82.2237ms | 12.1619 Ops/s | 12.2351 Ops/s | -0.60% |
| test_async_pixels | 0.1587s | 67.4352ms | 14.8290 Ops/s | 14.4435 Ops/s | +2.67% |
| test_simple | 0.8337s | 0.8259s | 1.2107 Ops/s | 1.2216 Ops/s | -0.89% |
| test_transformed | 1.0709s | 1.0647s | 0.9392 Ops/s | 0.9432 Ops/s | -0.42% |
| test_serial | 2.5504s | 2.4946s | 0.4009 Ops/s | 0.4072 Ops/s | -1.56% |
| test_parallel | 2.4357s | 2.3808s | 0.4200 Ops/s | 0.4243 Ops/s | -1.01% |
| test_step_mdp_speed[True-True-True-True-True] | 79.5220μs | 33.5151μs | 29.8373 KOps/s | 29.2195 KOps/s | +2.11% |
| test_step_mdp_speed[True-True-True-True-False] | 0.1640ms | 19.4286μs | 51.4704 KOps/s | 51.0326 KOps/s | +0.86% |
| test_step_mdp_speed[True-True-True-False-True] | 41.8900μs | 19.1398μs | 52.2471 KOps/s | 51.2166 KOps/s | +2.01% |
| test_step_mdp_speed[True-True-True-False-False] | 39.1610μs | 11.1887μs | 89.3755 KOps/s | 89.4364 KOps/s | -0.07% |
| test_step_mdp_speed[True-True-False-True-True] | 67.4220μs | 35.6970μs | 28.0135 KOps/s | 28.6677 KOps/s | -2.28% |
| test_step_mdp_speed[True-True-False-True-False] | 47.7620μs | 21.2740μs | 47.0057 KOps/s | 48.3926 KOps/s | -2.87% |
| test_step_mdp_speed[True-True-False-False-True] | 57.5110μs | 21.1585μs | 47.2622 KOps/s | 49.0178 KOps/s | -3.58% |
| test_step_mdp_speed[True-True-False-False-False] | 37.1220μs | 12.9537μs | 77.1979 KOps/s | 77.1822 KOps/s | +0.02% |
| test_step_mdp_speed[True-False-True-True-True] | 59.8720μs | 37.4374μs | 26.7113 KOps/s | 27.0524 KOps/s | -1.26% |
| test_step_mdp_speed[True-False-True-True-False] | 47.9120μs | 22.9757μs | 43.5242 KOps/s | 42.3747 KOps/s | +2.71% |
| test_step_mdp_speed[True-False-True-False-True] | 48.2110μs | 20.9912μs | 47.6391 KOps/s | 48.8112 KOps/s | -2.40% |
| test_step_mdp_speed[True-False-True-False-False] | 35.2810μs | 13.0026μs | 76.9075 KOps/s | 75.3381 KOps/s | +2.08% |
| test_step_mdp_speed[True-False-False-True-True] | 59.4610μs | 38.8366μs | 25.7489 KOps/s | 25.7568 KOps/s | -0.03% |
| test_step_mdp_speed[True-False-False-True-False] | 46.5810μs | 25.2733μs | 39.5674 KOps/s | 39.2989 KOps/s | +0.68% |
| test_step_mdp_speed[True-False-False-False-True] | 45.2420μs | 22.8324μs | 43.7974 KOps/s | 44.3251 KOps/s | -1.19% |
| test_step_mdp_speed[True-False-False-False-False] | 40.3100μs | 14.8425μs | 67.3743 KOps/s | 68.9019 KOps/s | -2.22% |
| test_step_mdp_speed[False-True-True-True-True] | 70.0810μs | 37.2805μs | 26.8236 KOps/s | 26.0004 KOps/s | +3.17% |
| test_step_mdp_speed[False-True-True-True-False] | 46.1310μs | 23.1962μs | 43.1105 KOps/s | 43.6806 KOps/s | -1.31% |
| test_step_mdp_speed[False-True-True-False-True] | 46.8910μs | 25.0929μs | 39.8519 KOps/s | 41.3815 KOps/s | -3.70% |
| test_step_mdp_speed[False-True-True-False-False] | 81.7520μs | 14.9696μs | 66.8021 KOps/s | 68.9902 KOps/s | -3.17% |
| test_step_mdp_speed[False-True-False-True-True] | 63.1810μs | 39.4487μs | 25.3494 KOps/s | 25.6687 KOps/s | -1.24% |
| test_step_mdp_speed[False-True-False-True-False] | 59.5020μs | 25.2607μs | 39.5872 KOps/s | 40.7971 KOps/s | -2.97% |
| test_step_mdp_speed[False-True-False-False-True] | 44.0520μs | 27.4531μs | 36.4258 KOps/s | 38.5013 KOps/s | **-5.39%** |
| test_step_mdp_speed[False-True-False-False-False] | 47.4210μs | 16.7826μs | 59.5854 KOps/s | 61.4971 KOps/s | -3.11% |
| test_step_mdp_speed[False-False-True-True-True] | 64.9010μs | 41.0334μs | 24.3704 KOps/s | 24.6491 KOps/s | -1.13% |
| test_step_mdp_speed[False-False-True-True-False] | 51.8410μs | 26.7037μs | 37.4480 KOps/s | 37.9561 KOps/s | -1.34% |
| test_step_mdp_speed[False-False-True-False-True] | 53.4000μs | 27.6921μs | 36.1114 KOps/s | 37.4420 KOps/s | -3.55% |
| test_step_mdp_speed[False-False-True-False-False] | 47.5310μs | 16.7271μs | 59.7833 KOps/s | 60.9461 KOps/s | -1.91% |
| test_step_mdp_speed[False-False-False-True-True] | 77.4810μs | 43.4845μs | 22.9967 KOps/s | 23.1289 KOps/s | -0.57% |
| test_step_mdp_speed[False-False-False-True-False] | 61.0410μs | 28.9032μs | 34.5982 KOps/s | 33.8720 KOps/s | +2.14% |
| test_step_mdp_speed[False-False-False-False-True] | 51.2310μs | 28.6543μs | 34.8988 KOps/s | 35.7517 KOps/s | -2.39% |
| test_step_mdp_speed[False-False-False-False-False] | 42.1410μs | 18.3913μs | 54.3737 KOps/s | 55.5565 KOps/s | -2.13% |
| test_values[generalized_advantage_estimate-True-True] | 27.0181ms | 25.9520ms | 38.5327 Ops/s | 38.0594 Ops/s | +1.24% |
| test_values[vec_generalized_advantage_estimate-True-True] | 90.8803ms | 2.7295ms | 366.3738 Ops/s | 353.0238 Ops/s | +3.78% |
| test_values[td0_return_estimate-False-False] | 90.7420μs | 67.1189μs | 14.8989 KOps/s | 14.9087 KOps/s | -0.07% |
| test_values[td1_return_estimate-False-False] | 57.4579ms | 57.2567ms | 17.4652 Ops/s | 17.0266 Ops/s | +2.58% |
| test_values[vec_td1_return_estimate-False-False] | 1.3309ms | 1.0978ms | 910.9496 Ops/s | 908.0525 Ops/s | +0.32% |
| test_values[td_lambda_return_estimate-True-False] | 95.3161ms | 92.4814ms | 10.8130 Ops/s | 10.5982 Ops/s | +2.03% |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2592ms | 1.0936ms | 914.4414 Ops/s | 907.8784 Ops/s | +0.72% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.9615ms | 25.7728ms | 38.8006 Ops/s | 37.1476 Ops/s | +4.45% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9449ms | 0.7391ms | 1.3529 KOps/s | 1.3562 KOps/s | -0.24% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7896ms | 0.6838ms | 1.4624 KOps/s | 1.4594 KOps/s | +0.21% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5112ms | 1.4831ms | 674.2699 Ops/s | 665.6246 Ops/s | +1.30% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7419ms | 0.6995ms | 1.4296 KOps/s | 1.4275 KOps/s | +0.15% |
| test_dqn_speed | 78.9740ms | 1.6200ms | 617.2856 Ops/s | 685.9277 Ops/s | **-10.01%** |
| test_ddpg_speed | 3.2679ms | 2.9988ms | 333.4704 Ops/s | 335.3738 Ops/s | -0.57% |
| test_sac_speed | 9.0293ms | 8.5250ms | 117.3027 Ops/s | 118.9850 Ops/s | -1.41% |
| test_redq_speed | 11.7620ms | 10.7513ms | 93.0122 Ops/s | 94.4505 Ops/s | -1.52% |
| test_redq_deprec_speed | 0.1075s | 12.8546ms | 77.7932 Ops/s | 87.9555 Ops/s | **-11.55%** |
| test_td3_speed | 8.5390ms | 8.4304ms | 118.6178 Ops/s | 118.3714 Ops/s | +0.21% |
| test_cql_speed | 26.4573ms | 25.8185ms | 38.7319 Ops/s | 39.3363 Ops/s | -1.54% |
| test_a2c_speed | 5.8392ms | 5.6179ms | 178.0026 Ops/s | 175.3360 Ops/s | +1.52% |
| test_ppo_speed | 6.1779ms | 5.9617ms | 167.7376 Ops/s | 166.8892 Ops/s | +0.51% |
| test_reinforce_speed | 4.8866ms | 4.6411ms | 215.4681 Ops/s | 215.6654 Ops/s | -0.09% |
| test_iql_speed | 20.6466ms | 19.7365ms | 50.6676 Ops/s | 50.9873 Ops/s | -0.63% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.8138ms | 4.5969ms | 217.5387 Ops/s | 219.5481 Ops/s | -0.92% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7350ms | 0.5985ms | 1.6709 KOps/s | 1.6741 KOps/s | -0.19% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.4175ms | 0.5764ms | 1.7348 KOps/s | 1.7518 KOps/s | -0.97% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.8249ms | 4.5249ms | 220.9985 Ops/s | 220.3549 Ops/s | +0.29% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7143ms | 0.5899ms | 1.6953 KOps/s | 1.7063 KOps/s | -0.64% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.2951ms | 0.5718ms | 1.7490 KOps/s | 1.7748 KOps/s | -1.45% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.3447ms | 2.1323ms | 468.9785 Ops/s | 475.7310 Ops/s | -1.42% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 5.8078ms | 2.0517ms | 487.3999 Ops/s | 494.6179 Ops/s | -1.46% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.8446ms | 4.7009ms | 212.7249 Ops/s | 214.9662 Ops/s | -1.04% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1218s | 0.8668ms | 1.1536 KOps/s | 1.1478 KOps/s | +0.51% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9201ms | 0.7237ms | 1.3818 KOps/s | 1.3838 KOps/s | -0.15% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.6560ms | 4.5461ms | 219.9709 Ops/s | 218.8757 Ops/s | +0.50% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.2724ms | 0.5992ms | 1.6688 KOps/s | 1.6824 KOps/s | -0.81% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7239ms | 0.5739ms | 1.7424 KOps/s | 1.7328 KOps/s | +0.55% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.7312ms | 4.5399ms | 220.2691 Ops/s | 216.7870 Ops/s | +1.61% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7295ms | 0.5909ms | 1.6923 KOps/s | 1.2961 KOps/s | **+30.56%** |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7747ms | 0.5699ms | 1.7546 KOps/s | 1.7573 KOps/s | -0.16% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.8169ms | 4.7195ms | 211.8876 Ops/s | 212.1855 Ops/s | -0.14% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.4484ms | 0.7507ms | 1.3320 KOps/s | 1.3529 KOps/s | -1.54% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9512ms | 0.7280ms | 1.3736 KOps/s | 1.3709 KOps/s | +0.19% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1458s | 7.7683ms | 128.7280 Ops/s | 134.8946 Ops/s | -4.57% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 17.9179ms | 15.5114ms | 64.4685 Ops/s | 64.4275 Ops/s | +0.06% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.0048ms | 1.4851ms | 673.3464 Ops/s | 769.7994 Ops/s | **-12.53%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1255s | 7.3899ms | 135.3203 Ops/s | 100.3279 Ops/s | **+34.88%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 18.0612ms | 15.5278ms | 64.4008 Ops/s | 64.6176 Ops/s | -0.34% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.7523ms | 1.4312ms | 698.7279 Ops/s | 768.9918 Ops/s | **-9.14%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1266s | 9.9961ms | 100.0386 Ops/s | 131.3948 Ops/s | **-23.86%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.0498ms | 15.5798ms | 64.1855 Ops/s | 63.4690 Ops/s | +1.13% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.5539ms | 1.5072ms | 663.4861 Ops/s | 671.3786 Ops/s | -1.18% |
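
The "Improved" and "Worsened" totals seem to count only the rows whose Change is emphasized in the results. The sketch below tallies them under an assumed cutoff of 5% in either direction; that threshold is inferred from which GPU rows are emphasized and is not stated anywhere in the bot comment. The list holds the nine emphasized GPU rows plus three unemphasized ones for contrast.

```python
# Assumption: a row counts as improved/worsened when |Change| exceeds 5%.
THRESHOLD_PERCENT = 5.0

# (benchmark name, Change in percent), copied from the GPU results.
gpu_changes = [
    ("test_single", -5.91),
    ("test_step_mdp_speed[False-True-False-False-True]", -5.39),
    ("test_dqn_speed", -10.01),
    ("test_redq_deprec_speed", -11.55),
    ("test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]", 30.56),
    ("test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]", -12.53),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]", 34.88),
    ("test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]", -9.14),
    ("test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]", -23.86),
    # Rows below the assumed threshold, included for contrast.
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]", -4.57),
    ("test_gae_speed[generalized_advantage_estimate-False-1-512]", 4.45),
    ("test_sync", -0.30),
]

improved = [name for name, change in gpu_changes if change > THRESHOLD_PERCENT]
worsened = [name for name, change in gpu_changes if change < -THRESHOLD_PERCENT]

print(f"Improved: {len(improved)}, Worsened: {len(worsened)}")  # Improved: 2, Worsened: 7
```

Under this assumption the tally reproduces the reported GPU totals (2 improved, 7 worsened). Several of the flagged rows also show a Max far above the Mean, so single-run outliers may account for part of the swing.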

@vmoens added the `enhancement` label Jun 26, 2024
@vmoens merged commit 670a8cf into main Jun 26, 2024
53 of 58 checks passed
@vmoens deleted the vecnorm-improvements branch August 7, 2024 01:39