[BugFix] Fix update_ with nested, heterogeneous envs #544

Closed
wants to merge 1 commit into from

Conversation

vmoens
Contributor

@vmoens vmoens commented Oct 12, 2023

Description

Describe your changes in detail.

Motivation and Context

Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax close #15213 if this solves the issue #15213

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

@facebook-github-bot facebook-github-bot added the CLA Signed label Oct 12, 2023
@github-actions

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 105. Improved: 4. Worsened: 3.

Expand to view detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 38.0010μs | 19.9554μs | 50.1116 KOps/s | 50.0941 KOps/s | +0.04% |
| test_plain_set_stack_nested | 0.2192ms | 0.1853ms | 5.3955 KOps/s | 5.3819 KOps/s | +0.25% |
| test_plain_set_nested_inplace | 0.1400ms | 23.6721μs | 42.2438 KOps/s | 42.0199 KOps/s | +0.53% |
| test_plain_set_stack_nested_inplace | 0.2650ms | 0.2198ms | 4.5490 KOps/s | 4.5197 KOps/s | +0.65% |
| test_items | 83.1020μs | 3.5172μs | 284.3193 KOps/s | 281.3240 KOps/s | +1.06% |
| test_items_nested | 0.4836ms | 0.3713ms | 2.6933 KOps/s | 2.7457 KOps/s | -1.91% |
| test_items_nested_locked | 1.2691ms | 0.3691ms | 2.7095 KOps/s | 2.7515 KOps/s | -1.52% |
| test_items_nested_leaf | 6.2175ms | 0.2303ms | 4.3418 KOps/s | 4.5001 KOps/s | -3.52% |
| test_items_stack_nested | 2.1212ms | 1.9887ms | 502.8537 Ops/s | 503.8344 Ops/s | -0.19% |
| test_items_stack_nested_leaf | 1.9516ms | 1.7997ms | 555.6426 Ops/s | 553.3672 Ops/s | +0.41% |
| test_items_stack_nested_locked | 1.1094ms | 0.9752ms | 1.0255 KOps/s | 1.0116 KOps/s | +1.37% |
| test_keys | 30.4010μs | 5.0516μs | 197.9585 KOps/s | 194.2908 KOps/s | +1.89% |
| test_keys_nested | 1.0265ms | 0.1858ms | 5.3827 KOps/s | 4.9928 KOps/s | **+7.81%** |
| test_keys_nested_locked | 0.2396ms | 0.1796ms | 5.5690 KOps/s | 5.5257 KOps/s | +0.78% |
| test_keys_nested_leaf | 0.2979ms | 0.1741ms | 5.7438 KOps/s | 5.7562 KOps/s | -0.22% |
| test_keys_stack_nested | 1.9735ms | 1.8320ms | 545.8644 Ops/s | 548.4935 Ops/s | -0.48% |
| test_keys_stack_nested_leaf | 1.9586ms | 1.8283ms | 546.9666 Ops/s | 546.3758 Ops/s | +0.11% |
| test_keys_stack_nested_locked | 0.9696ms | 0.8116ms | 1.2322 KOps/s | 1.2247 KOps/s | +0.62% |
| test_values | 78.8020μs | 1.6395μs | 609.9334 KOps/s | 633.5185 KOps/s | -3.72% |
| test_values_nested | 0.1696ms | 66.7998μs | 14.9701 KOps/s | 14.6751 KOps/s | +2.01% |
| test_values_nested_locked | 95.7020μs | 67.0378μs | 14.9169 KOps/s | 14.6580 KOps/s | +1.77% |
| test_values_nested_leaf | 0.1584ms | 58.2745μs | 17.1602 KOps/s | 16.5538 KOps/s | +3.66% |
| test_values_stack_nested | 1.7610ms | 1.5895ms | 629.1207 Ops/s | 629.6594 Ops/s | -0.09% |
| test_values_stack_nested_leaf | 1.7173ms | 1.5813ms | 632.4075 Ops/s | 634.5622 Ops/s | -0.34% |
| test_values_stack_nested_locked | 2.5167ms | 0.6404ms | 1.5615 KOps/s | 1.5571 KOps/s | +0.28% |
| test_membership | 26.5010μs | 1.9052μs | 524.8728 KOps/s | 523.0911 KOps/s | +0.34% |
| test_membership_nested | 18.3010μs | 3.6640μs | 272.9234 KOps/s | 277.4494 KOps/s | -1.63% |
| test_membership_nested_leaf | 68.8020μs | 3.6987μs | 270.3618 KOps/s | 273.0883 KOps/s | -1.00% |
| test_membership_stacked_nested | 87.5020μs | 14.1881μs | 70.4816 KOps/s | 69.0450 KOps/s | +2.08% |
| test_membership_stacked_nested_leaf | 40.5010μs | 14.3060μs | 69.9007 KOps/s | 68.2598 KOps/s | +2.40% |
| test_membership_nested_last | 30.5000μs | 7.6626μs | 130.5043 KOps/s | 133.3132 KOps/s | -2.11% |
| test_membership_nested_leaf_last | 0.1727ms | 7.6912μs | 130.0182 KOps/s | 132.0826 KOps/s | -1.56% |
| test_membership_stacked_nested_last | 0.2785ms | 0.2287ms | 4.3733 KOps/s | 4.4288 KOps/s | -1.25% |
| test_membership_stacked_nested_leaf_last | 43.7020μs | 16.7400μs | 59.7372 KOps/s | 59.1070 KOps/s | +1.07% |
| test_nested_getleaf | 42.5010μs | 15.6931μs | 63.7224 KOps/s | 64.0469 KOps/s | -0.51% |
| test_nested_get | 74.6020μs | 14.8846μs | 67.1837 KOps/s | 67.5900 KOps/s | -0.60% |
| test_stacked_getleaf | 1.0214ms | 0.8807ms | 1.1355 KOps/s | 1.1539 KOps/s | -1.59% |
| test_stacked_get | 0.9581ms | 0.8418ms | 1.1879 KOps/s | 1.2005 KOps/s | -1.04% |
| test_nested_getitemleaf | 43.7010μs | 15.6538μs | 63.8824 KOps/s | 64.1794 KOps/s | -0.46% |
| test_nested_getitem | 41.3010μs | 14.8966μs | 67.1293 KOps/s | 67.2981 KOps/s | -0.25% |
| test_stacked_getitemleaf | 0.9959ms | 0.8787ms | 1.1380 KOps/s | 1.1420 KOps/s | -0.35% |
| test_stacked_getitem | 0.8870ms | 0.8398ms | 1.1908 KOps/s | 1.1999 KOps/s | -0.76% |
| test_lock_nested | 88.3206ms | 1.5802ms | 632.8443 Ops/s | 671.9152 Ops/s | **-5.81%** |
| test_lock_stack_nested | 0.1370s | 21.9318ms | 45.5958 Ops/s | 47.2200 Ops/s | -3.44% |
| test_unlock_nested | 80.1654ms | 1.5787ms | 633.4237 Ops/s | 632.2600 Ops/s | +0.18% |
| test_unlock_stack_nested | 0.1086s | 21.9013ms | 45.6593 Ops/s | 45.2034 Ops/s | +1.01% |
| test_flatten_speed | 1.1931ms | 1.0050ms | 995.0531 Ops/s | 974.8227 Ops/s | +2.08% |
| test_unflatten_speed | 1.8516ms | 1.7918ms | 558.1046 Ops/s | 545.9797 Ops/s | +2.22% |
| test_common_ops | 4.1783ms | 1.1259ms | 888.1411 Ops/s | 885.8042 Ops/s | +0.26% |
| test_creation | 39.7020μs | 6.2564μs | 159.8366 KOps/s | 161.7950 KOps/s | -1.21% |
| test_creation_empty | 30.9010μs | 13.4805μs | 74.1810 KOps/s | 73.8872 KOps/s | +0.40% |
| test_creation_nested_1 | 52.6020μs | 24.8186μs | 40.2923 KOps/s | 40.2065 KOps/s | +0.21% |
| test_creation_nested_2 | 55.5020μs | 26.9377μs | 37.1227 KOps/s | 36.7026 KOps/s | +1.14% |
| test_clone | 0.2103ms | 24.8327μs | 40.2694 KOps/s | 40.8264 KOps/s | -1.36% |
| test_getitem[int] | 59.7020μs | 28.4529μs | 35.1458 KOps/s | 34.7300 KOps/s | +1.20% |
| test_getitem[slice_int] | 0.1561ms | 59.1335μs | 16.9109 KOps/s | 18.1704 KOps/s | **-6.93%** |
| test_getitem[range] | 0.2298ms | 82.6629μs | 12.0973 KOps/s | 12.2578 KOps/s | -1.31% |
| test_getitem[tuple] | 85.9030μs | 46.5169μs | 21.4976 KOps/s | 21.4917 KOps/s | +0.03% |
| test_getitem[list] | 0.4050ms | 76.1327μs | 13.1350 KOps/s | 12.9258 KOps/s | +1.62% |
| test_setitem_dim[int] | 55.2020μs | 34.8149μs | 28.7233 KOps/s | 29.1391 KOps/s | -1.43% |
| test_setitem_dim[slice_int] | 98.0030μs | 60.2518μs | 16.5970 KOps/s | 16.5927 KOps/s | +0.03% |
| test_setitem_dim[range] | 0.1081ms | 79.7203μs | 12.5439 KOps/s | 12.2843 KOps/s | +2.11% |
| test_setitem_dim[tuple] | 88.3030μs | 50.7075μs | 19.7209 KOps/s | 19.8158 KOps/s | -0.48% |
| test_setitem | 0.1805ms | 32.8102μs | 30.4783 KOps/s | 30.7663 KOps/s | -0.94% |
| test_set | 0.1957ms | 31.4968μs | 31.7492 KOps/s | 31.8174 KOps/s | -0.21% |
| test_set_shared | 0.3252ms | 0.2039ms | 4.9032 KOps/s | 4.8062 KOps/s | +2.02% |
| test_update | 0.2476ms | 35.4831μs | 28.1824 KOps/s | 28.4382 KOps/s | -0.90% |
| test_update_nested | 0.2211ms | 52.3398μs | 19.1059 KOps/s | 19.2196 KOps/s | -0.59% |
| test_set_nested | 0.2253ms | 34.9713μs | 28.5949 KOps/s | 29.1097 KOps/s | -1.77% |
| test_set_nested_new | 0.2752ms | 54.4766μs | 18.3565 KOps/s | 18.7397 KOps/s | -2.04% |
| test_select | 0.3373ms | 0.1003ms | 9.9686 KOps/s | 10.2043 KOps/s | -2.31% |
| test_unbind_speed | 0.6856ms | 0.6504ms | 1.5375 KOps/s | 1.5300 KOps/s | +0.49% |
| test_unbind_speed_stack0 | 97.1470ms | 9.0164ms | 110.9084 Ops/s | 109.4403 Ops/s | +1.34% |
| test_unbind_speed_stack1 | 5.8002μs | 0.9349μs | 1.0697 MOps/s | 873.5872 KOps/s | **+22.45%** |
| test_creation[device0] | 0.5864ms | 0.4480ms | 2.2319 KOps/s | 2.2262 KOps/s | +0.26% |
| test_creation_from_tensor | 4.4133ms | 0.5060ms | 1.9764 KOps/s | 1.9667 KOps/s | +0.49% |
| test_add_one[memmap_tensor0] | 2.3314ms | 33.4701μs | 29.8774 KOps/s | 29.7849 KOps/s | +0.31% |
| test_contiguous[memmap_tensor0] | 0.2522ms | 8.8945μs | 112.4289 KOps/s | 111.1571 KOps/s | +1.14% |
| test_stack[memmap_tensor0] | 0.1346ms | 27.1036μs | 36.8955 KOps/s | 37.6862 KOps/s | -2.10% |
| test_memmaptd_index | 0.3604ms | 0.3150ms | 3.1742 KOps/s | 3.1608 KOps/s | +0.43% |
| test_memmaptd_index_astensor | 1.2964ms | 1.2210ms | 819.0094 Ops/s | 812.8724 Ops/s | +0.75% |
| test_memmaptd_index_op | 2.6976ms | 2.6350ms | 379.5032 Ops/s | 373.5550 Ops/s | +1.59% |
| test_reshape_pytree | 0.1244ms | 32.7538μs | 30.5308 KOps/s | 29.6630 KOps/s | +2.93% |
| test_reshape_td | 88.6030μs | 40.8307μs | 24.4913 KOps/s | 24.0098 KOps/s | +2.01% |
| test_view_pytree | 90.3020μs | 32.4651μs | 30.8023 KOps/s | 30.0728 KOps/s | +2.43% |
| test_view_td | 36.1010μs | 8.9208μs | 112.0972 KOps/s | 114.2998 KOps/s | -1.93% |
| test_unbind_pytree | 79.3020μs | 37.5548μs | 26.6277 KOps/s | 25.9630 KOps/s | +2.56% |
| test_unbind_td | 0.1407ms | 97.0390μs | 10.3051 KOps/s | 10.3710 KOps/s | -0.64% |
| test_split_pytree | 0.1326ms | 37.2385μs | 26.8539 KOps/s | 26.5826 KOps/s | +1.02% |
| test_split_td | 0.9731ms | 0.1077ms | 9.2814 KOps/s | 9.0189 KOps/s | +2.91% |
| test_add_pytree | 96.3020μs | 47.2635μs | 21.1580 KOps/s | 21.1776 KOps/s | -0.09% |
| test_add_td | 0.1393ms | 78.1707μs | 12.7925 KOps/s | 12.6654 KOps/s | +1.00% |
| test_distributed | 29.6010μs | 8.8247μs | 113.3181 KOps/s | 111.8837 KOps/s | +1.28% |
| test_tdmodule | 1.3488ms | 28.8383μs | 34.6761 KOps/s | 33.9656 KOps/s | +2.09% |
| test_tdmodule_dispatch | 0.3082ms | 52.7787μs | 18.9470 KOps/s | 18.4068 KOps/s | +2.93% |
| test_tdseq | 0.1532ms | 32.3284μs | 30.9325 KOps/s | 29.7500 KOps/s | +3.97% |
| test_tdseq_dispatch | 0.6417ms | 65.9202μs | 15.1699 KOps/s | 14.8998 KOps/s | +1.81% |
| test_instantiation_functorch | 1.7876ms | 1.6566ms | 603.6439 Ops/s | 600.0467 Ops/s | +0.60% |
| test_instantiation_td | 2.2159ms | 1.3859ms | 721.5416 Ops/s | 652.3972 Ops/s | **+10.60%** |
| test_exec_functorch | 0.2493ms | 0.1976ms | 5.0613 KOps/s | 5.0878 KOps/s | -0.52% |
| test_exec_td | 0.2800ms | 0.1888ms | 5.2954 KOps/s | 5.3636 KOps/s | -1.27% |
| test_vmap_mlp_speed[True-True] | 19.2389ms | 2.1553ms | 463.9766 Ops/s | 810.4029 Ops/s | **-42.75%** |
| test_vmap_mlp_speed[True-False] | 12.5444ms | 0.6626ms | 1.5092 KOps/s | 1.5353 KOps/s | -1.70% |
| test_vmap_mlp_speed[False-True] | 8.2823ms | 1.0541ms | 948.6360 Ops/s | 942.3521 Ops/s | +0.67% |
| test_vmap_mlp_speed[False-False] | 8.1342ms | 0.4923ms | 2.0311 KOps/s | 1.7520 KOps/s | **+15.93%** |
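
For readers checking the Change column: the reported values are consistent with the relative difference between the PR throughput (Ops) and the throughput on repo HEAD. The sketch below reproduces a few rows under that assumption. The function names and the 5% cutoff are illustrative, not taken from the benchmark workflow. The cutoff is inferred from the table, where only changes above roughly 5% in magnitude are emphasized and counted as Improved or Worsened.

```python
# Minimal sketch: recompute the "Change" column from the two throughput columns.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, in percent.
# Assumption: the 5% significance cutoff is inferred from the table, not documented.

ASSUMED_THRESHOLD_PCT = 5.0


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = ASSUMED_THRESHOLD_PCT) -> str:
    """Label a change as improved, worsened, or within noise."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD), both converted to KOps/s, copied from the table.
samples = {
    "test_keys_nested": (5.3827, 4.9928),
    "test_update_nested": (19.1059, 19.2196),
    "test_unbind_speed_stack1": (1069.7, 873.5872),  # 1.0697 MOps/s vs 873.5872 KOps/s
    "test_vmap_mlp_speed[True-True]": (0.4639766, 0.8104029),
}

for name, (ops_pr, ops_head) in samples.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both columns must be in the same unit before dividing, which matters for rows such as test_unbind_speed_stack1 where the table mixes MOps/s and KOps/s.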

@vmoens vmoens closed this Oct 21, 2023
@vmoens vmoens deleted the fix_update branch October 21, 2023 00:04
Labels
CLA Signed

2 participants