[Doc] Fix tutorials #1002

Merged: 3 commits, Sep 20, 2024

@vmoens (Contributor) commented Sep 20, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 20, 2024
ghstack-source-id: ec367ae3c2225d19af596a4522ddae137970e2bc
Pull Request resolved: #1002
@facebook-github-bot added the CLA Signed label Sep 20, 2024
@vmoens added the documentation label Sep 20, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 20, 2024
ghstack-source-id: f5d8b434fced22f58ef8abb39707936411569c4f
Pull Request resolved: #1002

github-actions bot commented Sep 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}21$. Worsened: $\large\color{#d91a1a}24$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 50.9150μs 20.9422μs 47.7505 KOps/s 49.9547 KOps/s $\color{#d91a1a}-4.41\%$
test_plain_set_stack_nested 54.2010μs 21.0390μs 47.5309 KOps/s 49.1599 KOps/s $\color{#d91a1a}-3.31\%$
test_plain_set_nested_inplace 88.2050μs 23.0156μs 43.4488 KOps/s 46.0603 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_plain_set_stack_nested_inplace 85.9710μs 22.9493μs 43.5743 KOps/s 46.1343 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_items 26.1590μs 4.1700μs 239.8108 KOps/s 240.0902 KOps/s $\color{#d91a1a}-0.12\%$
test_items_nested 0.7424ms 0.3630ms 2.7544 KOps/s 2.6994 KOps/s $\color{#35bf28}+2.04\%$
test_items_nested_locked 0.7644ms 0.3620ms 2.7627 KOps/s 2.6849 KOps/s $\color{#35bf28}+2.90\%$
test_items_nested_leaf 0.1215ms 67.1587μs 14.8901 KOps/s 14.6554 KOps/s $\color{#35bf28}+1.60\%$
test_items_stack_nested 0.5230ms 0.3642ms 2.7458 KOps/s 2.7333 KOps/s $\color{#35bf28}+0.46\%$
test_items_stack_nested_leaf 0.1340ms 69.4375μs 14.4014 KOps/s 14.1358 KOps/s $\color{#35bf28}+1.88\%$
test_items_stack_nested_locked 0.5605ms 0.3643ms 2.7446 KOps/s 2.7208 KOps/s $\color{#35bf28}+0.88\%$
test_keys 44.5030μs 3.4965μs 285.9981 KOps/s 264.7575 KOps/s $\textbf{\color{#35bf28}+8.02\%}$
test_keys_nested 0.1692ms 0.1041ms 9.6047 KOps/s 9.6856 KOps/s $\color{#d91a1a}-0.84\%$
test_keys_nested_locked 1.7653ms 0.1095ms 9.1334 KOps/s 9.0505 KOps/s $\color{#35bf28}+0.92\%$
test_keys_nested_leaf 0.1504ms 86.7673μs 11.5251 KOps/s 11.5036 KOps/s $\color{#35bf28}+0.19\%$
test_keys_stack_nested 0.1670ms 0.1024ms 9.7614 KOps/s 9.6094 KOps/s $\color{#35bf28}+1.58\%$
test_keys_stack_nested_leaf 0.1402ms 85.6740μs 11.6722 KOps/s 11.5630 KOps/s $\color{#35bf28}+0.94\%$
test_keys_stack_nested_locked 0.1760ms 0.1091ms 9.1672 KOps/s 9.2437 KOps/s $\color{#d91a1a}-0.83\%$
test_values 8.9386μs 1.0664μs 937.7176 KOps/s 948.9017 KOps/s $\color{#d91a1a}-1.18\%$
test_values_nested 0.1355ms 75.6490μs 13.2189 KOps/s 13.4006 KOps/s $\color{#d91a1a}-1.36\%$
test_values_nested_locked 0.1592ms 74.8300μs 13.3636 KOps/s 13.2400 KOps/s $\color{#35bf28}+0.93\%$
test_values_nested_leaf 0.1105ms 62.2922μs 16.0534 KOps/s 14.8192 KOps/s $\textbf{\color{#35bf28}+8.33\%}$
test_values_stack_nested 0.1261ms 75.2924μs 13.2816 KOps/s 13.1284 KOps/s $\color{#35bf28}+1.17\%$
test_values_stack_nested_leaf 0.1135ms 62.4543μs 16.0117 KOps/s 16.2001 KOps/s $\color{#d91a1a}-1.16\%$
test_values_stack_nested_locked 0.1414ms 78.2294μs 12.7829 KOps/s 13.0710 KOps/s $\color{#d91a1a}-2.20\%$
test_membership 2.9505μs 0.7310μs 1.3680 MOps/s 1.1053 MOps/s $\textbf{\color{#35bf28}+23.77\%}$
test_membership_nested 24.1160μs 2.8272μs 353.7029 KOps/s 360.8147 KOps/s $\color{#d91a1a}-1.97\%$
test_membership_nested_leaf 44.5030μs 2.7752μs 360.3381 KOps/s 360.1785 KOps/s $\color{#35bf28}+0.04\%$
test_membership_stacked_nested 18.1430μs 2.8182μs 354.8307 KOps/s 357.8784 KOps/s $\color{#d91a1a}-0.85\%$
test_membership_stacked_nested_leaf 24.5760μs 2.8123μs 355.5783 KOps/s 354.2827 KOps/s $\color{#35bf28}+0.37\%$
test_membership_nested_last 25.6070μs 4.1103μs 243.2935 KOps/s 247.5698 KOps/s $\color{#d91a1a}-1.73\%$
test_membership_nested_leaf_last 47.9000μs 4.0724μs 245.5572 KOps/s 247.9503 KOps/s $\color{#d91a1a}-0.97\%$
test_membership_stacked_nested_last 22.5020μs 4.0264μs 248.3615 KOps/s 179.7133 KOps/s $\textbf{\color{#35bf28}+38.20\%}$
test_membership_stacked_nested_leaf_last 26.2790μs 4.0374μs 247.6845 KOps/s 177.5468 KOps/s $\textbf{\color{#35bf28}+39.50\%}$
test_nested_getleaf 31.6190μs 10.6119μs 94.2339 KOps/s 95.1425 KOps/s $\color{#d91a1a}-0.96\%$
test_nested_get 48.2600μs 10.2875μs 97.2056 KOps/s 99.1595 KOps/s $\color{#d91a1a}-1.97\%$
test_stacked_getleaf 50.5950μs 10.6645μs 93.7693 KOps/s 96.0132 KOps/s $\color{#d91a1a}-2.34\%$
test_stacked_get 27.0000μs 10.1230μs 98.7852 KOps/s 99.5548 KOps/s $\color{#d91a1a}-0.77\%$
test_nested_getitemleaf 60.8680μs 10.9839μs 91.0421 KOps/s 91.3099 KOps/s $\color{#d91a1a}-0.29\%$
test_nested_getitem 45.1040μs 10.3465μs 96.6513 KOps/s 97.8467 KOps/s $\color{#d91a1a}-1.22\%$
test_stacked_getitemleaf 36.2480μs 11.2269μs 89.0715 KOps/s 92.5016 KOps/s $\color{#d91a1a}-3.71\%$
test_stacked_getitem 39.2530μs 10.4849μs 95.3754 KOps/s 96.4533 KOps/s $\color{#d91a1a}-1.12\%$
test_lock_nested 90.1331ms 0.5880ms 1.7006 KOps/s 1.9354 KOps/s $\textbf{\color{#d91a1a}-12.14\%}$
test_lock_stack_nested 0.7883ms 0.4694ms 2.1305 KOps/s 2.0255 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_unlock_nested 86.2177ms 0.5049ms 1.9808 KOps/s 2.2134 KOps/s $\textbf{\color{#d91a1a}-10.51\%}$
test_unlock_stack_nested 0.7411ms 0.3864ms 2.5882 KOps/s 2.4793 KOps/s $\color{#35bf28}+4.39\%$
test_flatten_speed 0.1790ms 88.5021μs 11.2992 KOps/s 11.4111 KOps/s $\color{#d91a1a}-0.98\%$
test_unflatten_speed 0.5469ms 0.4620ms 2.1643 KOps/s 2.1940 KOps/s $\color{#d91a1a}-1.35\%$
test_common_ops 3.4160ms 1.1756ms 850.6209 Ops/s 869.0319 Ops/s $\color{#d91a1a}-2.12\%$
test_creation 15.3690μs 2.0616μs 485.0613 KOps/s 487.6667 KOps/s $\color{#d91a1a}-0.53\%$
test_creation_empty 61.7450μs 19.4422μs 51.4345 KOps/s 56.4903 KOps/s $\textbf{\color{#d91a1a}-8.95\%}$
test_creation_nested_1 70.1210μs 22.5491μs 44.3476 KOps/s 48.5955 KOps/s $\textbf{\color{#d91a1a}-8.74\%}$
test_creation_nested_2 69.8000μs 27.1192μs 36.8743 KOps/s 38.1289 KOps/s $\color{#d91a1a}-3.29\%$
test_clone 1.2606ms 17.3588μs 57.6078 KOps/s 57.9628 KOps/s $\color{#d91a1a}-0.61\%$
test_getitem[int] 0.7328ms 16.7713μs 59.6258 KOps/s 60.2951 KOps/s $\color{#d91a1a}-1.11\%$
test_getitem[slice_int] 0.1396ms 32.0558μs 31.1956 KOps/s 31.6875 KOps/s $\color{#d91a1a}-1.55\%$
test_getitem[range] 0.2048ms 58.9089μs 16.9753 KOps/s 16.5590 KOps/s $\color{#35bf28}+2.51\%$
test_getitem[tuple] 0.1387ms 26.1242μs 38.2787 KOps/s 39.5633 KOps/s $\color{#d91a1a}-3.25\%$
test_getitem[list] 0.1685ms 53.3780μs 18.7343 KOps/s 17.8227 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_setitem_dim[int] 68.1270μs 34.0252μs 29.3900 KOps/s 30.6141 KOps/s $\color{#d91a1a}-4.00\%$
test_setitem_dim[slice_int] 0.1020ms 63.1812μs 15.8275 KOps/s 16.1129 KOps/s $\color{#d91a1a}-1.77\%$
test_setitem_dim[range] 0.1261ms 84.3810μs 11.8510 KOps/s 11.5296 KOps/s $\color{#35bf28}+2.79\%$
test_setitem_dim[tuple] 0.1718ms 51.4708μs 19.4285 KOps/s 20.6308 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_setitem 0.1093ms 31.6416μs 31.6040 KOps/s 33.2174 KOps/s $\color{#d91a1a}-4.86\%$
test_set 0.1211ms 30.5778μs 32.7035 KOps/s 34.0028 KOps/s $\color{#d91a1a}-3.82\%$
test_set_shared 1.2691ms 0.2102ms 4.7585 KOps/s 4.4257 KOps/s $\textbf{\color{#35bf28}+7.52\%}$
test_update 0.2068ms 38.2109μs 26.1706 KOps/s 27.4368 KOps/s $\color{#d91a1a}-4.62\%$
test_update_nested 0.1334ms 48.6194μs 20.5679 KOps/s 21.1185 KOps/s $\color{#d91a1a}-2.61\%$
test_update__nested 86.1810μs 36.2239μs 27.6061 KOps/s 28.8879 KOps/s $\color{#d91a1a}-4.44\%$
test_set_nested 0.1296ms 33.2902μs 30.0389 KOps/s 31.3456 KOps/s $\color{#d91a1a}-4.17\%$
test_set_nested_new 94.1360μs 38.2743μs 26.1272 KOps/s 26.3104 KOps/s $\color{#d91a1a}-0.70\%$
test_select 0.1854ms 55.4104μs 18.0471 KOps/s 18.3822 KOps/s $\color{#d91a1a}-1.82\%$
test_select_nested 0.1175ms 59.3658μs 16.8447 KOps/s 17.0400 KOps/s $\color{#d91a1a}-1.15\%$
test_exclude_nested 0.1402ms 75.3005μs 13.2801 KOps/s 13.5507 KOps/s $\color{#d91a1a}-2.00\%$
test_empty[True] 1.0550ms 0.3282ms 3.0468 KOps/s 3.0975 KOps/s $\color{#d91a1a}-1.64\%$
test_empty[False] 11.0182μs 1.2227μs 817.8452 KOps/s 838.3711 KOps/s $\color{#d91a1a}-2.45\%$
test_unbind_speed 0.5314ms 0.3067ms 3.2610 KOps/s 3.3171 KOps/s $\color{#d91a1a}-1.69\%$
test_unbind_speed_stack0 0.4142ms 0.3052ms 3.2766 KOps/s 3.4238 KOps/s $\color{#d91a1a}-4.30\%$
test_unbind_speed_stack1 93.0003ms 0.8167ms 1.2244 KOps/s 1.3116 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_split 97.4449ms 2.2091ms 452.6718 Ops/s 445.3663 Ops/s $\color{#35bf28}+1.64\%$
test_chunk 2.2256ms 2.0270ms 493.3320 Ops/s 440.4183 Ops/s $\textbf{\color{#35bf28}+12.01\%}$
test_creation[device0] 0.2495ms 0.1195ms 8.3708 KOps/s 8.3775 KOps/s $\color{#d91a1a}-0.08\%$
test_creation_from_tensor 4.1200ms 0.1186ms 8.4343 KOps/s 8.3746 KOps/s $\color{#35bf28}+0.71\%$
test_add_one[memmap_tensor0] 0.1816ms 7.3740μs 135.6112 KOps/s 133.8599 KOps/s $\color{#35bf28}+1.31\%$
test_contiguous[memmap_tensor0] 19.4160μs 1.9406μs 515.3049 KOps/s 512.2941 KOps/s $\color{#35bf28}+0.59\%$
test_stack[memmap_tensor0] 44.6830μs 5.9926μs 166.8711 KOps/s 175.9605 KOps/s $\textbf{\color{#d91a1a}-5.17\%}$
test_memmaptd_index 1.1496ms 0.4047ms 2.4712 KOps/s 2.4589 KOps/s $\color{#35bf28}+0.50\%$
test_memmaptd_index_astensor 0.8359ms 0.4826ms 2.0721 KOps/s 2.0493 KOps/s $\color{#35bf28}+1.11\%$
test_memmaptd_index_op 1.4359ms 1.0443ms 957.5945 Ops/s 964.2324 Ops/s $\color{#d91a1a}-0.69\%$
test_serialize_model 0.1294s 0.1233s 8.1129 Ops/s 7.8302 Ops/s $\color{#35bf28}+3.61\%$
test_serialize_model_pickle 0.4495s 0.3910s 2.5575 Ops/s 2.5087 Ops/s $\color{#35bf28}+1.94\%$
test_serialize_weights 0.1236s 0.1165s 8.5857 Ops/s 8.1212 Ops/s $\textbf{\color{#35bf28}+5.72\%}$
test_serialize_weights_returnearly 0.2609s 0.1712s 5.8411 Ops/s 5.4722 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_serialize_weights_pickle 0.5271s 0.4618s 2.1654 Ops/s 2.5710 Ops/s $\textbf{\color{#d91a1a}-15.77\%}$
test_serialize_weights_filesystem 0.1559s 0.1428s 7.0051 Ops/s 6.6461 Ops/s $\textbf{\color{#35bf28}+5.40\%}$
test_serialize_model_filesystem 0.1505s 0.1430s 6.9916 Ops/s 6.2365 Ops/s $\textbf{\color{#35bf28}+12.11\%}$
test_reshape_pytree 0.1014ms 39.4380μs 25.3563 KOps/s 23.4690 KOps/s $\textbf{\color{#35bf28}+8.04\%}$
test_reshape_td 0.1022ms 48.4440μs 20.6424 KOps/s 18.9675 KOps/s $\textbf{\color{#35bf28}+8.83\%}$
test_view_pytree 0.1047ms 38.9299μs 25.6872 KOps/s 26.0874 KOps/s $\color{#d91a1a}-1.53\%$
test_view_td 0.1388ms 53.6277μs 18.6471 KOps/s 19.1346 KOps/s $\color{#d91a1a}-2.55\%$
test_unbind_pytree 78.8470μs 37.1957μs 26.8848 KOps/s 28.4758 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_unbind_td 0.3225ms 45.7314μs 21.8668 KOps/s 22.5449 KOps/s $\color{#d91a1a}-3.01\%$
test_split_pytree 97.9320μs 39.7142μs 25.1799 KOps/s 26.3415 KOps/s $\color{#d91a1a}-4.41\%$
test_split_td 0.5079ms 59.8549μs 16.7071 KOps/s 16.9279 KOps/s $\color{#d91a1a}-1.30\%$
test_add_pytree 92.0320μs 44.7307μs 22.3560 KOps/s 22.5856 KOps/s $\color{#d91a1a}-1.02\%$
test_add_td 0.4324ms 89.8034μs 11.1354 KOps/s 12.4216 KOps/s $\textbf{\color{#d91a1a}-10.35\%}$
test_compile_add_one_nested[tensordict-compile] 0.1059ms 56.6927μs 17.6390 KOps/s 17.1138 KOps/s $\color{#35bf28}+3.07\%$
test_compile_add_one_nested[tensordict-eager] 0.4050ms 0.1796ms 5.5689 KOps/s 5.7350 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_add_one_nested[pytree-compile] 0.1065ms 57.3755μs 17.4290 KOps/s 17.6827 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_add_one_nested[pytree-eager] 0.3270ms 0.1422ms 7.0335 KOps/s 7.1085 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_copy_nested[tensordict-compile] 86.1610μs 21.1636μs 47.2510 KOps/s 47.2521 KOps/s $-0.00\%$
test_compile_copy_nested[tensordict-eager] 0.1646ms 69.2960μs 14.4308 KOps/s 15.0455 KOps/s $\color{#d91a1a}-4.09\%$
test_compile_copy_nested[pytree-compile] 0.1546ms 76.4402μs 13.0821 KOps/s 13.1880 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_copy_nested[pytree-eager] 0.1498ms 70.0896μs 14.2675 KOps/s 14.5761 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_add_one_flat[tensordict-compile] 0.4211ms 0.1748ms 5.7223 KOps/s 5.6701 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[tensordict-eager] 0.3473ms 0.1922ms 5.2018 KOps/s 5.2278 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_one_flat[tensorclass-compile] 0.1181ms 47.0650μs 21.2472 KOps/s 21.0094 KOps/s $\color{#35bf28}+1.13\%$
test_compile_add_one_flat[tensorclass-eager] 0.1364ms 68.9064μs 14.5124 KOps/s 14.0226 KOps/s $\color{#35bf28}+3.49\%$
test_compile_add_one_flat[pytree-compile] 0.2725ms 0.1745ms 5.7294 KOps/s 5.6575 KOps/s $\color{#35bf28}+1.27\%$
test_compile_add_one_flat[pytree-eager] 0.4482ms 0.2866ms 3.4894 KOps/s 3.4349 KOps/s $\color{#35bf28}+1.59\%$
test_compile_add_self_flat[tensordict-eager] 0.6553ms 0.2037ms 4.9101 KOps/s 4.9877 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_add_self_flat[tensordict-compile] 0.3371ms 0.1751ms 5.7119 KOps/s 5.6504 KOps/s $\color{#35bf28}+1.09\%$
test_compile_add_self_flat[tensorclass-eager] 0.1374ms 63.1408μs 15.8376 KOps/s 15.9471 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_self_flat[tensorclass-compile] 0.1147ms 46.8723μs 21.3346 KOps/s 21.2173 KOps/s $\color{#35bf28}+0.55\%$
test_compile_add_self_flat[pytree-eager] 0.3302ms 0.2330ms 4.2920 KOps/s 4.4145 KOps/s $\color{#d91a1a}-2.78\%$
test_compile_add_self_flat[pytree-compile] 0.2787ms 0.1774ms 5.6375 KOps/s 5.7522 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_copy_flat[tensordict-compile] 0.1922ms 0.1019ms 9.8169 KOps/s 9.6385 KOps/s $\color{#35bf28}+1.85\%$
test_compile_copy_flat[tensordict-eager] 0.1333ms 58.2801μs 17.1585 KOps/s 17.9553 KOps/s $\color{#d91a1a}-4.44\%$
test_compile_copy_flat[pytree-compile] 0.1583ms 79.3341μs 12.6049 KOps/s 12.9353 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_copy_flat[pytree-eager] 0.1475ms 72.3772μs 13.8165 KOps/s 14.4795 KOps/s $\color{#d91a1a}-4.58\%$
test_compile_assign_and_add[tensordict-compile] 0.3841ms 0.1928ms 5.1870 KOps/s 5.0692 KOps/s $\color{#35bf28}+2.32\%$
test_compile_assign_and_add[tensordict-eager] 1.8766ms 1.6296ms 613.6539 Ops/s 585.5582 Ops/s $\color{#35bf28}+4.80\%$
test_compile_assign_and_add[pytree-compile] 0.2997ms 0.1913ms 5.2286 KOps/s 4.9080 KOps/s $\textbf{\color{#35bf28}+6.53\%}$
test_compile_assign_and_add[pytree-eager] 1.7940ms 1.0997ms 909.3081 Ops/s 893.4362 Ops/s $\color{#35bf28}+1.78\%$
test_compile_assign_and_add_stack[compile] 0.4764ms 0.4090ms 2.4452 KOps/s 2.3898 KOps/s $\color{#35bf28}+2.32\%$
test_compile_assign_and_add_stack[eager] 5.9345ms 3.8733ms 258.1793 Ops/s 266.7563 Ops/s $\color{#d91a1a}-3.22\%$
test_compile_indexing[tensor-tensordict-compile] 84.5180μs 34.9633μs 28.6014 KOps/s 28.0705 KOps/s $\color{#35bf28}+1.89\%$
test_compile_indexing[tensor-tensordict-eager] 0.9896ms 48.5062μs 20.6159 KOps/s 19.9617 KOps/s $\color{#35bf28}+3.28\%$
test_compile_indexing[tensor-tensorclass-compile] 82.8540μs 30.6162μs 32.6625 KOps/s 33.2211 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_indexing[tensor-tensorclass-eager] 72.8660μs 29.1219μs 34.3384 KOps/s 33.7523 KOps/s $\color{#35bf28}+1.74\%$
test_compile_indexing[tensor-pytree-compile] 82.5250μs 30.2672μs 33.0390 KOps/s 33.6942 KOps/s $\color{#d91a1a}-1.94\%$
test_compile_indexing[tensor-pytree-eager] 69.2490μs 29.0374μs 34.4384 KOps/s 33.7849 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[slice-tensordict-compile] 0.1360ms 74.9286μs 13.3460 KOps/s 13.6221 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_indexing[slice-tensordict-eager] 0.6163ms 28.7091μs 34.8321 KOps/s 35.4066 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[slice-tensorclass-compile] 0.1452ms 69.6398μs 14.3596 KOps/s 14.9640 KOps/s $\color{#d91a1a}-4.04\%$
test_compile_indexing[slice-tensorclass-eager] 75.0500μs 23.7835μs 42.0460 KOps/s 44.0071 KOps/s $\color{#d91a1a}-4.46\%$
test_compile_indexing[slice-pytree-compile] 0.1922ms 70.3053μs 14.2237 KOps/s 14.6809 KOps/s $\color{#d91a1a}-3.11\%$
test_compile_indexing[slice-pytree-eager] 62.4470μs 23.9712μs 41.7168 KOps/s 44.3005 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_compile_indexing[int-tensordict-compile] 0.1490ms 74.6734μs 13.3917 KOps/s 13.5173 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_indexing[int-tensordict-eager] 0.9882ms 28.8305μs 34.6855 KOps/s 35.7716 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_indexing[int-tensorclass-compile] 0.1546ms 70.9110μs 14.1022 KOps/s 14.7487 KOps/s $\color{#d91a1a}-4.38\%$
test_compile_indexing[int-tensorclass-eager] 0.1037ms 24.2505μs 41.2362 KOps/s 43.7334 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_compile_indexing[int-pytree-compile] 0.1332ms 69.9140μs 14.3033 KOps/s 14.9337 KOps/s $\color{#d91a1a}-4.22\%$
test_compile_indexing[int-pytree-eager] 80.4100μs 24.6246μs 40.6097 KOps/s 44.0670 KOps/s $\textbf{\color{#d91a1a}-7.85\%}$
test_mod_add[eager] 89.8180μs 26.7189μs 37.4267 KOps/s 39.6476 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_mod_add[compile] 0.1126ms 38.7968μs 25.7753 KOps/s 24.7317 KOps/s $\color{#35bf28}+4.22\%$
test_mod_add[compile-overhead] 94.8070μs 38.7815μs 25.7855 KOps/s 25.0741 KOps/s $\color{#35bf28}+2.84\%$
test_mod_wrap[eager] 1.4465ms 0.2685ms 3.7244 KOps/s 4.8752 KOps/s $\textbf{\color{#d91a1a}-23.60\%}$
test_mod_wrap[compile] 0.3431ms 0.2376ms 4.2080 KOps/s 4.2802 KOps/s $\color{#d91a1a}-1.69\%$
test_mod_wrap[compile-overhead] 0.3749ms 0.2371ms 4.2184 KOps/s 4.2647 KOps/s $\color{#d91a1a}-1.09\%$
test_mod_wrap_and_backward[eager] 13.1003ms 10.6610ms 93.7994 Ops/s 85.8110 Ops/s $\textbf{\color{#35bf28}+9.31\%}$
test_mod_wrap_and_backward[compile] 11.7590ms 10.5422ms 94.8569 Ops/s 80.0324 Ops/s $\textbf{\color{#35bf28}+18.52\%}$
test_mod_wrap_and_backward[compile-overhead] 11.3393ms 10.5987ms 94.3512 Ops/s 90.1772 Ops/s $\color{#35bf28}+4.63\%$
test_seq_add[eager] 0.2050ms 94.9824μs 10.5283 KOps/s 10.9960 KOps/s $\color{#d91a1a}-4.25\%$
test_seq_add[compile] 0.1319ms 64.3500μs 15.5400 KOps/s 14.9021 KOps/s $\color{#35bf28}+4.28\%$
test_seq_add[compile-overhead] 0.1355ms 62.5962μs 15.9754 KOps/s 15.2577 KOps/s $\color{#35bf28}+4.70\%$
test_seq_wrap[eager] 0.7317ms 0.3975ms 2.5157 KOps/s 2.6462 KOps/s $\color{#d91a1a}-4.93\%$
test_seq_wrap[compile] 1.3682ms 0.2730ms 3.6624 KOps/s 3.6577 KOps/s $\color{#35bf28}+0.13\%$
test_seq_wrap[compile-overhead] 1.1926ms 0.2720ms 3.6759 KOps/s 3.6278 KOps/s $\color{#35bf28}+1.33\%$
test_func_call_runtime[False-eager] 0.9227ms 0.5338ms 1.8734 KOps/s 1.9518 KOps/s $\color{#d91a1a}-4.01\%$
test_func_call_runtime[False-compile] 0.6458ms 0.5067ms 1.9735 KOps/s 1.9949 KOps/s $\color{#d91a1a}-1.07\%$
test_func_call_runtime[False-compile-overhead] 0.8282ms 0.5074ms 1.9707 KOps/s 2.0297 KOps/s $\color{#d91a1a}-2.91\%$
test_func_call_runtime[True-eager] 1.2524ms 0.7556ms 1.3234 KOps/s 1.3789 KOps/s $\color{#d91a1a}-4.03\%$
test_func_call_runtime[True-compile] 0.9227ms 0.5210ms 1.9193 KOps/s 1.9609 KOps/s $\color{#d91a1a}-2.12\%$
test_func_call_runtime[True-compile-overhead] 1.0311ms 0.5200ms 1.9231 KOps/s 1.9561 KOps/s $\color{#d91a1a}-1.69\%$
test_func_call_cm_runtime[False-eager] 0.8036ms 0.5272ms 1.8967 KOps/s 1.9867 KOps/s $\color{#d91a1a}-4.53\%$
test_func_call_cm_runtime[False-compile] 0.6052ms 0.5053ms 1.9788 KOps/s 2.0108 KOps/s $\color{#d91a1a}-1.59\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6300ms 0.5066ms 1.9739 KOps/s 2.0189 KOps/s $\color{#d91a1a}-2.23\%$
test_func_call_cm_runtime[True-eager] 0.9912ms 0.8838ms 1.1315 KOps/s 1.1887 KOps/s $\color{#d91a1a}-4.81\%$
test_func_call_cm_runtime[True-compile] 0.8690ms 0.7507ms 1.3322 KOps/s 1.3803 KOps/s $\color{#d91a1a}-3.49\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9517ms 0.7530ms 1.3281 KOps/s 1.3688 KOps/s $\color{#d91a1a}-2.98\%$
test_vmap_func_call_cm_runtime[eager] 2.3811ms 1.8816ms 531.4738 Ops/s 540.1841 Ops/s $\color{#d91a1a}-1.61\%$
test_vmap_func_call_cm_runtime[compile] 5.0364ms 1.9824ms 504.4395 Ops/s 526.1805 Ops/s $\color{#d91a1a}-4.13\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.0235ms 1.9410ms 515.2035 Ops/s 527.7360 Ops/s $\color{#d91a1a}-2.37\%$
test_distributed 0.2707ms 0.1267ms 7.8952 KOps/s 7.8628 KOps/s $\color{#35bf28}+0.41\%$
test_tdmodule 75.9010μs 19.4638μs 51.3774 KOps/s 54.9887 KOps/s $\textbf{\color{#d91a1a}-6.57\%}$
test_tdmodule_dispatch 68.1670μs 37.6325μs 26.5728 KOps/s 28.7207 KOps/s $\textbf{\color{#d91a1a}-7.48\%}$
test_tdseq 0.1401ms 22.7669μs 43.9234 KOps/s 48.1664 KOps/s $\textbf{\color{#d91a1a}-8.81\%}$
test_tdseq_dispatch 80.8110μs 43.4535μs 23.0131 KOps/s 24.4607 KOps/s $\textbf{\color{#d91a1a}-5.92\%}$
test_instantiation_functorch 1.8659ms 1.6075ms 622.0661 Ops/s 636.3732 Ops/s $\color{#d91a1a}-2.25\%$
test_instantiation_td 2.4807ms 1.1815ms 846.4111 Ops/s 863.9451 Ops/s $\color{#d91a1a}-2.03\%$
test_exec_functorch 0.4180ms 0.1918ms 5.2146 KOps/s 5.4634 KOps/s $\color{#d91a1a}-4.56\%$
test_exec_functional_call 0.2742ms 0.1779ms 5.6199 KOps/s 5.8148 KOps/s $\color{#d91a1a}-3.35\%$
test_exec_td 0.2899ms 0.1752ms 5.7070 KOps/s 5.9183 KOps/s $\color{#d91a1a}-3.57\%$
test_exec_td_decorator 0.3348ms 0.2298ms 4.3507 KOps/s 4.5387 KOps/s $\color{#d91a1a}-4.14\%$
test_vmap_mlp_speed[True-True] 1.3829ms 0.6527ms 1.5321 KOps/s 1.5115 KOps/s $\color{#35bf28}+1.36\%$
test_vmap_mlp_speed[True-False] 2.3339ms 0.6634ms 1.5074 KOps/s 1.5755 KOps/s $\color{#d91a1a}-4.32\%$
test_vmap_mlp_speed[False-True] 0.7937ms 0.5025ms 1.9900 KOps/s 2.0235 KOps/s $\color{#d91a1a}-1.65\%$
test_vmap_mlp_speed[False-False] 0.8889ms 0.5026ms 1.9895 KOps/s 2.0246 KOps/s $\color{#d91a1a}-1.73\%$
test_vmap_mlp_speed_decorator[True-True] 1.1980ms 0.6315ms 1.5836 KOps/s 1.6205 KOps/s $\color{#d91a1a}-2.28\%$
test_vmap_mlp_speed_decorator[True-False] 0.9665ms 0.6364ms 1.5714 KOps/s 1.6268 KOps/s $\color{#d91a1a}-3.40\%$
test_vmap_mlp_speed_decorator[False-True] 0.6565ms 0.5158ms 1.9386 KOps/s 1.9756 KOps/s $\color{#d91a1a}-1.87\%$
test_vmap_mlp_speed_decorator[False-False] 0.7351ms 0.5173ms 1.9330 KOps/s 1.9782 KOps/s $\color{#d91a1a}-2.28\%$
test_to_module_speed[True] 1.6304ms 1.2962ms 771.5075 Ops/s 772.3326 Ops/s $\color{#d91a1a}-0.11\%$
test_to_module_speed[False] 1.7617ms 1.2642ms 790.9944 Ops/s 806.5972 Ops/s $\color{#d91a1a}-1.93\%$
test_tc_init 0.1355ms 47.8081μs 20.9169 KOps/s 23.6653 KOps/s $\textbf{\color{#d91a1a}-11.61\%}$
test_tc_init_nested 0.1820ms 92.4141μs 10.8209 KOps/s 11.8406 KOps/s $\textbf{\color{#d91a1a}-8.61\%}$
test_tc_first_layer_tensor 17.5530μs 1.5713μs 636.4197 KOps/s 636.6084 KOps/s $\color{#d91a1a}-0.03\%$
test_tc_first_layer_nontensor 47.9290μs 4.8186μs 207.5303 KOps/s 210.2810 KOps/s $\color{#d91a1a}-1.31\%$
test_tc_second_layer_tensor 28.1320μs 2.9238μs 342.0242 KOps/s 351.8876 KOps/s $\color{#d91a1a}-2.80\%$
test_tc_second_layer_nontensor 39.9140μs 6.1374μs 162.9346 KOps/s 166.0407 KOps/s $\color{#d91a1a}-1.87\%$
test_unbind 0.4866s 13.4999ms 74.0747 Ops/s 75.7689 Ops/s $\color{#d91a1a}-2.24\%$
test_full_like 8.1448ms 7.4555ms 134.1290 Ops/s 140.5532 Ops/s $\color{#d91a1a}-4.57\%$
test_zeros_like 3.3205ms 2.8546ms 350.3088 Ops/s 171.2675 Ops/s $\textbf{\color{#35bf28}+104.54\%}$
test_ones_like 14.1564ms 6.2203ms 160.7628 Ops/s 123.1162 Ops/s $\textbf{\color{#35bf28}+30.58\%}$
test_clone 13.4694ms 8.1081ms 123.3330 Ops/s 102.7092 Ops/s $\textbf{\color{#35bf28}+20.08\%}$
test_squeeze 65.0010μs 12.6709μs 78.9210 KOps/s 79.3441 KOps/s $\color{#d91a1a}-0.53\%$
test_unsqueeze 0.2932ms 95.0325μs 10.5227 KOps/s 11.2104 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_split 0.3726ms 0.2024ms 4.9414 KOps/s 5.1710 KOps/s $\color{#d91a1a}-4.44\%$
test_permute 0.3787ms 0.2288ms 4.3710 KOps/s 4.5803 KOps/s $\color{#d91a1a}-4.57\%$
test_stack 25.9392ms 25.4442ms 39.3016 Ops/s 39.8631 Ops/s $\color{#d91a1a}-1.41\%$
test_cat 29.8595ms 25.4536ms 39.2872 Ops/s 40.1294 Ops/s $\color{#d91a1a}-2.10\%$
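
For readers checking the table by hand: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD, and the bold entries appear to be those whose magnitude exceeds 5%. Both points are inferred from the numbers above, not from the benchmark workflow itself; the count of bold rows does match the "Improved: 21" and "Worsened: 24" totals. The sketch below reproduces a few rows under those assumptions. The helper names `relative_change` and `classify` and the 5% threshold are hypothetical and are not taken from the CI script.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD.

    Both arguments must be in the same unit (e.g. both KOps/s).
    """
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (47.7505, 49.9547),          # KOps/s
    "test_plain_set_nested_inplace": (43.4488, 46.0603),  # KOps/s
    "test_zeros_like": (350.3088, 171.2675),              # Ops/s
}

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units (one value in KOps/s, the other in Ops/s) need converting to a common unit before applying the formula.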

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 20, 2024
ghstack-source-id: d4117bb329425145bc781b457b8de43a67a53732
Pull Request resolved: #1002

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}27$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1378ms 12.8442μs 77.8561 KOps/s 69.3485 KOps/s $\textbf{\color{#35bf28}+12.27\%}$
test_plain_set_stack_nested 42.4110μs 12.9810μs 77.0356 KOps/s 68.4314 KOps/s $\textbf{\color{#35bf28}+12.57\%}$
test_plain_set_nested_inplace 49.8210μs 14.0210μs 71.3214 KOps/s 64.3909 KOps/s $\textbf{\color{#35bf28}+10.76\%}$
test_plain_set_stack_nested_inplace 53.6710μs 14.0912μs 70.9663 KOps/s 64.8776 KOps/s $\textbf{\color{#35bf28}+9.38\%}$
test_items 29.7800μs 2.8791μs 347.3282 KOps/s 346.9299 KOps/s $\color{#35bf28}+0.11\%$
test_items_nested 0.5044ms 0.3380ms 2.9584 KOps/s 3.0043 KOps/s $\color{#d91a1a}-1.53\%$
test_items_nested_locked 0.3825ms 0.3393ms 2.9471 KOps/s 3.0093 KOps/s $\color{#d91a1a}-2.07\%$
test_items_nested_leaf 0.1045ms 55.1898μs 18.1193 KOps/s 18.0396 KOps/s $\color{#35bf28}+0.44\%$
test_items_stack_nested 0.3948ms 0.3386ms 2.9529 KOps/s 2.9934 KOps/s $\color{#d91a1a}-1.35\%$
test_items_stack_nested_leaf 84.1810μs 57.2094μs 17.4797 KOps/s 17.5385 KOps/s $\color{#d91a1a}-0.34\%$
test_items_stack_nested_locked 0.3873ms 0.3400ms 2.9411 KOps/s 2.9790 KOps/s $\color{#d91a1a}-1.27\%$
test_keys 36.5500μs 3.4279μs 291.7226 KOps/s 290.7935 KOps/s $\color{#35bf28}+0.32\%$
test_keys_nested 79.0210μs 56.6416μs 17.6549 KOps/s 18.2984 KOps/s $\color{#d91a1a}-3.52\%$
test_keys_nested_locked 2.2819ms 62.8517μs 15.9105 KOps/s 16.3422 KOps/s $\color{#d91a1a}-2.64\%$
test_keys_nested_leaf 80.6520μs 47.7372μs 20.9480 KOps/s 21.1694 KOps/s $\color{#d91a1a}-1.05\%$
test_keys_stack_nested 92.7310μs 56.9093μs 17.5718 KOps/s 17.7897 KOps/s $\color{#d91a1a}-1.22\%$
test_keys_stack_nested_leaf 80.8810μs 48.7191μs 20.5258 KOps/s 20.7289 KOps/s $\color{#d91a1a}-0.98\%$
test_keys_stack_nested_locked 90.7320μs 62.5976μs 15.9750 KOps/s 16.1749 KOps/s $\color{#d91a1a}-1.24\%$
test_values 4.5667μs 0.8423μs 1.1872 MOps/s 1.1749 MOps/s $\color{#35bf28}+1.05\%$
test_values_nested 67.3310μs 40.9533μs 24.4180 KOps/s 24.6165 KOps/s $\color{#d91a1a}-0.81\%$
test_values_nested_locked 71.3820μs 42.6249μs 23.4605 KOps/s 23.4230 KOps/s $\color{#35bf28}+0.16\%$
test_values_nested_leaf 66.3820μs 35.4662μs 28.1959 KOps/s 28.2676 KOps/s $\color{#d91a1a}-0.25\%$
test_values_stack_nested 98.7920μs 41.8214μs 23.9112 KOps/s 23.7514 KOps/s $\color{#35bf28}+0.67\%$
test_values_stack_nested_leaf 67.4610μs 36.1725μs 27.6453 KOps/s 27.7243 KOps/s $\color{#d91a1a}-0.28\%$
test_values_stack_nested_locked 65.5510μs 43.1812μs 23.1582 KOps/s 22.9219 KOps/s $\color{#35bf28}+1.03\%$
test_membership 1.7425μs 0.5002μs 1.9994 MOps/s 1.9878 MOps/s $\color{#35bf28}+0.58\%$
test_membership_nested 16.7550μs 1.9145μs 522.3327 KOps/s 524.6353 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_nested_leaf 14.8805μs 1.9490μs 513.0753 KOps/s 538.0806 KOps/s $\color{#d91a1a}-4.65\%$
test_membership_stacked_nested 27.0410μs 1.9692μs 507.8312 KOps/s 521.3813 KOps/s $\color{#d91a1a}-2.60\%$
test_membership_stacked_nested_leaf 22.4900μs 1.9567μs 511.0634 KOps/s 518.0586 KOps/s $\color{#d91a1a}-1.35\%$
test_membership_nested_last 28.2700μs 2.8124μs 355.5650 KOps/s 356.8045 KOps/s $\color{#d91a1a}-0.35\%$
test_membership_nested_leaf_last 32.2910μs 2.8105μs 355.8138 KOps/s 359.3570 KOps/s $\color{#d91a1a}-0.99\%$
test_membership_stacked_nested_last 26.9300μs 4.4722μs 223.6039 KOps/s 235.1275 KOps/s $\color{#d91a1a}-4.90\%$
test_membership_stacked_nested_leaf_last 91.3510μs 4.4606μs 224.1858 KOps/s 235.7149 KOps/s $\color{#d91a1a}-4.89\%$
test_nested_getleaf 26.1200μs 6.1094μs 163.6830 KOps/s 165.2221 KOps/s $\color{#d91a1a}-0.93\%$
test_nested_get 29.1300μs 5.7187μs 174.8648 KOps/s 173.9090 KOps/s $\color{#35bf28}+0.55\%$
test_stacked_getleaf 26.1700μs 6.0999μs 163.9359 KOps/s 163.0027 KOps/s $\color{#35bf28}+0.57\%$
test_stacked_get 25.4800μs 5.6844μs 175.9202 KOps/s 174.9278 KOps/s $\color{#35bf28}+0.57\%$
test_nested_getitemleaf 30.2500μs 6.2340μs 160.4109 KOps/s 162.1122 KOps/s $\color{#d91a1a}-1.05\%$
test_nested_getitem 37.6000μs 5.7999μs 172.4179 KOps/s 172.8718 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_getitemleaf 29.8000μs 6.1500μs 162.6023 KOps/s 162.4149 KOps/s $\color{#35bf28}+0.12\%$
test_stacked_getitem 22.9810μs 5.6673μs 176.4522 KOps/s 173.5859 KOps/s $\color{#35bf28}+1.65\%$
test_lock_nested 4.5268ms 0.4267ms 2.3438 KOps/s 2.3908 KOps/s $\color{#d91a1a}-1.96\%$
test_lock_stack_nested 0.4475ms 0.3782ms 2.6439 KOps/s 2.6264 KOps/s $\color{#35bf28}+0.66\%$
test_unlock_nested 0.7449ms 0.3616ms 2.7652 KOps/s 2.7903 KOps/s $\color{#d91a1a}-0.90\%$
test_unlock_stack_nested 0.3472ms 0.3149ms 3.1754 KOps/s 3.1208 KOps/s $\color{#35bf28}+1.75\%$
test_flatten_speed 0.1494ms 69.4767μs 14.3933 KOps/s 14.4401 KOps/s $\color{#d91a1a}-0.32\%$
test_unflatten_speed 0.3105ms 0.2834ms 3.5285 KOps/s 3.5224 KOps/s $\color{#35bf28}+0.18\%$
test_common_ops 1.4815ms 1.2186ms 820.5976 Ops/s 785.4985 Ops/s $\color{#35bf28}+4.47\%$
test_creation 21.6410μs 1.4869μs 672.5255 KOps/s 675.4936 KOps/s $\color{#d91a1a}-0.44\%$
test_creation_empty 39.6610μs 13.6534μs 73.2417 KOps/s 60.3764 KOps/s $\textbf{\color{#35bf28}+21.31\%}$
test_creation_nested_1 38.3910μs 15.1468μs 66.0205 KOps/s 54.1476 KOps/s $\textbf{\color{#35bf28}+21.93\%}$
test_creation_nested_2 42.5110μs 18.2814μs 54.7003 KOps/s 48.2714 KOps/s $\textbf{\color{#35bf28}+13.32\%}$
test_clone 70.0610μs 29.2702μs 34.1644 KOps/s 35.2504 KOps/s $\color{#d91a1a}-3.08\%$
test_getitem[int] 1.0613ms 15.6211μs 64.0162 KOps/s 61.3844 KOps/s $\color{#35bf28}+4.29\%$
test_getitem[slice_int] 0.1204ms 27.8806μs 35.8673 KOps/s 35.4367 KOps/s $\color{#35bf28}+1.22\%$
test_getitem[range] 0.2354ms 0.1141ms 8.7617 KOps/s 8.9170 KOps/s $\color{#d91a1a}-1.74\%$
test_getitem[tuple] 0.1178ms 24.0346μs 41.6067 KOps/s 41.1852 KOps/s $\color{#35bf28}+1.02\%$
test_getitem[list] 0.2238ms 0.1028ms 9.7296 KOps/s 9.8603 KOps/s $\color{#d91a1a}-1.33\%$
test_setitem_dim[int] 68.4110μs 46.3984μs 21.5525 KOps/s 21.8736 KOps/s $\color{#d91a1a}-1.47\%$
test_setitem_dim[slice_int] 0.1134ms 68.8977μs 14.5143 KOps/s 14.6110 KOps/s $\color{#d91a1a}-0.66\%$
test_setitem_dim[range] 0.1707ms 0.1301ms 7.6841 KOps/s 7.8406 KOps/s $\color{#d91a1a}-2.00\%$
test_setitem_dim[tuple] 98.3410μs 62.5407μs 15.9896 KOps/s 16.3203 KOps/s $\color{#d91a1a}-2.03\%$
test_setitem 73.2410μs 40.9359μs 24.4284 KOps/s 23.8431 KOps/s $\color{#35bf28}+2.45\%$
test_set 84.1120μs 39.7952μs 25.1287 KOps/s 24.5357 KOps/s $\color{#35bf28}+2.42\%$
test_set_shared 0.3609ms 51.0755μs 19.5789 KOps/s 19.6667 KOps/s $\color{#d91a1a}-0.45\%$
test_update 99.7920μs 47.3737μs 21.1088 KOps/s 19.6561 KOps/s $\textbf{\color{#35bf28}+7.39\%}$
test_update_nested 90.7910μs 54.4682μs 18.3593 KOps/s 17.6470 KOps/s $\color{#35bf28}+4.04\%$
test_update__nested 97.0410μs 60.0660μs 16.6483 KOps/s 17.3578 KOps/s $\color{#d91a1a}-4.09\%$
test_set_nested 80.2910μs 42.5190μs 23.5189 KOps/s 23.0945 KOps/s $\color{#35bf28}+1.84\%$
test_set_nested_new 83.8920μs 45.9778μs 21.7496 KOps/s 20.9723 KOps/s $\color{#35bf28}+3.71\%$
test_select 92.7020μs 59.1188μs 16.9151 KOps/s 16.3595 KOps/s $\color{#35bf28}+3.40\%$
test_select_nested 76.3620μs 41.6375μs 24.0168 KOps/s 23.8918 KOps/s $\color{#35bf28}+0.52\%$
test_exclude_nested 89.0110μs 59.4025μs 16.8343 KOps/s 16.8898 KOps/s $\color{#d91a1a}-0.33\%$
test_empty[True] 0.2841ms 0.2513ms 3.9801 KOps/s 4.0425 KOps/s $\color{#d91a1a}-1.54\%$
test_empty[False] 3.7170μs 0.7481μs 1.3368 MOps/s 1.3510 MOps/s $\color{#d91a1a}-1.05\%$
test_to 55.1710μs 25.3588μs 39.4340 KOps/s 39.4598 KOps/s $\color{#d91a1a}-0.07\%$
test_to_nonblocking 59.7510μs 24.7042μs 40.4790 KOps/s 41.7113 KOps/s $\color{#d91a1a}-2.95\%$
test_unbind_speed 0.9543ms 0.2801ms 3.5697 KOps/s 3.5671 KOps/s $\color{#35bf28}+0.07\%$
test_unbind_speed_stack0 0.3310ms 0.2784ms 3.5915 KOps/s 3.6189 KOps/s $\color{#d91a1a}-0.76\%$
test_unbind_speed_stack1 91.9917ms 0.7048ms 1.4188 KOps/s 1.4271 KOps/s $\color{#d91a1a}-0.58\%$
test_split 93.1959ms 2.1942ms 455.7378 Ops/s 468.3019 Ops/s $\color{#d91a1a}-2.68\%$
test_chunk 95.5748ms 2.1890ms 456.8326 Ops/s 470.0994 Ops/s $\color{#d91a1a}-2.82\%$
test_creation[device0] 0.2640ms 0.1287ms 7.7719 KOps/s 7.8599 KOps/s $\color{#d91a1a}-1.12\%$
test_creation_from_tensor 0.3498ms 0.1347ms 7.4261 KOps/s 7.7436 KOps/s $\color{#d91a1a}-4.10\%$
test_add_one[memmap_tensor0] 0.2379ms 8.9929μs 111.1982 KOps/s 111.9800 KOps/s $\color{#d91a1a}-0.70\%$
test_contiguous[memmap_tensor0] 28.5500μs 2.2684μs 440.8461 KOps/s 445.8087 KOps/s $\color{#d91a1a}-1.11\%$
test_stack[memmap_tensor0] 33.7610μs 6.9545μs 143.7921 KOps/s 147.8469 KOps/s $\color{#d91a1a}-2.74\%$
test_memmaptd_index 1.0766ms 0.4229ms 2.3646 KOps/s 2.3589 KOps/s $\color{#35bf28}+0.24\%$
test_memmaptd_index_astensor 0.9079ms 0.4825ms 2.0728 KOps/s 2.0963 KOps/s $\color{#d91a1a}-1.12\%$
test_memmaptd_index_op 1.4050ms 0.9935ms 1.0066 KOps/s 965.2885 Ops/s $\color{#35bf28}+4.28\%$
test_serialize_model 0.1306s 0.1290s 7.7532 Ops/s 7.7002 Ops/s $\color{#35bf28}+0.69\%$
test_serialize_model_pickle 1.2197s 1.2080s 0.8278 Ops/s 0.7828 Ops/s $\textbf{\color{#35bf28}+5.75\%}$
test_serialize_weights 0.1292s 0.1286s 7.7764 Ops/s 7.7585 Ops/s $\color{#35bf28}+0.23\%$
test_serialize_weights_returnearly 51.8635ms 47.1438ms 21.2117 Ops/s 20.8455 Ops/s $\color{#35bf28}+1.76\%$
test_serialize_weights_pickle 1.2135s 1.1933s 0.8380 Ops/s 0.7370 Ops/s $\textbf{\color{#35bf28}+13.71\%}$
test_reshape_pytree 64.1110μs 35.7241μs 27.9923 KOps/s 27.4429 KOps/s $\color{#35bf28}+2.00\%$
test_reshape_td 88.8820μs 42.2071μs 23.6927 KOps/s 23.2499 KOps/s $\color{#35bf28}+1.90\%$
test_view_pytree 70.9520μs 35.6973μs 28.0134 KOps/s 27.7684 KOps/s $\color{#35bf28}+0.88\%$
test_view_td 72.9620μs 45.2580μs 22.0956 KOps/s 20.8465 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_unbind_pytree 60.5710μs 34.4776μs 29.0043 KOps/s 28.6431 KOps/s $\color{#35bf28}+1.26\%$
test_unbind_td 0.4545ms 43.3455μs 23.0704 KOps/s 23.0177 KOps/s $\color{#35bf28}+0.23\%$
test_split_pytree 77.8220μs 46.0786μs 21.7020 KOps/s 20.6820 KOps/s $\color{#35bf28}+4.93\%$
test_split_td 0.1798ms 56.1793μs 17.8002 KOps/s 18.0870 KOps/s $\color{#d91a1a}-1.59\%$
test_add_pytree 0.1028ms 56.9213μs 17.5681 KOps/s 16.9421 KOps/s $\color{#35bf28}+3.69\%$
test_add_td 0.1427ms 89.4225μs 11.1829 KOps/s 10.8120 KOps/s $\color{#35bf28}+3.43\%$
test_compile_add_one_nested[tensordict-compile] 0.4105ms 0.2121ms 4.7142 KOps/s 4.6663 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_one_nested[tensordict-eager] 0.2122ms 0.1517ms 6.5941 KOps/s 6.7588 KOps/s $\color{#d91a1a}-2.44\%$
test_compile_add_one_nested[pytree-compile] 0.2062ms 0.1456ms 6.8671 KOps/s 6.8301 KOps/s $\color{#35bf28}+0.54\%$
test_compile_add_one_nested[pytree-eager] 0.2461ms 0.1839ms 5.4388 KOps/s 5.4811 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_copy_nested[tensordict-compile] 57.0710μs 21.3376μs 46.8656 KOps/s 47.2832 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_copy_nested[tensordict-eager] 79.9520μs 44.0277μs 22.7130 KOps/s 23.1583 KOps/s $\color{#d91a1a}-1.92\%$
test_compile_copy_nested[pytree-compile] 0.1049ms 63.6012μs 15.7230 KOps/s 15.6966 KOps/s $\color{#35bf28}+0.17\%$
test_compile_copy_nested[pytree-eager] 86.2520μs 49.3241μs 20.2741 KOps/s 20.3423 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_one_flat[tensordict-compile] 0.3735ms 0.3220ms 3.1058 KOps/s 3.1279 KOps/s $\color{#d91a1a}-0.71\%$
test_compile_add_one_flat[tensordict-eager] 0.2758ms 0.2067ms 4.8380 KOps/s 4.8264 KOps/s $\color{#35bf28}+0.24\%$
test_compile_add_one_flat[tensorclass-compile] 0.1763ms 0.1291ms 7.7439 KOps/s 7.8826 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_add_one_flat[tensorclass-eager] 0.1111ms 60.8306μs 16.4391 KOps/s 16.5150 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_one_flat[pytree-compile] 0.4261ms 0.3209ms 3.1161 KOps/s 3.1398 KOps/s $\color{#d91a1a}-0.75\%$
test_compile_add_one_flat[pytree-eager] 0.9959ms 0.6133ms 1.6305 KOps/s 1.6261 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_self_flat[tensordict-eager] 0.6403ms 0.2470ms 4.0481 KOps/s 4.0625 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_add_self_flat[tensordict-compile] 0.3856ms 0.3234ms 3.0923 KOps/s 3.1205 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_add_self_flat[tensorclass-eager] 0.4823ms 73.8923μs 13.5332 KOps/s 14.4917 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2119ms 0.1330ms 7.5204 KOps/s 7.7906 KOps/s $\color{#d91a1a}-3.47\%$
test_compile_add_self_flat[pytree-eager] 0.9263ms 0.5341ms 1.8723 KOps/s 1.8987 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_add_self_flat[pytree-compile] 0.3794ms 0.3193ms 3.1322 KOps/s 3.1321 KOps/s $+0.00\%$
test_compile_copy_flat[tensordict-compile] 0.4252ms 19.6137μs 50.9849 KOps/s 55.2682 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_compile_copy_flat[tensordict-eager] 59.3310μs 27.3425μs 36.5730 KOps/s 37.8277 KOps/s $\color{#d91a1a}-3.32\%$
test_compile_copy_flat[pytree-compile] 0.1248ms 68.8249μs 14.5296 KOps/s 14.5505 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_copy_flat[pytree-eager] 0.1040ms 51.0750μs 19.5790 KOps/s 19.5361 KOps/s $\color{#35bf28}+0.22\%$
test_compile_assign_and_add[tensordict-compile] 2.3187ms 0.8131ms 1.2299 KOps/s 1.1254 KOps/s $\textbf{\color{#35bf28}+9.29\%}$
test_compile_assign_and_add[tensordict-eager] 3.3393ms 3.1889ms 313.5858 Ops/s 311.2700 Ops/s $\color{#35bf28}+0.74\%$
test_compile_assign_and_add[pytree-compile] 2.2715ms 0.8040ms 1.2438 KOps/s 1.1329 KOps/s $\textbf{\color{#35bf28}+9.79\%}$
test_compile_assign_and_add[pytree-eager] 3.5146ms 3.3484ms 298.6487 Ops/s 310.3878 Ops/s $\color{#d91a1a}-3.78\%$
test_compile_indexing[tensor-tensordict-compile] 0.1622ms 0.1108ms 9.0256 KOps/s 9.2214 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_indexing[tensor-tensordict-eager] 0.1896ms 65.4716μs 15.2738 KOps/s 15.8268 KOps/s $\color{#d91a1a}-3.49\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1494ms 0.1045ms 9.5734 KOps/s 9.6075 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1127ms 44.6683μs 22.3873 KOps/s 22.6265 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_indexing[tensor-pytree-compile] 0.1556ms 0.1055ms 9.4786 KOps/s 9.5399 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_indexing[tensor-pytree-eager] 0.1049ms 44.1184μs 22.6663 KOps/s 22.6074 KOps/s $\color{#35bf28}+0.26\%$
test_compile_indexing[slice-tensordict-compile] 0.1920ms 0.1390ms 7.1929 KOps/s 7.2696 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_indexing[slice-tensordict-eager] 0.1597ms 24.9617μs 40.0614 KOps/s 40.0030 KOps/s $\color{#35bf28}+0.15\%$
test_compile_indexing[slice-tensorclass-compile] 0.1882ms 0.1329ms 7.5238 KOps/s 7.5870 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[slice-tensorclass-eager] 59.9910μs 21.0353μs 47.5390 KOps/s 47.8252 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_indexing[slice-pytree-compile] 0.2055ms 0.1333ms 7.5016 KOps/s 7.5073 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[slice-pytree-eager] 57.1210μs 20.3255μs 49.1993 KOps/s 48.5426 KOps/s $\color{#35bf28}+1.35\%$
test_compile_indexing[int-tensordict-compile] 0.2046ms 0.1403ms 7.1292 KOps/s 7.2475 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_indexing[int-tensordict-eager] 0.4893ms 24.7192μs 40.4544 KOps/s 39.1590 KOps/s $\color{#35bf28}+3.31\%$
test_compile_indexing[int-tensorclass-compile] 0.2078ms 0.1341ms 7.4583 KOps/s 7.5175 KOps/s $\color{#d91a1a}-0.79\%$
test_compile_indexing[int-tensorclass-eager] 0.1232ms 22.3764μs 44.6900 KOps/s 48.4668 KOps/s $\textbf{\color{#d91a1a}-7.79\%}$
test_compile_indexing[int-pytree-compile] 0.1880ms 0.1334ms 7.4971 KOps/s 7.5348 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_indexing[int-pytree-eager] 55.4210μs 20.5538μs 48.6527 KOps/s 48.2155 KOps/s $\color{#35bf28}+0.91\%$
test_mod_add[eager] 76.5010μs 30.2609μs 33.0460 KOps/s 28.7708 KOps/s $\textbf{\color{#35bf28}+14.86\%}$
test_mod_add[compile] 0.3775ms 69.9712μs 14.2916 KOps/s 14.0211 KOps/s $\color{#35bf28}+1.93\%$
test_mod_add[compile-overhead] 0.2665ms 0.1384ms 7.2255 KOps/s 7.0691 KOps/s $\color{#35bf28}+2.21\%$
test_mod_wrap[eager] 0.3221ms 0.2586ms 3.8666 KOps/s 3.9637 KOps/s $\color{#d91a1a}-2.45\%$
test_mod_wrap[compile] 1.4991ms 0.2973ms 3.3634 KOps/s 3.3098 KOps/s $\color{#35bf28}+1.62\%$
test_mod_wrap[compile-overhead] 7.8187ms 4.1272ms 242.2955 Ops/s 244.8052 Ops/s $\color{#d91a1a}-1.03\%$
test_mod_wrap_and_backward[eager] 1.4662ms 1.3380ms 747.3856 Ops/s 689.4137 Ops/s $\textbf{\color{#35bf28}+8.41\%}$
test_mod_wrap_and_backward[compile] 1.5752ms 1.3261ms 754.0932 Ops/s 695.3781 Ops/s $\textbf{\color{#35bf28}+8.44\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3318ms 0.9057ms 1.1041 KOps/s 989.1854 Ops/s $\textbf{\color{#35bf28}+11.62\%}$
test_seq_add[eager] 0.1714ms 98.1153μs 10.1921 KOps/s 9.5408 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_seq_add[compile] 0.1356ms 85.3907μs 11.7109 KOps/s 12.1661 KOps/s $\color{#d91a1a}-3.74\%$
test_seq_add[compile-overhead] 0.1645ms 0.1167ms 8.5689 KOps/s 8.6865 KOps/s $\color{#d91a1a}-1.35\%$
test_seq_wrap[eager] 0.4641ms 0.3963ms 2.5236 KOps/s 2.5166 KOps/s $\color{#35bf28}+0.28\%$
test_seq_wrap[compile] 0.3876ms 0.3300ms 3.0306 KOps/s 3.1271 KOps/s $\color{#d91a1a}-3.09\%$
test_seq_wrap[compile-overhead] 0.2818ms 0.2317ms 4.3161 KOps/s 4.4588 KOps/s $\color{#d91a1a}-3.20\%$
test_func_call_runtime[False-eager] 0.8794ms 0.7914ms 1.2636 KOps/s 1.3232 KOps/s $\color{#d91a1a}-4.51\%$
test_func_call_runtime[False-compile] 0.8961ms 0.8440ms 1.1848 KOps/s 1.2335 KOps/s $\color{#d91a1a}-3.95\%$
test_func_call_runtime[False-compile-overhead] 0.4530ms 0.3653ms 2.7375 KOps/s 2.7428 KOps/s $\color{#d91a1a}-0.19\%$
test_func_call_runtime[True-eager] 1.0107ms 0.9358ms 1.0686 KOps/s 1.0790 KOps/s $\color{#d91a1a}-0.97\%$
test_func_call_runtime[True-compile] 0.9353ms 0.8749ms 1.1430 KOps/s 1.1784 KOps/s $\color{#d91a1a}-3.00\%$
test_func_call_runtime[True-compile-overhead] 0.4639ms 0.4031ms 2.4808 KOps/s 2.5193 KOps/s $\color{#d91a1a}-1.53\%$
test_func_call_cm_runtime[False-eager] 0.8555ms 0.7929ms 1.2611 KOps/s 1.3294 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_func_call_cm_runtime[False-compile] 0.9186ms 0.8346ms 1.1982 KOps/s 1.2288 KOps/s $\color{#d91a1a}-2.49\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4285ms 0.3662ms 2.7309 KOps/s 2.7322 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_cm_runtime[True-eager] 1.1400ms 1.0153ms 984.9738 Ops/s 980.8340 Ops/s $\color{#35bf28}+0.42\%$
test_func_call_cm_runtime[True-compile] 0.9353ms 0.8656ms 1.1553 KOps/s 1.1489 KOps/s $\color{#35bf28}+0.56\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4707ms 0.4231ms 2.3636 KOps/s 2.3441 KOps/s $\color{#35bf28}+0.83\%$
test_vmap_func_call_cm_runtime[eager] 2.5275ms 2.0785ms 481.1196 Ops/s 479.1466 Ops/s $\color{#35bf28}+0.41\%$
test_vmap_func_call_cm_runtime[compile] 0.9294ms 0.8749ms 1.1430 KOps/s 1.1347 KOps/s $\color{#35bf28}+0.73\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4943ms 0.4305ms 2.3229 KOps/s 2.3303 KOps/s $\color{#d91a1a}-0.32\%$
test_distributed 0.6880ms 0.1588ms 6.2983 KOps/s 8.8879 KOps/s $\textbf{\color{#d91a1a}-29.14\%}$
test_tdmodule 61.5410μs 13.6496μs 73.2623 KOps/s 64.9524 KOps/s $\textbf{\color{#35bf28}+12.79\%}$
test_tdmodule_dispatch 50.0910μs 26.7531μs 37.3789 KOps/s 33.2347 KOps/s $\textbf{\color{#35bf28}+12.47\%}$
test_tdseq 34.6610μs 14.5487μs 68.7347 KOps/s 61.2301 KOps/s $\textbf{\color{#35bf28}+12.26\%}$
test_tdseq_dispatch 51.8510μs 29.4498μs 33.9561 KOps/s 30.3883 KOps/s $\textbf{\color{#35bf28}+11.74\%}$
test_instantiation_functorch 2.0322ms 1.8797ms 532.0027 Ops/s 529.8369 Ops/s $\color{#35bf28}+0.41\%$
test_instantiation_td 1.7928ms 1.1997ms 833.5731 Ops/s 826.4731 Ops/s $\color{#35bf28}+0.86\%$
test_exec_functorch 0.2434ms 0.2112ms 4.7348 KOps/s 4.6948 KOps/s $\color{#35bf28}+0.85\%$
test_exec_functional_call 0.2612ms 0.2135ms 4.6841 KOps/s 4.5742 KOps/s $\color{#35bf28}+2.40\%$
test_exec_td 0.2936ms 0.2173ms 4.6022 KOps/s 4.2462 KOps/s $\textbf{\color{#35bf28}+8.38\%}$
test_exec_td_decorator 0.9117ms 0.2614ms 3.8252 KOps/s 3.6028 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_vmap_mlp_speed[True-True] 0.7747ms 0.6798ms 1.4709 KOps/s 1.4283 KOps/s $\color{#35bf28}+2.98\%$
test_vmap_mlp_speed[True-False] 0.7866ms 0.6797ms 1.4713 KOps/s 1.4462 KOps/s $\color{#35bf28}+1.74\%$
test_vmap_mlp_speed[False-True] 0.7133ms 0.5728ms 1.7458 KOps/s 1.6717 KOps/s $\color{#35bf28}+4.43\%$
test_vmap_mlp_speed[False-False] 0.6189ms 0.5734ms 1.7441 KOps/s 1.6446 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_vmap_mlp_speed_decorator[True-True] 1.3042ms 0.6653ms 1.5031 KOps/s 1.4581 KOps/s $\color{#35bf28}+3.09\%$
test_vmap_mlp_speed_decorator[True-False] 0.8094ms 0.6669ms 1.4994 KOps/s 1.4759 KOps/s $\color{#35bf28}+1.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.6886ms 0.5862ms 1.7058 KOps/s 1.6411 KOps/s $\color{#35bf28}+3.94\%$
test_vmap_mlp_speed_decorator[False-False] 0.7339ms 0.5878ms 1.7013 KOps/s 1.6892 KOps/s $\color{#35bf28}+0.72\%$
test_vmap_transformer_speed[True-True] 8.3802ms 8.2639ms 121.0087 Ops/s 118.9130 Ops/s $\color{#35bf28}+1.76\%$
test_vmap_transformer_speed[True-False] 8.6515ms 8.3308ms 120.0368 Ops/s 118.6528 Ops/s $\color{#35bf28}+1.17\%$
test_vmap_transformer_speed[False-True] 8.1521ms 8.0742ms 123.8506 Ops/s 121.2721 Ops/s $\color{#35bf28}+2.13\%$
test_vmap_transformer_speed[False-False] 8.1612ms 8.0775ms 123.8011 Ops/s 121.5116 Ops/s $\color{#35bf28}+1.88\%$
test_vmap_transformer_speed_decorator[True-True] 19.5565ms 19.4210ms 51.4906 Ops/s 50.6293 Ops/s $\color{#35bf28}+1.70\%$
test_vmap_transformer_speed_decorator[True-False] 19.9809ms 19.4029ms 51.5388 Ops/s 50.9368 Ops/s $\color{#35bf28}+1.18\%$
test_vmap_transformer_speed_decorator[False-True] 20.6143ms 19.2919ms 51.8353 Ops/s 51.2857 Ops/s $\color{#35bf28}+1.07\%$
test_vmap_transformer_speed_decorator[False-False] 19.2935ms 19.2156ms 52.0409 Ops/s 51.3247 Ops/s $\color{#35bf28}+1.40\%$
test_to_module_speed[True] 1.2256ms 0.9486ms 1.0542 KOps/s 1.0477 KOps/s $\color{#35bf28}+0.62\%$
test_to_module_speed[False] 1.3372ms 0.9209ms 1.0859 KOps/s 1.0794 KOps/s $\color{#35bf28}+0.60\%$
test_tc_init 52.6410μs 32.0424μs 31.2087 KOps/s 27.5615 KOps/s $\textbf{\color{#35bf28}+13.23\%}$
test_tc_init_nested 98.2320μs 61.8736μs 16.1620 KOps/s 13.8124 KOps/s $\textbf{\color{#35bf28}+17.01\%}$
test_tc_first_layer_tensor 4.0759μs 0.6814μs 1.4675 MOps/s 1.5033 MOps/s $\color{#d91a1a}-2.38\%$
test_tc_first_layer_nontensor 85.4610μs 2.2566μs 443.1425 KOps/s 444.7666 KOps/s $\color{#d91a1a}-0.37\%$
test_tc_second_layer_tensor 31.6880μs 1.3718μs 728.9920 KOps/s 724.4455 KOps/s $\color{#35bf28}+0.63\%$
test_tc_second_layer_nontensor 21.9400μs 2.9580μs 338.0607 KOps/s 334.5873 KOps/s $\color{#35bf28}+1.04\%$
test_unbind 0.1937s 12.2616ms 81.5557 Ops/s 92.7401 Ops/s $\textbf{\color{#d91a1a}-12.06\%}$
test_full_like 0.6554ms 0.5738ms 1.7427 KOps/s 1.7389 KOps/s $\color{#35bf28}+0.22\%$
test_zeros_like 0.2645ms 0.1980ms 5.0509 KOps/s 5.0544 KOps/s $\color{#d91a1a}-0.07\%$
test_ones_like 0.2384ms 0.1978ms 5.0557 KOps/s 5.0589 KOps/s $\color{#d91a1a}-0.06\%$
test_clone 0.4427ms 0.4145ms 2.4123 KOps/s 2.4146 KOps/s $\color{#d91a1a}-0.09\%$
test_squeeze 36.9600μs 9.7728μs 102.3243 KOps/s 99.9040 KOps/s $\color{#35bf28}+2.42\%$
test_unsqueeze 0.2893ms 75.1522μs 13.3063 KOps/s 13.3618 KOps/s $\color{#d91a1a}-0.42\%$
test_split 0.2597ms 0.1600ms 6.2482 KOps/s 6.2008 KOps/s $\color{#35bf28}+0.76\%$
test_permute 0.2230ms 0.1788ms 5.5937 KOps/s 5.3665 KOps/s $\color{#35bf28}+4.23\%$
test_stack 1.2533ms 0.8771ms 1.1401 KOps/s 1.1719 KOps/s $\color{#d91a1a}-2.71\%$
test_cat 1.2546ms 1.2312ms 812.2141 Ops/s 812.1163 Ops/s $\color{#35bf28}+0.01\%$

@vmoens vmoens merged commit 2061034 into gh/vmoens/18/base Sep 20, 2024
43 of 48 checks passed
vmoens added a commit that referenced this pull request Sep 20, 2024
ghstack-source-id: d4117bb329425145bc781b457b8de43a67a53732
Pull Request resolved: #1002
@vmoens vmoens deleted the gh/vmoens/18/head branch September 20, 2024 03:48