[BugFix] Fix displaying of tensor sizes in buffers #2456

Merged: 1 commit, Sep 26, 2024

@vmoens vmoens (Contributor) commented Sep 26, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 511609a169996b680dccbd272e3d5b5710618558
Pull Request resolved: #2456

pytorch-bot bot commented Sep 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2456

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures, 14 Unrelated Failures

As of commit 2b10d09 with merge base ca3a595:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the "CLA Signed" label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Sep 26, 2024
@vmoens vmoens merged commit 2b10d09 into gh/vmoens/31/base Sep 26, 2024
35 of 53 checks passed
@vmoens vmoens deleted the gh/vmoens/31/head branch September 26, 2024 14:29

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 146. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 73.4348ms 61.6314ms 16.2255 Ops/s 16.4999 Ops/s $\color{#d91a1a}-1.66\%$
test_sync 38.9631ms 33.4299ms 29.9133 Ops/s 27.0776 Ops/s $\textbf{\color{#35bf28}+10.47\%}$
test_async 62.0133ms 32.1626ms 31.0920 Ops/s 30.8971 Ops/s $\color{#35bf28}+0.63\%$
test_simple 0.5037s 0.4288s 2.3319 Ops/s 2.4492 Ops/s $\color{#d91a1a}-4.79\%$
test_transformed 0.5790s 0.5774s 1.7320 Ops/s 1.6780 Ops/s $\color{#35bf28}+3.22\%$
test_serial 1.2860s 1.2848s 0.7783 Ops/s 0.7867 Ops/s $\color{#d91a1a}-1.06\%$
test_parallel 1.2393s 1.1589s 0.8629 Ops/s 0.8801 Ops/s $\color{#d91a1a}-1.96\%$
test_step_mdp_speed[True-True-True-True-True] 0.2973ms 27.4295μs 36.4571 KOps/s 36.7164 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[True-True-True-True-False] 44.1430μs 16.0545μs 62.2879 KOps/s 61.8351 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[True-True-True-False-True] 70.6560μs 15.6346μs 63.9607 KOps/s 63.8204 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[True-True-True-False-False] 34.0940μs 9.2716μs 107.8567 KOps/s 108.9811 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[True-True-False-True-True] 91.5390μs 29.1899μs 34.2584 KOps/s 34.5989 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-True-False-True-False] 77.3550μs 17.9064μs 55.8459 KOps/s 56.1530 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-True-False-False-True] 42.5000μs 17.5443μs 56.9985 KOps/s 57.9096 KOps/s $\color{#d91a1a}-1.57\%$
test_step_mdp_speed[True-True-False-False-False] 66.4340μs 10.9131μs 91.6331 KOps/s 91.8315 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-False-True-True-True] 84.5990μs 31.0057μs 32.2522 KOps/s 32.1252 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[True-False-True-True-False] 50.5150μs 19.5226μs 51.2226 KOps/s 51.7621 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[True-False-True-False-True] 44.1230μs 17.2260μs 58.0519 KOps/s 57.4523 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[True-False-True-False-False] 44.3240μs 10.8140μs 92.4730 KOps/s 90.8503 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[True-False-False-True-True] 95.4320μs 31.9929μs 31.2570 KOps/s 30.5193 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[True-False-False-True-False] 51.2460μs 21.0131μs 47.5894 KOps/s 47.4196 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-False-False-False-True] 47.6090μs 18.8873μs 52.9457 KOps/s 52.9375 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-False-False-False-False] 36.0580μs 12.3903μs 80.7086 KOps/s 79.6534 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[False-True-True-True-True] 71.3950μs 30.9897μs 32.2688 KOps/s 32.2218 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-True-True-False] 63.1690μs 19.3162μs 51.7700 KOps/s 50.9348 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[False-True-True-False-True] 50.3650μs 19.9057μs 50.2368 KOps/s 50.1344 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-True-True-False-False] 40.8470μs 12.0594μs 82.9229 KOps/s 82.1180 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-True-False-True-True] 66.5850μs 32.7320μs 30.5511 KOps/s 30.9556 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-True-False-True-False] 57.7790μs 20.7891μs 48.1022 KOps/s 47.7908 KOps/s $\color{#35bf28}+0.65\%$
test_step_mdp_speed[False-True-False-False-True] 3.0402ms 21.7573μs 45.9617 KOps/s 46.9936 KOps/s $\color{#d91a1a}-2.20\%$
test_step_mdp_speed[False-True-False-False-False] 41.8390μs 13.6800μs 73.0992 KOps/s 73.8325 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[False-False-True-True-True] 69.6610μs 34.3740μs 29.0917 KOps/s 29.6147 KOps/s $\color{#d91a1a}-1.77\%$
test_step_mdp_speed[False-False-True-True-False] 64.3310μs 22.7783μs 43.9014 KOps/s 44.2247 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-False-True-False-True] 63.0490μs 21.3534μs 46.8309 KOps/s 47.9885 KOps/s $\color{#d91a1a}-2.41\%$
test_step_mdp_speed[False-False-True-False-False] 42.1500μs 13.7944μs 72.4933 KOps/s 73.9063 KOps/s $\color{#d91a1a}-1.91\%$
test_step_mdp_speed[False-False-False-True-True] 70.8630μs 34.9623μs 28.6022 KOps/s 28.1211 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-False-True-False] 65.6330μs 23.7617μs 42.0846 KOps/s 42.2443 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-False-False-False-True] 64.6820μs 22.8768μs 43.7124 KOps/s 44.4109 KOps/s $\color{#d91a1a}-1.57\%$
test_step_mdp_speed[False-False-False-False-False] 67.5470μs 15.0969μs 66.2388 KOps/s 66.8686 KOps/s $\color{#d91a1a}-0.94\%$
test_values[generalized_advantage_estimate-True-True] 10.1036ms 9.6905ms 103.1938 Ops/s 104.0724 Ops/s $\color{#d91a1a}-0.84\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.3282ms 33.6331ms 29.7326 Ops/s 29.8742 Ops/s $\color{#d91a1a}-0.47\%$
test_values[td0_return_estimate-False-False] 0.2428ms 0.1921ms 5.2058 KOps/s 5.5864 KOps/s $\textbf{\color{#d91a1a}-6.81\%}$
test_values[td1_return_estimate-False-False] 24.5470ms 24.0417ms 41.5944 Ops/s 41.8562 Ops/s $\color{#d91a1a}-0.63\%$
test_values[vec_td1_return_estimate-False-False] 37.6121ms 35.3151ms 28.3165 Ops/s 29.8205 Ops/s $\textbf{\color{#d91a1a}-5.04\%}$
test_values[td_lambda_return_estimate-True-False] 37.9502ms 34.3038ms 29.1513 Ops/s 28.7882 Ops/s $\color{#35bf28}+1.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 38.2201ms 35.0403ms 28.5386 Ops/s 29.7712 Ops/s $\color{#d91a1a}-4.14\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.7983ms 8.3187ms 120.2114 Ops/s 122.2178 Ops/s $\color{#d91a1a}-1.64\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5588ms 2.0280ms 493.0865 Ops/s 552.5772 Ops/s $\textbf{\color{#d91a1a}-10.77\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6181ms 0.3673ms 2.7225 KOps/s 2.8062 KOps/s $\color{#d91a1a}-2.98\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 51.4551ms 45.7887ms 21.8394 Ops/s 24.7995 Ops/s $\textbf{\color{#d91a1a}-11.94\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.1362ms 3.0298ms 330.0567 Ops/s 330.0048 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[False-None] 6.6856ms 1.3637ms 733.2975 Ops/s 746.7841 Ops/s $\color{#d91a1a}-1.81\%$
test_dqn_speed[False-backward] 2.3629ms 1.9100ms 523.5585 Ops/s 547.4979 Ops/s $\color{#d91a1a}-4.37\%$
test_dqn_speed[True-None] 0.7282ms 0.4641ms 2.1548 KOps/s 2.1372 KOps/s $\color{#35bf28}+0.82\%$
test_dqn_speed[True-backward] 0.9793ms 0.9072ms 1.1023 KOps/s 1.1377 KOps/s $\color{#d91a1a}-3.11\%$
test_dqn_speed[reduce-overhead-None] 0.6750ms 0.4730ms 2.1143 KOps/s 2.1759 KOps/s $\color{#d91a1a}-2.83\%$
test_dqn_speed[reduce-overhead-backward] 0.9428ms 0.8863ms 1.1282 KOps/s 1.1226 KOps/s $\color{#35bf28}+0.50\%$
test_ddpg_speed[False-None] 5.6258ms 2.9700ms 336.6988 Ops/s 361.5033 Ops/s $\textbf{\color{#d91a1a}-6.86\%}$
test_ddpg_speed[False-backward] 4.2949ms 4.0119ms 249.2590 Ops/s 254.4311 Ops/s $\color{#d91a1a}-2.03\%$
test_ddpg_speed[True-None] 2.1124ms 1.0163ms 983.9400 Ops/s 998.3676 Ops/s $\color{#d91a1a}-1.45\%$
test_ddpg_speed[True-backward] 1.9766ms 1.9106ms 523.3865 Ops/s 503.7244 Ops/s $\color{#35bf28}+3.90\%$
test_ddpg_speed[reduce-overhead-None] 1.1332ms 1.0137ms 986.4636 Ops/s 987.8335 Ops/s $\color{#d91a1a}-0.14\%$
test_ddpg_speed[reduce-overhead-backward] 2.1296ms 1.9621ms 509.6474 Ops/s 535.6258 Ops/s $\color{#d91a1a}-4.85\%$
test_sac_speed[False-None] 10.2812ms 8.4145ms 118.8429 Ops/s 100.3925 Ops/s $\textbf{\color{#35bf28}+18.38\%}$
test_sac_speed[False-backward] 12.7879ms 11.4939ms 87.0026 Ops/s 94.1632 Ops/s $\textbf{\color{#d91a1a}-7.60\%}$
test_sac_speed[True-None] 2.4531ms 1.8612ms 537.2751 Ops/s 530.8548 Ops/s $\color{#35bf28}+1.21\%$
test_sac_speed[True-backward] 3.6655ms 3.5522ms 281.5149 Ops/s 282.2977 Ops/s $\color{#d91a1a}-0.28\%$
test_sac_speed[reduce-overhead-None] 2.0967ms 1.8371ms 544.3233 Ops/s 528.3341 Ops/s $\color{#35bf28}+3.03\%$
test_sac_speed[reduce-overhead-backward] 3.5614ms 3.5013ms 285.6062 Ops/s 277.6611 Ops/s $\color{#35bf28}+2.86\%$
test_redq_speed[False-None] 14.4936ms 13.0583ms 76.5795 Ops/s 77.5801 Ops/s $\color{#d91a1a}-1.29\%$
test_redq_speed[False-backward] 24.3101ms 22.3132ms 44.8165 Ops/s 45.3410 Ops/s $\color{#d91a1a}-1.16\%$
test_redq_speed[True-None] 5.1990ms 4.6027ms 217.2640 Ops/s 203.7556 Ops/s $\textbf{\color{#35bf28}+6.63\%}$
test_redq_speed[True-backward] 12.8232ms 12.1040ms 82.6174 Ops/s 81.8945 Ops/s $\color{#35bf28}+0.88\%$
test_redq_speed[reduce-overhead-None] 5.6266ms 4.9974ms 200.1039 Ops/s 209.7415 Ops/s $\color{#d91a1a}-4.59\%$
test_redq_speed[reduce-overhead-backward] 13.9777ms 12.9423ms 77.2661 Ops/s 82.3822 Ops/s $\textbf{\color{#d91a1a}-6.21\%}$
test_redq_deprec_speed[False-None] 14.9349ms 13.6088ms 73.4821 Ops/s 78.3694 Ops/s $\textbf{\color{#d91a1a}-6.24\%}$
test_redq_deprec_speed[False-backward] 20.2232ms 19.0362ms 52.5316 Ops/s 53.8039 Ops/s $\color{#d91a1a}-2.36\%$
test_redq_deprec_speed[True-None] 4.6416ms 3.5714ms 279.9996 Ops/s 276.1056 Ops/s $\color{#35bf28}+1.41\%$
test_redq_deprec_speed[True-backward] 9.0094ms 8.3532ms 119.7143 Ops/s 123.2801 Ops/s $\color{#d91a1a}-2.89\%$
test_redq_deprec_speed[reduce-overhead-None] 4.8780ms 3.7068ms 269.7752 Ops/s 278.6487 Ops/s $\color{#d91a1a}-3.18\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.5171ms 8.0540ms 124.1623 Ops/s 121.5656 Ops/s $\color{#35bf28}+2.14\%$
test_td3_speed[False-None] 8.6255ms 8.0705ms 123.9084 Ops/s 126.5918 Ops/s $\color{#d91a1a}-2.12\%$
test_td3_speed[False-backward] 11.3479ms 10.5977ms 94.3603 Ops/s 97.5412 Ops/s $\color{#d91a1a}-3.26\%$
test_td3_speed[True-None] 2.1701ms 1.9178ms 521.4409 Ops/s 497.4550 Ops/s $\color{#35bf28}+4.82\%$
test_td3_speed[True-backward] 3.8030ms 3.5672ms 280.3286 Ops/s 254.5867 Ops/s $\textbf{\color{#35bf28}+10.11\%}$
test_td3_speed[reduce-overhead-None] 2.1297ms 1.9252ms 519.4309 Ops/s 500.2700 Ops/s $\color{#35bf28}+3.83\%$
test_td3_speed[reduce-overhead-backward] 3.6710ms 3.4975ms 285.9174 Ops/s 280.6622 Ops/s $\color{#35bf28}+1.87\%$
test_cql_speed[False-None] 39.5758ms 36.2774ms 27.5654 Ops/s 27.3848 Ops/s $\color{#35bf28}+0.66\%$
test_cql_speed[False-backward] 49.2279ms 46.5228ms 21.4948 Ops/s 21.4798 Ops/s $\color{#35bf28}+0.07\%$
test_cql_speed[True-None] 17.5884ms 15.6569ms 63.8696 Ops/s 62.0923 Ops/s $\color{#35bf28}+2.86\%$
test_cql_speed[True-backward] 23.7623ms 22.5543ms 44.3375 Ops/s 44.0669 Ops/s $\color{#35bf28}+0.61\%$
test_cql_speed[reduce-overhead-None] 16.9164ms 15.6120ms 64.0531 Ops/s 62.3195 Ops/s $\color{#35bf28}+2.78\%$
test_cql_speed[reduce-overhead-backward] 23.3170ms 21.8891ms 45.6848 Ops/s 44.3212 Ops/s $\color{#35bf28}+3.08\%$
test_a2c_speed[False-None] 7.8842ms 7.2792ms 137.3777 Ops/s 135.1682 Ops/s $\color{#35bf28}+1.63\%$
test_a2c_speed[False-backward] 15.1308ms 14.7219ms 67.9258 Ops/s 67.1363 Ops/s $\color{#35bf28}+1.18\%$
test_a2c_speed[True-None] 3.6990ms 3.3127ms 301.8660 Ops/s 296.1040 Ops/s $\color{#35bf28}+1.95\%$
test_a2c_speed[True-backward] 10.3665ms 9.8353ms 101.6746 Ops/s 101.9252 Ops/s $\color{#d91a1a}-0.25\%$
test_a2c_speed[reduce-overhead-None] 3.6450ms 3.2961ms 303.3877 Ops/s 300.1014 Ops/s $\color{#35bf28}+1.10\%$
test_a2c_speed[reduce-overhead-backward] 10.3778ms 9.7407ms 102.6617 Ops/s 99.0136 Ops/s $\color{#35bf28}+3.68\%$
test_ppo_speed[False-None] 9.1320ms 7.4469ms 134.2846 Ops/s 130.5207 Ops/s $\color{#35bf28}+2.88\%$
test_ppo_speed[False-backward] 15.5924ms 14.6562ms 68.2303 Ops/s 66.2435 Ops/s $\color{#35bf28}+3.00\%$
test_ppo_speed[True-None] 4.4571ms 3.7203ms 268.7944 Ops/s 262.1513 Ops/s $\color{#35bf28}+2.53\%$
test_ppo_speed[True-backward] 10.4586ms 9.8013ms 102.0277 Ops/s 100.5898 Ops/s $\color{#35bf28}+1.43\%$
test_ppo_speed[reduce-overhead-None] 4.4387ms 3.7320ms 267.9497 Ops/s 264.2345 Ops/s $\color{#35bf28}+1.41\%$
test_ppo_speed[reduce-overhead-backward] 10.1685ms 9.6248ms 103.8979 Ops/s 103.2285 Ops/s $\color{#35bf28}+0.65\%$
test_reinforce_speed[False-None] 7.6654ms 6.5120ms 153.5633 Ops/s 152.2413 Ops/s $\color{#35bf28}+0.87\%$
test_reinforce_speed[False-backward] 10.3347ms 9.7997ms 102.0444 Ops/s 101.3394 Ops/s $\color{#35bf28}+0.70\%$
test_reinforce_speed[True-None] 2.9313ms 2.6230ms 381.2472 Ops/s 376.0563 Ops/s $\color{#35bf28}+1.38\%$
test_reinforce_speed[True-backward] 10.2984ms 9.0138ms 110.9413 Ops/s 111.6414 Ops/s $\color{#d91a1a}-0.63\%$
test_reinforce_speed[reduce-overhead-None] 5.1909ms 2.6632ms 375.4897 Ops/s 372.1645 Ops/s $\color{#35bf28}+0.89\%$
test_reinforce_speed[reduce-overhead-backward] 9.4946ms 8.6435ms 115.6933 Ops/s 115.2546 Ops/s $\color{#35bf28}+0.38\%$
test_iql_speed[False-None] 33.7163ms 32.1540ms 31.1003 Ops/s 30.6932 Ops/s $\color{#35bf28}+1.33\%$
test_iql_speed[False-backward] 48.6473ms 45.4093ms 22.0219 Ops/s 22.1100 Ops/s $\color{#d91a1a}-0.40\%$
test_iql_speed[True-None] 14.5170ms 13.6629ms 73.1909 Ops/s 72.8413 Ops/s $\color{#35bf28}+0.48\%$
test_iql_speed[True-backward] 25.0812ms 24.3069ms 41.1405 Ops/s 40.5204 Ops/s $\color{#35bf28}+1.53\%$
test_iql_speed[reduce-overhead-None] 14.3695ms 13.4846ms 74.1585 Ops/s 72.8991 Ops/s $\color{#35bf28}+1.73\%$
test_iql_speed[reduce-overhead-backward] 26.0907ms 24.4427ms 40.9120 Ops/s 40.8013 Ops/s $\color{#35bf28}+0.27\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.2680ms 5.0598ms 197.6352 Ops/s 191.0840 Ops/s $\color{#35bf28}+3.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2136ms 0.4923ms 2.0314 KOps/s 2.0842 KOps/s $\color{#d91a1a}-2.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.4524ms 0.4654ms 2.1489 KOps/s 2.2106 KOps/s $\color{#d91a1a}-2.79\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.6682ms 5.0034ms 199.8661 Ops/s 193.2184 Ops/s $\color{#35bf28}+3.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.7621ms 0.4716ms 2.1206 KOps/s 2.1242 KOps/s $\color{#d91a1a}-0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9495ms 0.4515ms 2.2147 KOps/s 2.2320 KOps/s $\color{#d91a1a}-0.77\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4332ms 1.6734ms 597.5786 Ops/s 600.1601 Ops/s $\color{#d91a1a}-0.43\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7229ms 1.5138ms 660.5947 Ops/s 660.0690 Ops/s $\color{#35bf28}+0.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.0478ms 5.1659ms 193.5755 Ops/s 190.5019 Ops/s $\color{#35bf28}+1.61\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3102ms 0.6161ms 1.6232 KOps/s 686.1544 Ops/s $\textbf{\color{#35bf28}+136.57\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0220ms 0.5935ms 1.6848 KOps/s 1.6937 KOps/s $\color{#d91a1a}-0.52\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1881ms 5.2782ms 189.4588 Ops/s 189.1702 Ops/s $\color{#35bf28}+0.15\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9571ms 0.4762ms 2.0999 KOps/s 2.1286 KOps/s $\color{#d91a1a}-1.35\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8276ms 0.4594ms 2.1769 KOps/s 2.1040 KOps/s $\color{#35bf28}+3.47\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9890ms 5.1229ms 195.2024 Ops/s 188.2019 Ops/s $\color{#35bf28}+3.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6984ms 0.4757ms 2.1022 KOps/s 2.1298 KOps/s $\color{#d91a1a}-1.29\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7941ms 0.4517ms 2.2138 KOps/s 2.2422 KOps/s $\color{#d91a1a}-1.27\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5624ms 5.1750ms 193.2386 Ops/s 186.6539 Ops/s $\color{#35bf28}+3.53\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8660ms 0.6118ms 1.6346 KOps/s 1.6197 KOps/s $\color{#35bf28}+0.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 8.1800ms 0.5958ms 1.6784 KOps/s 1.6557 KOps/s $\color{#35bf28}+1.37\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.2649ms 4.0972ms 244.0695 Ops/s 229.6048 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.8356ms 2.2563ms 443.1983 Ops/s 497.6762 Ops/s $\textbf{\color{#d91a1a}-10.95\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9493ms 1.2566ms 795.7887 Ops/s 735.3753 Ops/s $\textbf{\color{#35bf28}+8.22\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3748s 11.5794ms 86.3606 Ops/s 233.2352 Ops/s $\textbf{\color{#d91a1a}-62.97\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.4613ms 1.9720ms 507.0957 Ops/s 449.6507 Ops/s $\textbf{\color{#35bf28}+12.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 5.4896ms 1.4059ms 711.2951 Ops/s 776.0237 Ops/s $\textbf{\color{#d91a1a}-8.34\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.4962ms 4.3287ms 231.0174 Ops/s 231.3095 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.5620ms 2.5149ms 397.6285 Ops/s 401.4568 Ops/s $\color{#d91a1a}-0.95\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2571ms 1.4419ms 693.5393 Ops/s 625.1193 Ops/s $\textbf{\color{#35bf28}+10.95\%}$
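
For readers who want to check the Change column by hand, the sketch below recomputes it from the Ops and Ops on Repo HEAD cells. It is an illustration, not the benchmark bot's actual code: the function names (`parse_ops`, `percent_change`, `classify`) are hypothetical, and the 5% threshold is an assumption inferred from the table, where the bolded rows all exceed 5% in magnitude and their counts match the "Improved" and "Worsened" totals in the summary line.

```python
# Hypothetical helpers for re-deriving the "Change" column; not the bot's code.

# Multipliers for the two throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.6232 KOps/s' into operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # Values copied from the test_sync and test_single rows above.
    for name, ops_pr, ops_head in [
        ("test_sync", "29.9133 Ops/s", "27.0776 Ops/s"),
        ("test_single", "16.2255 Ops/s", "16.4999 Ops/s"),
    ]:
        change = percent_change(ops_pr, ops_head)
        print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table cells are already rounded, a recomputed value can differ from the reported one in the last digit, especially for rows that mix `KOps/s` and `Ops/s`.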

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}14$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1019s 0.1017s 9.8305 Ops/s 9.8575 Ops/s $\color{#d91a1a}-0.27\%$
test_sync 92.3707ms 89.2536ms 11.2040 Ops/s 11.4260 Ops/s $\color{#d91a1a}-1.94\%$
test_async 0.2603s 86.0393ms 11.6226 Ops/s 11.8806 Ops/s $\color{#d91a1a}-2.17\%$
test_single_pixels 0.1109s 0.1091s 9.1663 Ops/s 9.2363 Ops/s $\color{#d91a1a}-0.76\%$
test_sync_pixels 72.2278ms 71.2905ms 14.0271 Ops/s 13.9436 Ops/s $\color{#35bf28}+0.60\%$
test_async_pixels 0.1268s 66.1428ms 15.1188 Ops/s 14.9434 Ops/s $\color{#35bf28}+1.17\%$
test_simple 0.7277s 0.7242s 1.3808 Ops/s 1.3473 Ops/s $\color{#35bf28}+2.49\%$
test_transformed 0.9452s 0.9428s 1.0607 Ops/s 1.0424 Ops/s $\color{#35bf28}+1.75\%$
test_serial 2.1276s 2.0560s 0.4864 Ops/s 0.4875 Ops/s $\color{#d91a1a}-0.22\%$
test_parallel 1.9689s 1.8569s 0.5385 Ops/s 0.5349 Ops/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-True-True-True-True] 0.2607ms 36.6702μs 27.2701 KOps/s 28.2196 KOps/s $\color{#d91a1a}-3.36\%$
test_step_mdp_speed[True-True-True-True-False] 49.0210μs 20.2598μs 49.3589 KOps/s 48.3028 KOps/s $\color{#35bf28}+2.19\%$
test_step_mdp_speed[True-True-True-False-True] 55.6410μs 20.7234μs 48.2547 KOps/s 48.0530 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[True-True-True-False-False] 38.3010μs 11.8665μs 84.2709 KOps/s 84.4033 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-True-False-True-True] 67.7310μs 38.5442μs 25.9443 KOps/s 26.5479 KOps/s $\color{#d91a1a}-2.27\%$
test_step_mdp_speed[True-True-False-True-False] 54.3720μs 22.3567μs 44.7294 KOps/s 44.6184 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-False-False-True] 54.5010μs 22.8206μs 43.8200 KOps/s 44.2772 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[True-True-False-False-False] 44.1810μs 13.6865μs 73.0645 KOps/s 73.8984 KOps/s $\color{#d91a1a}-1.13\%$
test_step_mdp_speed[True-False-True-True-True] 75.7710μs 40.8190μs 24.4984 KOps/s 24.9091 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[True-False-True-True-False] 56.2010μs 24.5525μs 40.7290 KOps/s 40.5634 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[True-False-True-False-True] 53.9310μs 22.1947μs 45.0558 KOps/s 44.5331 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[True-False-True-False-False] 39.9810μs 13.6171μs 73.4373 KOps/s 69.4757 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_step_mdp_speed[True-False-False-True-True] 0.1176ms 41.1097μs 24.3252 KOps/s 23.0632 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_step_mdp_speed[True-False-False-True-False] 54.1410μs 26.2429μs 38.1056 KOps/s 36.8475 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[True-False-False-False-True] 58.7310μs 24.1276μs 41.4462 KOps/s 39.7200 KOps/s $\color{#35bf28}+4.35\%$
test_step_mdp_speed[True-False-False-False-False] 42.3110μs 15.5628μs 64.2557 KOps/s 63.2551 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-True-True-True-True] 71.9120μs 40.2440μs 24.8484 KOps/s 24.0132 KOps/s $\color{#35bf28}+3.48\%$
test_step_mdp_speed[False-True-True-True-False] 53.2910μs 24.4575μs 40.8873 KOps/s 37.4989 KOps/s $\textbf{\color{#35bf28}+9.04\%}$
test_step_mdp_speed[False-True-True-False-True] 55.8410μs 25.7183μs 38.8829 KOps/s 39.1476 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[False-True-True-False-False] 45.6310μs 15.0518μs 66.4375 KOps/s 65.7630 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-True-False-True-True] 72.5810μs 42.3189μs 23.6301 KOps/s 23.4700 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-True-False-True-False] 59.3110μs 26.2807μs 38.0508 KOps/s 36.9805 KOps/s $\color{#35bf28}+2.89\%$
test_step_mdp_speed[False-True-False-False-True] 3.5927ms 27.8302μs 35.9321 KOps/s 36.6433 KOps/s $\color{#d91a1a}-1.94\%$
test_step_mdp_speed[False-True-False-False-False] 49.2810μs 17.1730μs 58.2311 KOps/s 58.2239 KOps/s $\color{#35bf28}+0.01\%$
test_step_mdp_speed[False-False-True-True-True] 73.6020μs 44.6804μs 22.3812 KOps/s 22.3902 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-False-True-True-False] 60.4910μs 28.6954μs 34.8488 KOps/s 35.2368 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-False-True-False-True] 52.2910μs 27.5470μs 36.3016 KOps/s 35.7433 KOps/s $\color{#35bf28}+1.56\%$
test_step_mdp_speed[False-False-True-False-False] 50.1010μs 17.1052μs 58.4617 KOps/s 58.0066 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[False-False-False-True-True] 82.7010μs 45.8093μs 21.8296 KOps/s 22.0384 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-False-False-True-False] 61.4610μs 30.8024μs 32.4650 KOps/s 32.7656 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-False-False-False-True] 55.1010μs 28.3093μs 35.3241 KOps/s 34.8112 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[False-False-False-False-False] 49.5610μs 18.7923μs 53.2134 KOps/s 52.8787 KOps/s $\color{#35bf28}+0.63\%$
test_values[generalized_advantage_estimate-True-True] 25.4687ms 24.7973ms 40.3269 Ops/s 40.7161 Ops/s $\color{#d91a1a}-0.96\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1061s 3.0156ms 331.6038 Ops/s 313.5152 Ops/s $\textbf{\color{#35bf28}+5.77\%}$
test_values[td0_return_estimate-False-False] 97.0320μs 66.6148μs 15.0117 KOps/s 15.1377 KOps/s $\color{#d91a1a}-0.83\%$
test_values[td1_return_estimate-False-False] 56.4226ms 55.2414ms 18.1024 Ops/s 17.7165 Ops/s $\color{#35bf28}+2.18\%$
test_values[vec_td1_return_estimate-False-False] 1.4173ms 1.0714ms 933.3809 Ops/s 933.8582 Ops/s $\color{#d91a1a}-0.05\%$
test_values[td_lambda_return_estimate-True-False] 89.8914ms 87.7626ms 11.3944 Ops/s 11.1868 Ops/s $\color{#35bf28}+1.86\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4198ms 1.0654ms 938.6318 Ops/s 933.1241 Ops/s $\color{#35bf28}+0.59\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.8732ms 25.3947ms 39.3783 Ops/s 40.3835 Ops/s $\color{#d91a1a}-2.49\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0115ms 0.7519ms 1.3299 KOps/s 1.3903 KOps/s $\color{#d91a1a}-4.34\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7720ms 0.6726ms 1.4869 KOps/s 1.5031 KOps/s $\color{#d91a1a}-1.08\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5388ms 1.4644ms 682.8871 Ops/s 684.1224 Ops/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7444ms 0.6936ms 1.4417 KOps/s 1.4766 KOps/s $\color{#d91a1a}-2.36\%$
test_dqn_speed[False-None] 7.1076ms 1.3163ms 759.7147 Ops/s 781.5715 Ops/s $\color{#d91a1a}-2.80\%$
test_dqn_speed[False-backward] 1.9289ms 1.8214ms 549.0180 Ops/s 565.6099 Ops/s $\color{#d91a1a}-2.93\%$
test_dqn_speed[True-None] 0.7707ms 0.5624ms 1.7779 KOps/s 1.7950 KOps/s $\color{#d91a1a}-0.95\%$
test_dqn_speed[True-backward] 1.3538ms 1.0148ms 985.4108 Ops/s 1.0087 KOps/s $\color{#d91a1a}-2.31\%$
test_dqn_speed[reduce-overhead-None] 0.8947ms 0.5549ms 1.8020 KOps/s 1.7750 KOps/s $\color{#35bf28}+1.52\%$
test_dqn_speed[reduce-overhead-backward] 1.0821ms 1.0120ms 988.1445 Ops/s 1.0063 KOps/s $\color{#d91a1a}-1.80\%$
test_ddpg_speed[False-None] 3.2114ms 2.6705ms 374.4640 Ops/s 375.2363 Ops/s $\color{#d91a1a}-0.21\%$
test_ddpg_speed[False-backward] 4.2168ms 3.9579ms 252.6600 Ops/s 260.5563 Ops/s $\color{#d91a1a}-3.03\%$
test_ddpg_speed[True-None] 1.3408ms 1.2444ms 803.5692 Ops/s 778.5316 Ops/s $\color{#35bf28}+3.22\%$
test_ddpg_speed[True-backward] 2.3270ms 2.2264ms 449.1587 Ops/s 451.1605 Ops/s $\color{#d91a1a}-0.44\%$
test_ddpg_speed[reduce-overhead-None] 1.9802ms 1.2634ms 791.4988 Ops/s 811.8215 Ops/s $\color{#d91a1a}-2.50\%$
test_ddpg_speed[reduce-overhead-backward] 2.2936ms 2.2185ms 450.7496 Ops/s 452.1184 Ops/s $\color{#d91a1a}-0.30\%$
test_sac_speed[False-None] 8.7263ms 7.4170ms 134.8262 Ops/s 136.0430 Ops/s $\color{#d91a1a}-0.89\%$
test_sac_speed[False-backward] 11.0470ms 10.5452ms 94.8303 Ops/s 94.2378 Ops/s $\color{#35bf28}+0.63\%$
test_sac_speed[True-None] 2.1491ms 2.0264ms 493.4885 Ops/s 474.4650 Ops/s $\color{#35bf28}+4.01\%$
test_sac_speed[True-backward] 5.5705ms 4.3202ms 231.4707 Ops/s 249.8287 Ops/s $\textbf{\color{#d91a1a}-7.35\%}$
test_sac_speed[reduce-overhead-None] 2.1829ms 2.0641ms 484.4634 Ops/s 482.3413 Ops/s $\color{#35bf28}+0.44\%$
test_sac_speed[reduce-overhead-backward] 4.1942ms 4.0434ms 247.3137 Ops/s 249.1083 Ops/s $\color{#d91a1a}-0.72\%$
test_redq_speed[False-None] 10.9880ms 10.1019ms 98.9917 Ops/s 98.3514 Ops/s $\color{#35bf28}+0.65\%$
test_redq_speed[False-backward] 18.1425ms 17.6295ms 56.7232 Ops/s 58.1690 Ops/s $\color{#d91a1a}-2.49\%$
test_redq_speed[True-None] 3.8181ms 3.4607ms 288.9551 Ops/s 277.3647 Ops/s $\color{#35bf28}+4.18\%$
test_redq_speed[True-backward] 8.6844ms 8.2998ms 120.4856 Ops/s 115.4555 Ops/s $\color{#35bf28}+4.36\%$
test_redq_speed[reduce-overhead-None] 3.8486ms 3.4832ms 287.0895 Ops/s 278.7695 Ops/s $\color{#35bf28}+2.98\%$
test_redq_speed[reduce-overhead-backward] 8.7128ms 8.3826ms 119.2948 Ops/s 118.3431 Ops/s $\color{#35bf28}+0.80\%$
test_redq_deprec_speed[False-None] 12.1896ms 10.2617ms 97.4501 Ops/s 97.2997 Ops/s $\color{#35bf28}+0.15\%$
test_redq_deprec_speed[False-backward] 15.4693ms 14.9686ms 66.8067 Ops/s 67.4541 Ops/s $\color{#d91a1a}-0.96\%$
test_redq_deprec_speed[True-None] 3.5070ms 3.2077ms 311.7488 Ops/s 318.1680 Ops/s $\color{#d91a1a}-2.02\%$
test_redq_deprec_speed[True-backward] 7.4356ms 6.9337ms 144.2222 Ops/s 133.1403 Ops/s $\textbf{\color{#35bf28}+8.32\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.4268ms 3.1866ms 313.8151 Ops/s 313.7836 Ops/s $\color{#35bf28}+0.01\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.2738ms 6.8559ms 145.8600 Ops/s 143.7857 Ops/s $\color{#35bf28}+1.44\%$
test_td3_speed[False-None] 7.6574ms 7.3703ms 135.6796 Ops/s 135.4518 Ops/s $\color{#35bf28}+0.17\%$
test_td3_speed[False-backward] 12.0316ms 10.5566ms 94.7277 Ops/s 98.2916 Ops/s $\color{#d91a1a}-3.63\%$
test_td3_speed[True-None] 2.0977ms 2.0667ms 483.8737 Ops/s 478.1413 Ops/s $\color{#35bf28}+1.20\%$
test_td3_speed[True-backward] 3.9855ms 3.8909ms 257.0084 Ops/s 238.5854 Ops/s $\textbf{\color{#35bf28}+7.72\%}$
test_td3_speed[reduce-overhead-None] 2.1696ms 2.0737ms 482.2221 Ops/s 485.7588 Ops/s $\color{#d91a1a}-0.73\%$
test_td3_speed[reduce-overhead-backward] 4.0382ms 3.8854ms 257.3743 Ops/s 256.1287 Ops/s $\color{#35bf28}+0.49\%$
test_cql_speed[False-None] 27.2258ms 24.4269ms 40.9385 Ops/s 40.9658 Ops/s $\color{#d91a1a}-0.07\%$
test_cql_speed[False-backward] 34.7803ms 33.4578ms 29.8884 Ops/s 30.1065 Ops/s $\color{#d91a1a}-0.72\%$
test_cql_speed[True-None] 11.4635ms 10.9995ms 90.9130 Ops/s 92.2468 Ops/s $\color{#d91a1a}-1.45\%$
test_cql_speed[True-backward] 17.4825ms 16.9525ms 58.9883 Ops/s 59.2979 Ops/s $\color{#d91a1a}-0.52\%$
test_cql_speed[reduce-overhead-None] 11.4336ms 11.0441ms 90.5460 Ops/s 90.7076 Ops/s $\color{#d91a1a}-0.18\%$
test_cql_speed[reduce-overhead-backward] 17.4192ms 16.8988ms 59.1758 Ops/s 59.4386 Ops/s $\color{#d91a1a}-0.44\%$
test_a2c_speed[False-None] 7.3398ms 5.4095ms 184.8600 Ops/s 187.0653 Ops/s $\color{#d91a1a}-1.18\%$
test_a2c_speed[False-backward] 11.9997ms 11.6569ms 85.7860 Ops/s 84.7976 Ops/s $\color{#35bf28}+1.17\%$
test_a2c_speed[True-None] 3.3208ms 3.0454ms 328.3656 Ops/s 330.5839 Ops/s $\color{#d91a1a}-0.67\%$
test_a2c_speed[True-backward] 9.0416ms 8.6449ms 115.6748 Ops/s 112.3398 Ops/s $\color{#35bf28}+2.97\%$
test_a2c_speed[reduce-overhead-None] 3.2855ms 3.0819ms 324.4785 Ops/s 324.0524 Ops/s $\color{#35bf28}+0.13\%$
test_a2c_speed[reduce-overhead-backward] 8.8699ms 8.6642ms 115.4175 Ops/s 117.4308 Ops/s $\color{#d91a1a}-1.71\%$
test_ppo_speed[False-None] 7.5377ms 5.5982ms 178.6293 Ops/s 178.3153 Ops/s $\color{#35bf28}+0.18\%$
test_ppo_speed[False-backward] 12.7646ms 12.1201ms 82.5075 Ops/s 82.7188 Ops/s $\color{#d91a1a}-0.26\%$
test_ppo_speed[True-None] 3.7053ms 3.4931ms 286.2824 Ops/s 287.1425 Ops/s $\color{#d91a1a}-0.30\%$
test_ppo_speed[True-backward] 8.6438ms 8.3889ms 119.2053 Ops/s 120.0002 Ops/s $\color{#d91a1a}-0.66\%$
test_ppo_speed[reduce-overhead-None] 3.6551ms 3.4786ms 287.4679 Ops/s 290.2738 Ops/s $\color{#d91a1a}-0.97\%$
test_ppo_speed[reduce-overhead-backward] 8.7479ms 8.3658ms 119.5344 Ops/s 119.8684 Ops/s $\color{#d91a1a}-0.28\%$
test_reinforce_speed[False-None] 4.6905ms 4.4154ms 226.4823 Ops/s 223.8175 Ops/s $\color{#35bf28}+1.19\%$
test_reinforce_speed[False-backward] 7.5553ms 7.2821ms 137.3239 Ops/s 135.7824 Ops/s $\color{#35bf28}+1.14\%$
test_reinforce_speed[True-None] 2.4587ms 2.2901ms 436.6637 Ops/s 446.2239 Ops/s $\color{#d91a1a}-2.14\%$
test_reinforce_speed[True-backward] 7.4511ms 7.0925ms 140.9941 Ops/s 141.1854 Ops/s $\color{#d91a1a}-0.14\%$
test_reinforce_speed[reduce-overhead-None] 2.4383ms 2.2711ms 440.3059 Ops/s 441.6406 Ops/s $\color{#d91a1a}-0.30\%$
test_reinforce_speed[reduce-overhead-backward] 7.5845ms 7.1168ms 140.5120 Ops/s 139.6939 Ops/s $\color{#35bf28}+0.59\%$
test_iql_speed[False-None] 19.3963ms 18.7931ms 53.2110 Ops/s 51.3612 Ops/s $\color{#35bf28}+3.60\%$
test_iql_speed[False-backward] 30.1801ms 29.2699ms 34.1648 Ops/s 33.7970 Ops/s $\color{#35bf28}+1.09\%$
test_iql_speed[True-None] 8.3792ms 7.9548ms 125.7103 Ops/s 123.8910 Ops/s $\color{#35bf28}+1.47\%$
test_iql_speed[True-backward] 17.1281ms 16.5977ms 60.2493 Ops/s 59.7007 Ops/s $\color{#35bf28}+0.92\%$
test_iql_speed[reduce-overhead-None] 8.4314ms 7.9606ms 125.6191 Ops/s 125.8100 Ops/s $\color{#d91a1a}-0.15\%$
test_iql_speed[reduce-overhead-backward] 17.0639ms 16.6972ms 59.8902 Ops/s 60.1372 Ops/s $\color{#d91a1a}-0.41\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8498ms 6.4928ms 154.0172 Ops/s 153.5966 Ops/s $\color{#35bf28}+0.27\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7418ms 0.3410ms 2.9327 KOps/s 4.2282 KOps/s $\textbf{\color{#d91a1a}-30.64\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7112ms 0.3296ms 3.0340 KOps/s 4.6433 KOps/s $\textbf{\color{#d91a1a}-34.66\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.8319ms 6.4289ms 155.5476 Ops/s 156.4583 Ops/s $\color{#d91a1a}-0.58\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2507ms 0.3352ms 2.9829 KOps/s 4.2962 KOps/s $\textbf{\color{#d91a1a}-30.57\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5390ms 0.3142ms 3.1830 KOps/s 4.7044 KOps/s $\textbf{\color{#d91a1a}-32.34\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7280ms 1.5031ms 665.2997 Ops/s 814.1944 Ops/s $\textbf{\color{#d91a1a}-18.29\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 8.4366ms 1.4744ms 678.2424 Ops/s 880.2944 Ops/s $\textbf{\color{#d91a1a}-22.95\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8318ms 6.5418ms 152.8627 Ops/s 153.9625 Ops/s $\color{#d91a1a}-0.71\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0069ms 0.3792ms 2.6370 KOps/s 2.2781 KOps/s $\textbf{\color{#35bf28}+15.76\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6788ms 0.4837ms 2.0673 KOps/s 2.2218 KOps/s $\textbf{\color{#d91a1a}-6.96\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.7659ms 6.4667ms 154.6393 Ops/s 156.5543 Ops/s $\color{#d91a1a}-1.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1665ms 0.2364ms 4.2310 KOps/s 4.2477 KOps/s $\color{#d91a1a}-0.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5723ms 0.2185ms 4.5761 KOps/s 4.6581 KOps/s $\color{#d91a1a}-1.76\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.7770ms 6.4313ms 155.4892 Ops/s 158.3243 Ops/s $\color{#d91a1a}-1.79\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1303ms 0.2399ms 4.1682 KOps/s 4.2557 KOps/s $\color{#d91a1a}-2.06\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4280ms 0.2192ms 4.5615 KOps/s 3.7668 KOps/s $\textbf{\color{#35bf28}+21.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8602ms 6.6373ms 150.6631 Ops/s 154.0185 Ops/s $\color{#d91a1a}-2.18\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0683ms 0.5376ms 1.8601 KOps/s 2.5984 KOps/s $\textbf{\color{#d91a1a}-28.41\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6585ms 0.4321ms 2.3145 KOps/s 2.5177 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4084s 13.4129ms 74.5553 Ops/s 34.6778 Ops/s $\textbf{\color{#35bf28}+114.99\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8831ms 1.5101ms 662.1960 Ops/s 541.8269 Ops/s $\textbf{\color{#35bf28}+22.22\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.0743ms 1.2638ms 791.2887 Ops/s 1.0264 KOps/s $\textbf{\color{#d91a1a}-22.90\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.9457ms 5.3353ms 187.4294 Ops/s 184.5027 Ops/s $\color{#35bf28}+1.59\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.3536ms 2.0346ms 491.5003 Ops/s 682.0937 Ops/s $\textbf{\color{#d91a1a}-27.94\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.4275ms 1.1146ms 897.1864 Ops/s 813.9716 Ops/s $\textbf{\color{#35bf28}+10.22\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3660s 12.8312ms 77.9348 Ops/s 180.6727 Ops/s $\textbf{\color{#d91a1a}-56.86\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.9953ms 2.1035ms 475.4001 Ops/s 456.1461 Ops/s $\color{#35bf28}+4.22\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.0021ms 1.4354ms 696.6940 Ops/s 761.0185 Ops/s $\textbf{\color{#d91a1a}-8.45\%}$
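
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. That reading is inferred from the numbers in the table, not from the benchmark tooling itself, and the helper name `percent_change` below is hypothetical. A minimal sketch that reproduces a few rows under that assumption:

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR run versus repo HEAD, in percent.

    Assumed formula (inferred from the table): (PR - HEAD) / HEAD * 100.
    Both arguments must use the same unit (Ops/s or KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops on this PR, Ops on repo HEAD) copied from rows of the table above.
rows = {
    "test_iql_speed[False-None]": (53.2110, 51.3612),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (74.5553, 34.6778),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (77.9348, 180.6727),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

Rows reported in KOps/s can be passed as they are when both columns share that unit; mixed rows (for example 791.2887 Ops/s against 1.0264 KOps/s) need converting to a common unit first.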
