[Tutorial] LLM integration #2832
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2832
Note: Links to docs will display an error until the docs builds have been completed.
❌ 19 New Failures, 2 Cancelled Jobs, 1 Unrelated Failure
As of commit e019e25 with merge base 27d3680:
NEW FAILURES - The following jobs have failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
BROKEN TRUNK - The following job failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_simple | 0.6256s | 0.5288s | 1.8912 Ops/s | 1.9392 Ops/s | |
| test_transformed | 1.1186s | 1.0258s | 0.9749 Ops/s | 0.9726 Ops/s | |
| test_serial | 1.5132s | 1.5093s | 0.6626 Ops/s | 0.6583 Ops/s | |
| test_parallel | 1.3018s | 1.2974s | 0.7708 Ops/s | 0.7568 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.4243ms | 29.5971μs | 33.7871 KOps/s | 33.3754 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 80.4100μs | 17.6656μs | 56.6070 KOps/s | 56.7894 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 45.3250μs | 16.7446μs | 59.7208 KOps/s | 59.1619 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 72.4450μs | 9.8994μs | 101.0158 KOps/s | 99.6502 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 61.3150μs | 31.5505μs | 31.6952 KOps/s | 30.7474 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 69.8100μs | 19.3225μs | 51.7532 KOps/s | 50.4070 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 49.7130μs | 18.5442μs | 53.9254 KOps/s | 52.6756 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 63.7900μs | 11.6311μs | 85.9767 KOps/s | 83.0377 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.1064ms | 33.3027μs | 30.0276 KOps/s | 29.3757 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 82.4330μs | 21.0409μs | 47.5265 KOps/s | 45.8245 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 64.8410μs | 18.5898μs | 53.7929 KOps/s | 52.8864 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 36.2770μs | 11.6483μs | 85.8496 KOps/s | 83.7136 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 94.0660μs | 35.0178μs | 28.5569 KOps/s | 28.0199 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 64.9510μs | 22.9332μs | 43.6050 KOps/s | 42.5882 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 71.5940μs | 20.1078μs | 49.7319 KOps/s | 48.4301 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 0.5963ms | 13.2502μs | 75.4707 KOps/s | 72.8325 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 91.2830μs | 33.5847μs | 29.7755 KOps/s | 29.2575 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 91.0410μs | 21.2786μs | 46.9955 KOps/s | 46.5995 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.2468ms | 21.5351μs | 46.4359 KOps/s | 46.3320 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 40.0150μs | 13.0458μs | 76.6532 KOps/s | 74.9808 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 85.1600μs | 35.3040μs | 28.3254 KOps/s | 27.8805 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.1328ms | 23.8401μs | 41.9462 KOps/s | 42.2803 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 47.7290μs | 23.0797μs | 43.3282 KOps/s | 42.5809 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 38.9730μs | 14.7663μs | 67.7216 KOps/s | 65.7720 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 93.7250μs | 37.0172μs | 27.0145 KOps/s | 26.4105 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 54.3320μs | 24.7651μs | 40.3793 KOps/s | 38.8427 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 78.5070μs | 23.0187μs | 43.4429 KOps/s | 41.5411 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 65.1920μs | 14.7687μs | 67.7110 KOps/s | 66.5423 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.1208ms | 38.3495μs | 26.0759 KOps/s | 24.4769 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 79.6090μs | 26.1853μs | 38.1894 KOps/s | 36.7870 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.1448ms | 24.5789μs | 40.6853 KOps/s | 39.5246 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 63.3980μs | 16.2059μs | 61.7058 KOps/s | 58.9687 KOps/s | |
| test_values[generalized_advantage_estimate-True-True] | 14.3967ms | 10.0834ms | 99.1732 Ops/s | 101.0991 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 28.5023ms | 24.9208ms | 40.1271 Ops/s | 40.8182 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2248ms | 0.1755ms | 5.6989 KOps/s | 5.5841 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.7100ms | 25.0131ms | 39.9791 Ops/s | 39.6510 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 28.8841ms | 24.8880ms | 40.1800 Ops/s | 41.0845 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 36.0261ms | 35.3752ms | 28.2684 Ops/s | 28.4839 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 26.8356ms | 24.7192ms | 40.4544 Ops/s | 40.9387 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.7554ms | 8.5508ms | 116.9477 Ops/s | 117.3231 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4434ms | 1.9437ms | 514.4846 Ops/s | 503.9644 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 1.5616ms | 0.3731ms | 2.6801 KOps/s | 2.6538 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 47.3556ms | 43.0265ms | 23.2415 Ops/s | 23.0826 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.2555ms | 3.4569ms | 289.2732 Ops/s | 283.8572 Ops/s | |
| test_dqn_speed[False-None] | 6.4537ms | 1.4056ms | 711.4607 Ops/s | 706.9900 Ops/s | |
| test_dqn_speed[False-backward] | 2.0781ms | 1.9193ms | 521.0313 Ops/s | 530.2117 Ops/s | |
| test_dqn_speed[True-None] | 0.7246ms | 0.5586ms | 1.7901 KOps/s | 1.7505 KOps/s | |
| test_dqn_speed[True-backward] | 1.0150ms | 0.9713ms | 1.0296 KOps/s | 709.5874 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6545ms | 0.5609ms | 1.7829 KOps/s | 1.7282 KOps/s | |
| test_dqn_speed[reduce-overhead-backward] | 1.1113ms | 0.9921ms | 1.0080 KOps/s | 1.0095 KOps/s | |
| test_ddpg_speed[False-None] | 3.7417ms | 2.8768ms | 347.6127 Ops/s | 338.6158 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1652ms | 4.0028ms | 249.8258 Ops/s | 246.1975 Ops/s | |
| test_ddpg_speed[True-None] | 1.9213ms | 1.4374ms | 695.7127 Ops/s | 678.5120 Ops/s | |
| test_ddpg_speed[True-backward] | 2.3571ms | 2.3141ms | 432.1418 Ops/s | 403.7110 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.6313ms | 1.4263ms | 701.0990 Ops/s | 688.0582 Ops/s | |
| test_ddpg_speed[reduce-overhead-backward] | 2.5603ms | 2.3655ms | 422.7422 Ops/s | 426.8446 Ops/s | |
| test_sac_speed[False-None] | 9.2601ms | 7.9819ms | 125.2832 Ops/s | 123.8634 Ops/s | |
| test_sac_speed[False-backward] | 12.3673ms | 10.6832ms | 93.6053 Ops/s | 91.1743 Ops/s | |
| test_sac_speed[True-None] | 3.7209ms | 2.5868ms | 386.5719 Ops/s | 370.1315 Ops/s | |
| test_sac_speed[True-backward] | 4.3859ms | 4.2386ms | 235.9259 Ops/s | 219.7123 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 3.3893ms | 2.5873ms | 386.5088 Ops/s | 381.4955 Ops/s | |
| test_sac_speed[reduce-overhead-backward] | 5.5734ms | 4.6868ms | 213.3655 Ops/s | 230.6139 Ops/s | |
| test_redq_speed[False-None] | 15.8373ms | 13.6951ms | 73.0190 Ops/s | 75.6552 Ops/s | |
| test_redq_speed[False-backward] | 27.0987ms | 23.4141ms | 42.7093 Ops/s | 43.8288 Ops/s | |
| test_redq_speed[True-None] | 8.2042ms | 7.3564ms | 135.9366 Ops/s | 131.4643 Ops/s | |
| test_redq_speed[True-backward] | 16.0858ms | 14.9068ms | 67.0837 Ops/s | 66.9438 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 8.9077ms | 7.2488ms | 137.9530 Ops/s | 142.9467 Ops/s | |
| test_redq_speed[reduce-overhead-backward] | 16.5203ms | 15.0965ms | 66.2406 Ops/s | 68.7592 Ops/s | |
| test_redq_deprec_speed[False-None] | 14.4449ms | 13.2478ms | 75.4842 Ops/s | 74.8820 Ops/s | |
| test_redq_deprec_speed[False-backward] | 20.7554ms | 19.3890ms | 51.5756 Ops/s | 51.6205 Ops/s | |
| test_redq_deprec_speed[True-None] | 5.9491ms | 5.2078ms | 192.0181 Ops/s | 186.8758 Ops/s | |
| test_redq_deprec_speed[True-backward] | 12.2764ms | 10.4243ms | 95.9297 Ops/s | 95.7382 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 6.3975ms | 5.3908ms | 185.5028 Ops/s | 181.1453 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-backward] | 11.0397ms | 10.4272ms | 95.9034 Ops/s | 89.8444 Ops/s | |
| test_td3_speed[False-None] | 8.6642ms | 8.1596ms | 122.5557 Ops/s | 115.3317 Ops/s | |
| test_td3_speed[False-backward] | 11.5850ms | 10.8264ms | 92.3669 Ops/s | 88.1611 Ops/s | |
| test_td3_speed[True-None] | 2.9089ms | 2.3782ms | 420.4807 Ops/s | 417.3540 Ops/s | |
| test_td3_speed[True-backward] | 4.8379ms | 4.5355ms | 220.4834 Ops/s | 227.6169 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 2.7348ms | 2.3113ms | 432.6481 Ops/s | 420.6859 Ops/s | |
| test_td3_speed[reduce-overhead-backward] | 4.6315ms | 4.3546ms | 229.6447 Ops/s | 241.7456 Ops/s | |
| test_cql_speed[False-None] | 39.9378ms | 37.8022ms | 26.4535 Ops/s | 27.1053 Ops/s | |
| test_cql_speed[False-backward] | 52.2866ms | 48.7959ms | 20.4935 Ops/s | 21.1288 Ops/s | |
| test_cql_speed[True-None] | 23.9700ms | 22.8902ms | 43.6869 Ops/s | 44.1481 Ops/s | |
| test_cql_speed[True-backward] | 31.2198ms | 30.1064ms | 33.2155 Ops/s | 33.1293 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 24.4098ms | 22.7202ms | 44.0137 Ops/s | 43.4849 Ops/s | |
| test_cql_speed[reduce-overhead-backward] | 31.6363ms | 30.1299ms | 33.1897 Ops/s | 32.9449 Ops/s | |
| test_a2c_speed[False-None] | 9.6886ms | 7.5630ms | 132.2220 Ops/s | 131.3522 Ops/s | |
| test_a2c_speed[False-backward] | 18.5042ms | 15.4144ms | 64.8743 Ops/s | 65.9245 Ops/s | |
| test_a2c_speed[True-None] | 5.3224ms | 4.8463ms | 206.3436 Ops/s | 205.4463 Ops/s | |
| test_a2c_speed[True-backward] | 12.8536ms | 11.5529ms | 86.5585 Ops/s | 88.2862 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 6.1863ms | 5.1611ms | 193.7560 Ops/s | 207.0729 Ops/s | |
| test_a2c_speed[reduce-overhead-backward] | 12.9676ms | 12.2361ms | 81.7251 Ops/s | 86.5253 Ops/s | |
| test_ppo_speed[False-None] | 9.1545ms | 8.1449ms | 122.7762 Ops/s | 129.7990 Ops/s | |
| test_ppo_speed[False-backward] | 18.1662ms | 16.0463ms | 62.3197 Ops/s | 66.4411 Ops/s | |
| test_ppo_speed[True-None] | 6.5429ms | 5.6113ms | 178.2105 Ops/s | 196.1468 Ops/s | |
| test_ppo_speed[True-backward] | 12.2576ms | 12.0009ms | 83.3268 Ops/s | 88.0905 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 6.5646ms | 5.2056ms | 192.1007 Ops/s | 194.6062 Ops/s | |
| test_ppo_speed[reduce-overhead-backward] | 12.1914ms | 11.1862ms | 89.3956 Ops/s | 87.3514 Ops/s | |
| test_reinforce_speed[False-None] | 7.7718ms | 6.6411ms | 150.5773 Ops/s | 148.8619 Ops/s | |
| test_reinforce_speed[False-backward] | 10.3162ms | 9.9108ms | 100.8998 Ops/s | 97.9612 Ops/s | |
| test_reinforce_speed[True-None] | 4.9204ms | 4.2607ms | 234.7008 Ops/s | 233.1382 Ops/s | |
| test_reinforce_speed[True-backward] | 11.1528ms | 10.2603ms | 97.4626 Ops/s | 94.8035 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 5.7003ms | 4.1606ms | 240.3479 Ops/s | 235.7231 Ops/s | |
| test_reinforce_speed[reduce-overhead-backward] | 11.1556ms | 10.7840ms | 92.7303 Ops/s | 96.4990 Ops/s | |
| test_iql_speed[False-None] | 40.3053ms | 34.3370ms | 29.1231 Ops/s | 29.9407 Ops/s | |
| test_iql_speed[False-backward] | 48.4938ms | 46.7109ms | 21.4083 Ops/s | 21.7381 Ops/s | |
| test_iql_speed[True-None] | 17.4683ms | 16.4006ms | 60.9735 Ops/s | 61.9703 Ops/s | |
| test_iql_speed[True-backward] | 29.3814ms | 28.3805ms | 35.2354 Ops/s | 36.3213 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 17.6696ms | 16.2808ms | 61.4220 Ops/s | 62.0210 Ops/s | |
| test_iql_speed[reduce-overhead-backward] | 29.5209ms | 28.4341ms | 35.1690 Ops/s | 36.6029 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.8792ms | 5.4008ms | 185.1591 Ops/s | 202.8195 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8581ms | 0.5264ms | 1.8996 KOps/s | 1.9088 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8436ms | 0.5093ms | 1.9633 KOps/s | 1.9920 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.1965ms | 4.7304ms | 211.3975 Ops/s | 214.8862 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.1423ms | 0.5193ms | 1.9255 KOps/s | 1.9482 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8435ms | 0.4929ms | 2.0288 KOps/s | 2.0346 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.3492ms | 1.6836ms | 593.9538 Ops/s | 597.9386 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 3.4839ms | 1.6226ms | 616.3129 Ops/s | 629.2653 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.8811ms | 5.5790ms | 179.2433 Ops/s | 204.7301 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.9425ms | 0.6860ms | 1.4577 KOps/s | 1.5241 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.1318ms | 0.6649ms | 1.5040 KOps/s | 1.5653 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.6724ms | 5.2128ms | 191.8365 Ops/s | 210.6736 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.9142ms | 0.5491ms | 1.8213 KOps/s | 1.9356 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7926ms | 0.5225ms | 1.9140 KOps/s | 1.9641 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.5111ms | 5.0286ms | 198.8633 Ops/s | 217.5886 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.6593ms | 0.5504ms | 1.8169 KOps/s | 1.9694 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7752ms | 0.5197ms | 1.9241 KOps/s | 2.0377 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.6678ms | 5.4132ms | 184.7340 Ops/s | 209.1359 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.7013ms | 0.6911ms | 1.4470 KOps/s | 1.4851 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.2177ms | 0.6675ms | 1.4981 KOps/s | 1.5662 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.8306ms | 4.6940ms | 213.0401 Ops/s | 23.9887 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 10.4090ms | 2.6027ms | 384.2146 Ops/s | 425.1800 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 6.5146ms | 1.4322ms | 698.2139 Ops/s | 809.6516 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.1910ms | 4.5713ms | 218.7559 Ops/s | 219.3957 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.8640s | 19.6506ms | 50.8890 Ops/s | 430.6518 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.2330ms | 1.3625ms | 733.9550 Ops/s | 702.8890 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.9578ms | 5.0041ms | 199.8356 Ops/s | 221.1958 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.0728ms | 2.6523ms | 377.0374 Ops/s | 399.0503 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.2677ms | 1.5882ms | 629.6592 Ops/s | 646.6836 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 13.8849ms | 12.5097ms | 79.9381 Ops/s | 78.7479 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 16.4625ms | 15.0539ms | 66.4282 Ops/s | 69.9334 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 22.2286ms | 21.4296ms | 46.6644 Ops/s | 46.5489 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 18.4366ms | 15.3269ms | 65.2446 Ops/s | 67.8967 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 22.9283ms | 21.1055ms | 47.3810 Ops/s | 46.9948 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 22.4239ms | 16.1647ms | 61.8631 Ops/s | 62.3030 Ops/s |

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_simple | 0.9200s | 0.8288s | 1.2065 Ops/s | 1.2007 Ops/s | |
| test_transformed | 1.5427s | 1.4451s | 0.6920 Ops/s | 0.6509 Ops/s | |
| test_serial | 2.4513s | 2.3533s | 0.4249 Ops/s | 0.4139 Ops/s | |
| test_parallel | 1.8757s | 1.8533s | 0.5396 Ops/s | 0.5373 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1881ms | 39.9134μs | 25.0542 KOps/s | 25.0989 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 53.7510μs | 22.9810μs | 43.5142 KOps/s | 42.6081 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 48.0910μs | 22.0347μs | 45.3830 KOps/s | 45.3513 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 48.8510μs | 12.7992μs | 78.1301 KOps/s | 76.5372 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.1333ms | 41.1463μs | 24.3035 KOps/s | 23.6587 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 48.3510μs | 25.6944μs | 38.9190 KOps/s | 39.2078 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 0.1367ms | 24.5915μs | 40.6645 KOps/s | 41.7371 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 44.2210μs | 15.3686μs | 65.0679 KOps/s | 65.8326 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 77.9720μs | 44.8176μs | 22.3126 KOps/s | 22.5898 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 55.8710μs | 27.9687μs | 35.7542 KOps/s | 35.7170 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 51.4910μs | 24.7206μs | 40.4521 KOps/s | 41.7677 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 42.3610μs | 15.3435μs | 65.1742 KOps/s | 65.7151 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.2465ms | 46.7374μs | 21.3961 KOps/s | 21.1946 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 57.5620μs | 30.2393μs | 33.0695 KOps/s | 32.9882 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 60.3320μs | 26.3668μs | 37.9265 KOps/s | 38.1958 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 52.1510μs | 17.4080μs | 57.4448 KOps/s | 56.9967 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 78.4020μs | 44.5447μs | 22.4494 KOps/s | 22.5676 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 54.0810μs | 28.0748μs | 35.6192 KOps/s | 35.6872 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.7238ms | 28.6294μs | 34.9291 KOps/s | 35.1606 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 42.4110μs | 17.1968μs | 58.1503 KOps/s | 58.9887 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.1235ms | 46.3803μs | 21.5609 KOps/s | 21.6525 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 51.9310μs | 29.9872μs | 33.3475 KOps/s | 32.6419 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 54.4810μs | 30.0942μs | 33.2290 KOps/s | 32.9502 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 46.9210μs | 19.0917μs | 52.3788 KOps/s | 52.5851 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 73.9920μs | 48.5547μs | 20.5953 KOps/s | 20.6376 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 63.6510μs | 32.5379μs | 30.7334 KOps/s | 30.8774 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 53.3920μs | 30.6731μs | 32.6018 KOps/s | 33.0554 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 43.2910μs | 19.1990μs | 52.0860 KOps/s | 52.4005 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 78.4320μs | 50.8343μs | 19.6718 KOps/s | 19.6664 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 64.2520μs | 34.7963μs | 28.7387 KOps/s | 28.9007 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 60.6410μs | 32.1035μs | 31.1492 KOps/s | 30.9690 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 61.7710μs | 21.4313μs | 46.6607 KOps/s | 46.8194 KOps/s | |
| test_values[generalized_advantage_estimate-True-True] | 26.6240ms | 26.0504ms | 38.3872 Ops/s | 38.8062 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1228s | 3.3692ms | 296.8042 Ops/s | 303.9764 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1060ms | 81.1765μs | 12.3188 KOps/s | 11.9466 KOps/s | |
| test_values[td1_return_estimate-False-False] | 57.6462ms | 56.6564ms | 17.6503 Ops/s | 17.6754 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3669ms | 1.0956ms | 912.7741 Ops/s | 912.4706 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 90.2894ms | 89.4819ms | 11.1754 Ops/s | 11.0508 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2947ms | 1.0954ms | 912.8686 Ops/s | 917.9172 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.6826ms | 25.4296ms | 39.3242 Ops/s | 39.7233 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0700ms | 0.7720ms | 1.2954 KOps/s | 1.3048 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8319ms | 0.6761ms | 1.4791 KOps/s | 1.4468 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5639ms | 1.4968ms | 668.0958 Ops/s | 671.1434 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8165ms | 0.6894ms | 1.4506 KOps/s | 1.4523 KOps/s | |
| test_dqn_speed[False-None] | 7.6533ms | 1.5157ms | 659.7773 Ops/s | 640.7843 Ops/s | |
| test_dqn_speed[False-backward] | 2.2498ms | 2.1400ms | 467.2832 Ops/s | 456.7034 Ops/s | |
| test_dqn_speed[True-None] | 0.7121ms | 0.5491ms | 1.8213 KOps/s | 1.7902 KOps/s | |
| test_dqn_speed[True-backward] | 1.3766ms | 1.2312ms | 812.1832 Ops/s | 872.9948 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7301ms | 0.5720ms | 1.7483 KOps/s | 1.7141 KOps/s | |
| test_dqn_speed[reduce-overhead-backward] | 1.1093ms | 1.0708ms | 933.8496 Ops/s | 1.0122 KOps/s | |
| test_ddpg_speed[False-None] | 3.0751ms | 2.8046ms | 356.5535 Ops/s | 341.9411 Ops/s | |
| test_ddpg_speed[False-backward] | 4.3296ms | 4.2078ms | 237.6534 Ops/s | 234.1535 Ops/s | |
| test_ddpg_speed[True-None] | 1.7386ms | 1.3452ms | 743.3913 Ops/s | 735.2126 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6749ms | 2.6068ms | 383.6117 Ops/s | 408.1579 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4205ms | 1.3483ms | 741.6992 Ops/s | 737.1197 Ops/s | |
| test_ddpg_speed[reduce-overhead-backward] | 2.0983ms | 2.0464ms | 488.6554 Ops/s | 523.6999 Ops/s | |
| test_sac_speed[False-None] | 8.4323ms | 7.9614ms | 125.6062 Ops/s | 119.9837 Ops/s | |
| test_sac_speed[False-backward] | 11.7233ms | 11.2541ms | 88.8564 Ops/s | 88.5485 Ops/s | |
| test_sac_speed[True-None] | 1.8922ms | 1.8308ms | 546.2070 Ops/s | 531.5305 Ops/s | |
| test_sac_speed[True-backward] | 3.8126ms | 3.7009ms | 270.2011 Ops/s | 261.7256 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 20.8745ms | 11.8242ms | 84.5722 Ops/s | 82.8430 Ops/s | |
| test_sac_speed[reduce-overhead-backward] | 1.7206ms | 1.6127ms | 620.0825 Ops/s | 548.5464 Ops/s | |
| test_redq_speed[False-None] | 8.2034ms | 7.7385ms | 129.2244 Ops/s | 125.8613 Ops/s | |
| test_redq_speed[False-backward] | 12.2494ms | 11.6810ms | 85.6092 Ops/s | 81.8154 Ops/s | |
| test_redq_speed[True-None] | 2.5102ms | 2.3493ms | 425.6544 Ops/s | 425.8602 Ops/s | |
| test_redq_speed[True-backward] | 4.8510ms | 4.2420ms | 235.7386 Ops/s | 233.7593 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 2.7397ms | 2.3989ms | 416.8602 Ops/s | 420.4671 Ops/s | |
| test_redq_speed[reduce-overhead-backward] | 4.6335ms | 4.2449ms | 235.5745 Ops/s | 236.5972 Ops/s | |
| test_redq_deprec_speed[False-None] | 9.3344ms | 8.9714ms | 111.4658 Ops/s | 108.8680 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.1186ms | 12.5505ms | 79.6782 Ops/s | 80.0976 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.1025ms | 2.6886ms | 371.9453 Ops/s | 380.9892 Ops/s | |
| test_redq_deprec_speed[True-backward] | 5.0436ms | 4.5752ms | 218.5682 Ops/s | 219.3755 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.2501ms | 2.6681ms | 374.8049 Ops/s | 380.7223 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-backward] | 5.0470ms | 4.5863ms | 218.0408 Ops/s | 219.1180 Ops/s | |
| test_td3_speed[False-None] | 8.0836ms | 7.9636ms | 125.5721 Ops/s | 123.7984 Ops/s | |
| test_td3_speed[False-backward] | 11.1190ms | 10.7010ms | 93.4492 Ops/s | 93.6970 Ops/s | |
| test_td3_speed[True-None] | 1.6783ms | 1.6405ms | 609.5617 Ops/s | 591.3865 Ops/s | |
| test_td3_speed[True-backward] | 3.4281ms | 3.2922ms | 303.7472 Ops/s | 292.9892 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 51.4340ms | 26.4350ms | 37.8286 Ops/s | 38.5355 Ops/s | |
| test_td3_speed[reduce-overhead-backward] | 1.3843ms | 1.3151ms | 760.3832 Ops/s | 665.6663 Ops/s | |
| test_cql_speed[False-None] | 17.2662ms | 16.7920ms | 59.5522 Ops/s | 58.9448 Ops/s | |
| test_cql_speed[False-backward] | 22.9723ms | 22.1740ms | 45.0978 Ops/s | 43.6496 Ops/s | |
| test_cql_speed[True-None] | 3.5945ms | 3.4600ms | 289.0213 Ops/s | 300.3936 Ops/s | |
| test_cql_speed[True-backward] | 6.3506ms | 5.4945ms | 181.9990 Ops/s | 175.0162 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 0.6315s | 16.5041ms | 60.5911 Ops/s | 75.5630 Ops/s | |
| test_cql_speed[reduce-overhead-backward] | 2.0762ms | 1.9870ms | 503.2770 Ops/s | 558.0110 Ops/s | |
| test_a2c_speed[False-None] | 3.3826ms | 3.1887ms | 313.6094 Ops/s | 311.4874 Ops/s | |
| test_a2c_speed[False-backward] | 7.1452ms | 6.3925ms | 156.4321 Ops/s | 159.8159 Ops/s | |
| test_a2c_speed[True-None] | 1.5547ms | 1.3292ms | 752.3524 Ops/s | 738.2721 Ops/s | |
| test_a2c_speed[True-backward] | 3.2023ms | 3.0867ms | 323.9700 Ops/s | 310.2989 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 16.0910ms | 9.1298ms | 109.5316 Ops/s | 108.3556 Ops/s | |
| test_a2c_speed[reduce-overhead-backward] | 1.6932ms | 1.6155ms | 619.0043 Ops/s | 611.6893 Ops/s | |
| test_ppo_speed[False-None] | 3.7955ms | 3.7131ms | 269.3167 Ops/s | 253.1147 Ops/s | |
| test_ppo_speed[False-backward] | 7.5188ms | 7.1270ms | 140.3114 Ops/s | 136.6406 Ops/s | |
| test_ppo_speed[True-None] | 1.5494ms | 1.4106ms | 708.8983 Ops/s | 696.0407 Ops/s | |
| test_ppo_speed[True-backward] | 3.2533ms | 3.0741ms | 325.3008 Ops/s | 304.9969 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.0697ms | 0.9717ms | 1.0291 KOps/s | 1.0294 KOps/s | |
| test_ppo_speed[reduce-overhead-backward] | 1.5351ms | 1.4293ms | 699.6342 Ops/s | 687.3656 Ops/s | |
| test_reinforce_speed[False-None] | 2.6945ms | 2.2629ms | 441.9125 Ops/s | 432.7071 Ops/s | |
| test_reinforce_speed[False-backward] | 3.5194ms | 3.2769ms | 305.1695 Ops/s | 297.6386 Ops/s | |
| test_reinforce_speed[True-None] | 1.7439ms | 1.2857ms | 777.7862 Ops/s | 757.3494 Ops/s | |
| test_reinforce_speed[True-backward] | 3.0230ms | 2.9325ms | 341.0057 Ops/s | 339.6085 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 19.2423ms | 10.5026ms | 95.2141 Ops/s | 94.4777 Ops/s | |
| test_reinforce_speed[reduce-overhead-backward] | 1.5766ms | 1.4978ms | 667.6371 Ops/s | 649.5968 Ops/s | |
| test_iql_speed[False-None] | 9.5898ms | 9.1671ms | 109.0861 Ops/s | 105.0309 Ops/s | |
| test_iql_speed[False-backward] | 13.1371ms | 12.7923ms | 78.1719 Ops/s | 75.1519 Ops/s | |
| test_iql_speed[True-None] | 2.6374ms | 2.2110ms | 452.2897 Ops/s | 443.5266 Ops/s | |
| test_iql_speed[True-backward] | 5.0182ms | 4.8123ms | 207.8026 Ops/s | 205.3569 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 0.5670s | 13.1488ms | 76.0528 Ops/s | 90.4850 Ops/s | |
| test_iql_speed[reduce-overhead-backward] | 2.0521ms | 1.9676ms | 508.2240 Ops/s | 475.9005 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.8008ms | 6.2408ms | 160.2369 Ops/s | 158.3157 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6708ms | 0.3452ms | 2.8970 KOps/s | 3.6211 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6566ms | 0.3227ms | 3.0985 KOps/s | 3.9150 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.2604ms | 5.9558ms | 167.9042 Ops/s | 166.1615 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0700ms | 0.2911ms | 3.4348 KOps/s | 3.7335 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5693ms | 0.3172ms | 3.1531 KOps/s | 4.0640 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5869ms | 1.3209ms | 757.0733 Ops/s | 768.4616 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6154ms | 1.2177ms | 821.2404 Ops/s | 809.7507 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.5361ms | 6.2954ms | 158.8462 Ops/s | 160.9793 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2760ms | 0.4350ms | 2.2987 KOps/s | 1.9527 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6370ms | 0.3931ms | 2.5437 KOps/s | 2.2220 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2119ms | 6.0186ms | 166.1527 Ops/s | 163.5525 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8185ms | 0.3892ms | 2.5691 KOps/s | 2.8243 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5794ms | 0.3987ms | 2.5082 KOps/s | 2.9500 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 10.0019ms | 6.0188ms | 166.1473 Ops/s | 165.8619 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0056ms | 0.3286ms | 3.0430 KOps/s | 3.1391 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7120ms | 0.3110ms | 3.2152 KOps/s | 3.4566 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.5530ms | 6.2769ms | 159.3135 Ops/s | 159.9408 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0303ms | 0.4444ms | 2.2503 KOps/s | 2.3004 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7449ms | 0.4559ms | 2.1933 KOps/s | 2.4638 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 7.0985ms | 5.5665ms | 179.6459 Ops/s | 175.6411 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 6.4206ms | 2.0380ms | 490.6881 Ops/s | 423.1499 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 8.8084ms | 1.2304ms | 812.7237 Ops/s | 893.7075 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 8.0905ms | 5.6387ms | 177.3462 Ops/s | 174.8450 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.8720ms | 2.0122ms | 496.9588 Ops/s | 451.1241 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.7924ms | 1.2387ms | 807.2841 Ops/s | 805.4285 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.5513s | 16.6774ms | 59.9614 Ops/s | 29.2625 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.4933ms | 2.2075ms | 452.9960 Ops/s | 456.9185 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.2414ms | 1.3520ms | 739.6558 Ops/s | 854.7364 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 14.0392ms | 13.6774ms | 73.1130 Ops/s | 72.5608 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 18.5226ms | 17.0210ms | 58.7508 Ops/s | 59.8228 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 18.1217ms | 17.8956ms | 55.8795 Ops/s | 54.2292 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 18.7276ms | 17.3620ms | 57.5972 Ops/s | 59.9435 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 18.9238ms | 18.4604ms | 54.1700 Ops/s | 54.3850 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 20.6504ms | 18.8646ms | 53.0093 Ops/s | 54.6387 Ops/s |
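
The Change column came through empty in both tables. As a rough reading aid, here is a minimal sketch of how the relative change between `Ops` and `Ops on Repo HEAD` could be computed. The helper names (`parse_ops`, `relative_change`) are illustrative and are not part of the benchmark tooling; the sample rows are copied from the first table.

```python
# Illustrative helpers (assumed names, not part of the benchmark tooling).
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '33.7871 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD (positive = faster)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base


# (name, Ops, Ops on Repo HEAD), copied from the first table above.
rows = [
    ("test_simple", "1.8912 Ops/s", "1.9392 Ops/s"),
    ("test_dqn_speed[True-backward]", "1.0296 KOps/s", "709.5874 Ops/s"),
    (
        "test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-"
        "SamplerWithoutReplacement-400]",
        "50.8890 Ops/s",
        "430.6518 Ops/s",
    ),
]

for name, ops, head in rows:
    print(f"{name}: {relative_change(ops, head):+.1%}")
```

Note that the units differ between columns in some rows (KOps/s versus Ops/s), so the cells have to be normalised before comparing. Rows where Max is far above Mean may be dominated by a single slow iteration rather than a real regression.
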
    policy,
    frames_per_batch=args.steps_per_batch,
    total_frames=1_000_000,
    local_weights_updater=HF2vLLMLocalWeightUpdater(
I'm curious to see how this would look when train_model and inference_model are sharded 😛
Yeah there is still some work to be done!
IIUC, in your implementation you call `full_tensor()` and then send the result to the vLLM weights, right?
    generate=False,
    return_log_probs=True,
    )
    env.append_transform(
With the `.append_transform` API, is it possible to run `ShapedCorrectnessReward` and `KLRewardTransform` in parallel?
not currently but we can think about it.
The difficulty is that in some cases you may have a transform that requires another one to do its thing first.
We could imagine
env.append_transform(MyTransform0(async=True))
env.append_transform(MyTransform1(async=True)) # does not require MyTransform0
env.append_transform(MyTransform2(blocking=True)) # blocking tells env that you need to have completed the other async before running this one
wdyt?
With just reward model and ref_model I think this API would work
But I don't think this API would be general enough (or at least it might be tricky) for the case where there's a more complex graph of dependencies between transforms.
n00b qn: with this API, who is responsible for consolidating the result tds from MyTransform0 and MyTransform1? Would all the communications involved be wrapped in a transform?
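
To make the async/blocking idea above concrete, here is a minimal, library-free sketch of one possible answer to the consolidation question: the runner itself merges the outputs of the non-blocking steps when it reaches a blocking one. Everything here is hypothetical (`Step`, `run_steps`, and the toy reward functions are not TorchRL APIs), and plain dicts stand in for tensordicts. Since `async` is a reserved keyword in Python, the sketch only exposes a `blocking` flag and treats every other step as asynchronous.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List

Data = Dict[str, float]


@dataclass
class Step:
    """Hypothetical wrapper: a function returning new keys, plus a scheduling flag."""

    fn: Callable[[Data], Data]
    blocking: bool = False  # True: wait for all pending async steps first


def run_steps(data: Data, steps: List[Step]) -> Data:
    """Run steps in order; non-blocking steps run concurrently on a snapshot."""
    data = dict(data)
    pending = []
    with ThreadPoolExecutor() as pool:
        for step in steps:
            if step.blocking:
                # Barrier: the runner consolidates every outstanding async result.
                for fut in pending:
                    data.update(fut.result())
                pending.clear()
                data.update(step.fn(data))
            else:
                # Each async step gets its own copy, so siblings cannot interfere.
                pending.append(pool.submit(step.fn, dict(data)))
        # Collect any async steps that were never followed by a barrier.
        for fut in pending:
            data.update(fut.result())
    return data


# Toy stand-ins for a correctness reward, a KL penalty, and a step needing both.
def correctness_reward(d: Data) -> Data:
    return {"reward_correct": 1.0 if d["answer"] == d["target"] else 0.0}


def kl_penalty(d: Data) -> Data:
    return {"reward_kl": -0.1 * (d["log_prob"] - d["ref_log_prob"])}


def total_reward(d: Data) -> Data:
    return {"reward": d["reward_correct"] + d["reward_kl"]}


out = run_steps(
    {"answer": 4.0, "target": 4.0, "log_prob": -1.0, "ref_log_prob": -1.5},
    [Step(correctness_reward), Step(kl_penalty), Step(total_reward, blocking=True)],
)
print(out)
```

A single barrier flag covers the reward model plus reference model case, but as noted above it cannot express an arbitrary dependency graph: a step that needs only one of several pending results still waits for all of them.
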
May come back to this at a later stage, closing for now.
Stack from ghstack (oldest at bottom):