Experiment Sets
BreakoutNoFrameskip-v4 Experiments


Description
BreakoutNoFrameskip-v4 experiments. Agents are evaluated on separate environments every 250k timesteps in parallel (see code for details) and run for 5M timesteps (roughly 23.15 hours of experience).
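The "23.15 hrs of experience" figure can be reproduced with a short calculation. The sketch below assumes one timestep corresponds to one emulator frame at 60 frames per second. That rate is an assumption chosen because it matches the stated figure, and it is not given in the report. The function and variable names are illustrative only.

```python
# Assumption: one environment timestep is one emulator frame at 60 frames per second.
# The report does not state this rate; it is chosen because it reproduces the 23.15 hr figure.
FRAMES_PER_SECOND = 60


def timesteps_to_hours(timesteps: int, fps: int = FRAMES_PER_SECOND) -> float:
    """Convert a timestep count into hours of simulated game time."""
    return timesteps / fps / 3600


TOTAL_TIMESTEPS = 5_000_000
EVAL_INTERVAL = 250_000

total_hours = timesteps_to_hours(TOTAL_TIMESTEPS)
eval_interval_hours = timesteps_to_hours(EVAL_INTERVAL)
# Number of 250k-timestep intervals in the run (whether step 0 is also evaluated is not stated).
num_intervals = TOTAL_TIMESTEPS // EVAL_INTERVAL

print(f"{total_hours:.2f} hours of experience in total")
print(f"{eval_interval_hours:.2f} hours of experience between evaluations")
print(f"{num_intervals} evaluation intervals")
```

Under this assumption, an evaluation happens a little more often than once per hour of simulated play.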
Executive Summary
| | Experiment | AverageReturnPerkWh | AverageReturn | AsymptoticReturn | total_power | exp_len_hours | cpu_hours | gpu_hours | estimated_carbon_impact_kg |
|---|---|---|---|---|---|---|---|---|---|
| 1 | PPO2 (stable_baselines, default settings) | 496.773 +/- 23.83 | 105.210 +/- 4.54 | 239.472 +/- 20.04 | 0.212 +/- 0.00 | 1.634 +/- 0.01 | 6.263 +/- 0.10 | 0.211 +/- 0.00 | 0.071 +/- 0.00 |
| 4 | A2C+Vtrace (cule, default settings) | 172.988 +/- 16.48 | 6.288 +/- 0.53 | 15.440 +/- 1.15 | 0.036 +/- 0.00 | 0.232 +/- 0.00 | 0.612 +/- 0.00 | 0.176 +/- 0.00 | 0.013 +/- 0.00 |
| 2 | A2C (stable_baselines, default settings) | 90.269 +/- 7.23 | 14.648 +/- 1.17 | 56.816 +/- 4.72 | 0.162 +/- 0.00 | 1.200 +/- 0.01 | 6.458 +/- 0.02 | 0.105 +/- 0.00 | 0.057 +/- 0.00 |
| 3 | DQN (stable_baselines, default settings) | 7.024 +/- 0.77 | 15.264 +/- 1.30 | 30.648 +/- 3.46 | 2.195 +/- 0.06 | 19.233 +/- 0.36 | 61.496 +/- 0.71 | 1.870 +/- 0.05 | 0.727 +/- 0.02 |
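As a rough sanity check on the table above, AverageReturnPerkWh appears to be close to AverageReturn divided by total_power. The sketch below assumes total_power is reported in kWh and that the metric is a per-run ratio averaged over runs. Neither is stated in the report. The small gaps it prints would be consistent with averaging per-run ratios rather than dividing the two means. The values are copied from the table, and the helper name is illustrative.

```python
# (AverageReturn, total_power, reported AverageReturnPerkWh), means copied from the table.
# Assumption: total_power is in kWh; the report does not state the unit explicitly.
results = {
    "PPO2": (105.210, 0.212, 496.773),
    "A2C+Vtrace": (6.288, 0.036, 172.988),
    "A2C": (14.648, 0.162, 90.269),
    "DQN": (15.264, 2.195, 7.024),
}


def return_per_kwh(average_return: float, total_power_kwh: float) -> float:
    """Ratio of the two reported means; only an approximation of the per-run metric."""
    return average_return / total_power_kwh


relative_gaps = {}
for name, (avg_return, power, reported) in results.items():
    estimate = return_per_kwh(avg_return, power)
    relative_gaps[name] = abs(estimate - reported) / reported
    print(f"{name}: estimated {estimate:.3f}, reported {reported:.3f}, "
          f"relative gap {relative_gaps[name]:.2%}")

# Experiments ordered by the reported efficiency metric, highest first.
ranking = sorted(results, key=lambda name: results[name][2], reverse=True)
print("Ranking by AverageReturnPerkWh:", ranking)
```

The ranking matches the row order of the table, which suggests the rows are sorted by AverageReturnPerkWh rather than by experiment index.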
Graphs