[question] EvalCallback using MPI #1069
Hello,
yes, that is correct. I would recommend you to try to switch to SB3. Splitting the evaluation across the workers is possible but non-trivial and would require you to define a custom callback (cf. doc).
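For reference, a minimal sketch of such a custom callback, assuming stable-baselines >= 2.10 (where `BaseCallback` and `evaluate_policy` are available); the class name and hyper-parameters below are hypothetical, not from this thread:

```python
from stable_baselines.common.callbacks import BaseCallback
from stable_baselines.common.evaluation import evaluate_policy


class CustomEvalCallback(BaseCallback):
    """Evaluate the current policy every `eval_freq` calls to `_on_step`."""

    def __init__(self, eval_env, eval_freq=10000, n_eval_episodes=5, verbose=0):
        super(CustomEvalCallback, self).__init__(verbose)
        self.eval_env = eval_env
        self.eval_freq = eval_freq
        self.n_eval_episodes = n_eval_episodes

    def _on_step(self):
        if self.n_calls % self.eval_freq == 0:
            mean_reward, std_reward = evaluate_policy(
                self.model, self.eval_env, n_eval_episodes=self.n_eval_episodes
            )
            if self.verbose > 0:
                print("Eval mean reward: {:.2f} +/- {:.2f}".format(mean_reward, std_reward))
        return True  # returning False would stop training
```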
Hi @araffin - thanks for the reply! Does VecEnv parallelise the gradient computation, or just the env part? https://twitter.com/hardmaru/status/1260852988475658242 I've got PPO + MPI working really well on a multicore machine with a custom callback to handle the parallelisation of the evaluation. I'm also hesitant to switch to SB3 as it doesn't support Tensorflow, which is a shame. Thanks for your help!
Just the env part.
SB3 does not support MPI by default, but we would be happy to have an implementation of PPO with MPI in our contrib repo ;) See Stable-Baselines-Team/stable-baselines3-contrib#11
The decision to move to PyTorch and drop MPI (for the default install) was not arbitrary, see #733 and #366 ;)
@araffin has anything changed with regard to SB3 supporting MPI, or is it still not supported?
It is not (Stable-Baselines-Team/stable-baselines3-contrib#11, Stable-Baselines-Team/stable-baselines3-contrib#45), but contributions are welcome ;) However, with SB3 you can use multiple envs for evaluation, which provides a great speed-up.
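For illustration, a sketch of that suggestion in SB3; the env id, number of envs, frequencies and paths below are placeholder values, not from the thread:

```python
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.env_util import make_vec_env

# Several evaluation envs run in parallel, so each evaluation pass is faster.
eval_env = make_vec_env("CartPole-v1", n_envs=4)

eval_callback = EvalCallback(
    eval_env,
    n_eval_episodes=20,           # spread over the 4 evaluation envs
    eval_freq=10_000,             # counted in steps of each training env
    best_model_save_path="./logs/",
)

train_env = make_vec_env("CartPole-v1", n_envs=8)
model = PPO("MlpPolicy", train_env)
model.learn(total_timesteps=100_000, callback=eval_callback)
```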
When running a training loop using MPI, the `EvalCallback` doesn't seem to make use of the parallelisation. For example, in this `train` function: https://github.com/hardmaru/slimevolleygym/blob/master/training_scripts/train_ppo_mpi.py
it seems that the `EvalCallback` will be called once per instance, after a combined total of `eval_freq` timesteps across all of the instances. This appears to be problematic if you want to use the callback to decide whether to save out a new best model, as there will be multiple attempts at calculating the average reward, and therefore the `best_model` file will potentially be overwritten several times on the same update. The best score will also be naturally inflated the more instances you have, since some of the reward calculations will come out slightly higher than average due to favourable random fluctuations. Is this correct?
If so, is there a way to instead split the `n_eval_episodes` across the workers and aggregate them into a single score?
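One possible way to do that split-and-aggregate, sketched with `mpi4py` on top of stable-baselines' `BaseCallback` and `evaluate_policy`; the class name, save path, and the assumption that all workers reach the evaluation step in lockstep are assumptions, not something confirmed in this thread:

```python
from mpi4py import MPI

from stable_baselines.common.callbacks import BaseCallback
from stable_baselines.common.evaluation import evaluate_policy


class MPIEvalCallback(BaseCallback):
    """Split `n_eval_episodes` over the MPI workers and aggregate one score."""

    def __init__(self, eval_env, eval_freq=10000, n_eval_episodes=100,
                 best_model_save_path="best_model", verbose=0):
        super(MPIEvalCallback, self).__init__(verbose)
        self.eval_env = eval_env
        self.eval_freq = eval_freq
        self.n_eval_episodes = n_eval_episodes
        self.best_model_save_path = best_model_save_path
        self.best_mean_reward = -float("inf")
        self.comm = MPI.COMM_WORLD

    def _on_step(self):
        # Assumes all workers step in lockstep (as with PPO1 + MPI), so every
        # rank reaches the collective calls below at the same time.
        if self.n_calls % self.eval_freq == 0:
            # Each worker only runs its share of the episodes.
            n_local = max(1, self.n_eval_episodes // self.comm.Get_size())
            episode_rewards, _ = evaluate_policy(
                self.model, self.eval_env,
                n_eval_episodes=n_local, return_episode_rewards=True,
            )
            # Aggregate into a single global mean reward across all workers.
            total_reward = self.comm.allreduce(sum(episode_rewards), op=MPI.SUM)
            total_episodes = self.comm.allreduce(len(episode_rewards), op=MPI.SUM)
            mean_reward = total_reward / total_episodes
            # Only rank 0 saves, so `best_model` is written at most once per evaluation.
            if self.comm.Get_rank() == 0 and mean_reward > self.best_mean_reward:
                self.best_mean_reward = mean_reward
                self.model.save(self.best_model_save_path)
        return True
```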