Bug: sceua gets stuck with MPI after burn-in #226
Comments
After some bug tracking, I think the problem is in this line: Line 200 in 269a5a7
where But I've run out of ideas at this point.
Hi Sebastian, sorry for the long silence (vacation period). We "fixed" some SCE-UA bugs with the last version; I have to check the changes together with @thouska, who is still out of office. Can you check whether you have the same problems with another sampler (e.g. ROPE or LHS)? Just to make sure it is in the SCE-UA implementation (which is tricky) and not a general
@philippkraft : Thanks for the reply. I checked the FAST routine, which worked as expected.
Is there anything new on this topic?
Hi Sebastian,
OK, now it should be fixed. Somehow the new design of the _RunStatistic class in _algorithm.py, introduced in spotpy version 1.5.0, was not picklable under mpi4py. This caused the hang you described after the burn-in phase. I removed the use of the _RunStatistic class while spotpy is running on the CPU slaves. This fixes the problem (at least in my MPI environment). The change might result in slightly longer runtimes at the end of the sampling (will be fixed), but for now it is at least running again.
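For anyone hitting something similar: mpi4py's lowercase communication methods (send, recv, bcast, gather) serialize Python objects with pickle, so an object that cannot be pickled cannot be passed between ranks. Below is a minimal sketch of how one might check this without MPI. The RunStats class is a hypothetical stand-in, not spotpy's _RunStatistic, and the lambda attribute is only an assumed example of something pickle rejects; the thread does not say what exactly made the real class unpicklable.

```python
import pickle


class RunStats:
    """Hypothetical stand-in for a statistics object (NOT spotpy's _RunStatistic)."""

    def __init__(self):
        self.best = float("-inf")
        # Assumed culprit for illustration: lambdas cannot be pickled.
        self.is_better = lambda new, old: new > old

    def as_dict(self):
        # Plain data only, so this can safely be sent between processes.
        return {"best": self.best}


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    stats = RunStats()
    print("object picklable:", is_picklable(stats))
    print("plain dict picklable:", is_picklable(stats.as_dict()))
```

Here the check on the object itself fails, while the check on the plain dictionary succeeds. Sending only plain data (numbers, lists, dicts) from the worker ranks is one possible way around such a problem.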
PS: If you want to test this, the corresponding new version (1.5.3) of spotpy is available on PyPI.
I installed spotpy 1.5.4 and now I am getting the following error:
The submodule Line 16 in 0d55074
you should use this instead:
with this on the first line:
But after commenting out the |
Maybe you could move the unittests folder to a toplevel folder named tests, as mentioned in the exclude pattern, which is a common layout. Then you have to adapt the .travis.yml file. I don't think the unit tests need to be in the package when there is a separate example folder.
Moves tests on toplevel, partly removes jit from hymod_python.py #226
I had similar problems, but I just saw that @thouska updated it. I have not tested the newest version yet. :D I will do it now. :D
Many thanks @MuellerSeb for testing everything directly and reporting in such detail how to fix the new problems. As you recommended, I removed the unittest import, renamed the unittests folder to tests and moved the whole thing to the toplevel. I like the new structure and think this makes total sense.
Sorry for my rushed comment. I meant to say that I had not tested it yet. But now I have tested it and it is not working for me. Maybe it is a mistake in my model, but my MPI is working properly, as I tested it with Telemac2d. What could be the possible error? Anyway, @thouska, thank you very much for your help.
@hpsone : maybe you have to give some details on your problem to get an answer. |
@MuellerSeb Thank you so much. I am not quite sure what the error is. But I ran it using "mpc" instead of "mpi" and it worked. Anyway, I will try again, but it is probably due to my insufficient knowledge.
I guess this issue is solved; if not, feel free to reopen.
Hey there,
From spotpy 1.5.0 on, SCE optimization with MPI gets stuck after the burn-in phase. Here is a minimal example:
Running with
Gives the following output:
And from there on, nothing more happens. With parallel="seq" it takes about 5 seconds to finish.
Do you know what the problem could be?
I've got mpi4py 3.0.2 installed and I am using Python 3.6.8. With spotpy 1.4.6 everything is working. From 1.5.0 on, the above-mentioned behavior occurs.
Cheers,
Sebastian