Dream sampler. Parallel computation #266

Open
baramousa opened this issue Apr 9, 2021 · 13 comments

@baramousa

Hi, this is not really an issue. I just want to know which version of DREAM this package implements: is it the basic DREAM, DREAM(ZS), or MT-DREAM(ZS)? I am asking because I want to know whether parallel computation is possible. As far as I know, basic DREAM can only be run sequentially, while the others can be run in parallel.

Thanks @thouska

@thouska
Owner

thouska commented Apr 9, 2021

Hi @baramousa
The dream algorithm version implemented in spotpy corresponds to Algorithm 6 as presented in this publication:
https://www.sciencedirect.com/science/article/pii/S1364815215300396?casa_token=gCl00Qy8ymsAAAAA:BcW90XS8GyI2Rwi7sJnunxAUOAhfQMz9eEHTWSbjgvPflnUxF5DI7cm3qq1OzXro01_bdf3Pyz4
So it can be run in parallel. If you want, you can check out this example, which is set up to run n=4 chains.
https://github.com/thouska/spotpy/blob/master/spotpy/examples/tutorial_dream_hymod.py
Changing the parallel keyword to 'mpi' and handing this setting to the sampler makes spotpy start each chain's runs on an individual CPU core. Some details about this are given here:
https://github.com/thouska/spotpy/blob/master/spotpy/examples/tutorial_parallel_computing_hymod.py
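For reference, a minimal sketch of how the parallel keyword is handed to the sampler (condensed from the linked tutorials, not a verbatim copy; the setup import and the parameter values are illustrative):

    import spotpy
    from spotpy.likelihoods import gaussianLikelihoodMeasErrorOut as GausianLike
    from spotpy.examples.spot_setup_hymod_python import spot_setup

    if __name__ == "__main__":
        # 'seq' runs the chains sequentially, 'mpi' distributes them via MPI,
        # 'mpc' uses multiprocessing on a local machine
        parallel = 'mpi'
        setup = spot_setup(GausianLike)
        sampler = spotpy.algorithms.dream(setup, dbname='DREAM_hymod',
                                          dbformat='csv', parallel=parallel)
        # four chains as in the tutorial; the number of repetitions is arbitrary here
        r_hat = sampler.sample(repetitions=5000, nChains=4)

Under 'mpi' the script is then launched through mpirun/mpiexec instead of plain python; under 'mpc' and 'seq' it can be started directly.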

@baramousa
Author

Thanks for the quick reply. Just another question: let's say I want to implement the hymod example below.
https://github.com/thouska/spotpy/blob/master/spotpy/examples/tutorial_dream_hymod.py

If I want to run it in parallel, I need to set parallel to "mpi" on a Linux machine and to "mpc" on a Windows machine. Am I getting it right?

@thouska
Owner

thouska commented Apr 9, 2021

Yes, that's correct.

@baramousa
Author

OK, thanks a lot. I will try it and give my feedback.

@baramousa
Author

baramousa commented Apr 12, 2021

Hi @thouska, I tried to run your hymod_dream example in parallel. It seems to run, but then it fails with an error. First, this is the code:
    import numpy as np
    import spotpy
    import matplotlib.pyplot as plt
    from spotpy.likelihoods import gaussianLikelihoodMeasErrorOut as GausianLike
    from spotpy.analyser import plot_parameter_trace
    from spotpy.analyser import plot_posterior_parameter_histogram
    import sys

    if __name__ == "__main__":
        parallel = 'mpc'
        from spotpy.examples.spot_setup_hymod_unix import spot_setup
        spot_setup = spot_setup(GausianLike)
        sampler = spotpy.algorithms.dream(spot_setup, dbname='DREAM_hymod', parallel=parallel, dbformat='csv')
        rep = 5000
        nChains = 4
        convergence_limit = 1.2
        nCr = 3
        eps = 10e-6
        runs_after_convergence = 100
        acceptance_test_option = 6
        r_hat = sampler.sample(rep, nChains, nCr, eps, convergence_limit)
        results = spotpy.analyser.load_csv_results('DREAM_hymod')

Then I get this error:

Convergence rates =1.5744 4.8378 1.4476 1.3106 1.5791
1003 of 5000, maximal objective function=-8270.54, time remaining: 00:04:34
Acceptance rates [%] =15.08 13.89 11.51 25.
Convergence rates =1.5756 5.1658 1.4241 1.3066 1.5492
1021 of 5000, maximal objective function=-8270.54, time remaining: 00:04:36
Acceptance rates [%] =15.12 13.95 11.63 25.19
Convergence rates =1.5518 5.793 1.4008 1.298 1.5164

IndexError Traceback (most recent call last)
in
11 runs_after_convergence = 100
12 acceptance_test_option = 6
---> 13 r_hat = sampler.sample(rep, nChains, nCr, eps, convergence_limit)
14 results = spotpy.analyser.load_csv_results('DREAM_hymod')
15

c:\users\albaraalmawazreh\appdata\local\programs\python\python37\lib\site-packages\spotpy\algorithms\dream.py in sample(self, repetitions, nChains, nCr, eps, convergence_limit, runs_after_convergence, acceptance_test_option)
274 while self.iter < self.repetitions:
275 param_generator = ((curChain,self.get_new_proposal_vector(curChain,newN,nrN)) for curChain in range(int(self.nChains)))
--> 276 for cChain,par,sim in self.repeat(param_generator):
277 pCr = np.random.randint(0,nCr)
278 ids=[]

c:\users\albaraalmawazreh\appdata\local\programs\python\python37\lib\site-packages\spotpy\parallel\mproc.py in __call__(self, jobs)
52 def __call__(self, jobs):
53 results = self.pool.imap(self.f, jobs)
---> 54 for i in results:
55 yield i

c:\users\albaraalmawazreh\appdata\local\programs\python\python37\lib\site-packages\multiprocess\pool.py in next(self, timeout)
746 if success:
747 return value
--> 748 raise value
749
750 __next__ = next # XXX

IndexError: list index out of range

@thouska
Owner

thouska commented Apr 13, 2021

Hi @baramousa
thank you for your message and the detailed error description. I can confirm an error there and will look into this together with @philippkraft. Sorry for any inconvenience this may cause you. I will keep you posted about the progress.

@thouska
Owner

thouska commented Apr 16, 2021

Hi @baramousa,
it turns out to be quite a task to solve this issue. We will work on it at #268, also on local machines. It might take a while and I cannot guarantee final success at the moment. Meanwhile, would 'mpi' parallelization be a solution for you? This should work fine :)

@baramousa
Author

Hi @thouska, sorry for the late reply. I downloaded Anaconda, which ships with Python 3.8 or newer, and tried the parallelization of dream on Windows; it seems to work. However, my issue now is that my model writes its input and output data as text files, and for the parallelization to work effectively, each chain should have its own directory where it writes and reads those files. My question is whether there is a way to extract the id/number of the currently running chain, so I can use it in my model to create a directory for each chain.
Since I am also trying to use SCE-UA, would you suggest a way to do the same with it?
Thanks in advance :)

@thouska
Owner

thouska commented May 26, 2021

Hi @baramousa,
OK, I have not tested with the newest Anaconda yet; it would be great if that solves the problem!
Regarding the parallel writing/reading, you are perfectly right: one needs to make sure this is done individually for each core. I wrote a short example for that, which you can find here.

Basically under 'mpi' you can access the cpu_id this way:

cpu_id = str(int(os.environ['OMPI_COMM_WORLD_RANK']))

Under 'mpc' it is done like this:

cpu_id = str(os.getpid())

I would recommend working with these instead of using the chain_id (in case of dream) or complex_id (in case of sce-ua), as the approach above works independent of the choice of algorithm in spotpy.
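For illustration, a minimal sketch of how such a per-core working directory could be resolved (the helper name, the 'model_runs' base folder and the 'core_' prefix are made up for this example; the cpu_id lines follow the snippets above):

    import os

    def get_run_directory(base='model_runs'):
        # Hypothetical helper: return a working directory that is unique per
        # parallel worker and create it on first use.
        if 'OMPI_COMM_WORLD_RANK' in os.environ:   # parallel='mpi'
            cpu_id = str(int(os.environ['OMPI_COMM_WORLD_RANK']))
        else:                                      # parallel='mpc' (or 'seq')
            cpu_id = str(os.getpid())
        workdir = os.path.join(base, 'core_' + cpu_id)
        os.makedirs(workdir, exist_ok=True)
        return workdir

Calling something like this at the start of the setup's simulation() method and writing/reading the model's text files inside the returned folder keeps the parallel workers from overwriting each other's files.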

@baramousa
Author

baramousa commented May 28, 2021

Hi @thouska,
thanks for your answer, now it works: input and output files are being written and read in individual directories corresponding to the core name. However, the csv summary file, which should contain the results of all simulations, now only holds the very last simulations of each chain for dream, and no data at all for sceua. The simulations run and the summary is shown in the console, but the csv files are not written properly.
Can you tell where the problem is?

I am guessing it has to do with this part of _algorithm.py:

    def save(self, like, randompar, simulations, chains=1):
        # Initialize the database if no run was performed so far
        self._init_database(like, randompar, simulations)
        # Test if like and the save threshold are float/list and compare accordingly
        if self.__is_list_type(like) and self.__is_list_type(self.save_threshold):
            if all(i > j for i, j in zip(like, self.save_threshold)):  # Compares list/list
                self.datawriter.save(like, randompar, simulations, chains=chains)
        if (not self.__is_list_type(like)) and (not self.__is_list_type(self.save_threshold)):
            if like > self.save_threshold:  # Compares float/float
                self.datawriter.save(like, randompar, simulations, chains=chains)
        if self.__is_list_type(like) and (not self.__is_list_type(self.save_threshold)):
            if like[0] > self.save_threshold:  # Compares list/float
                self.datawriter.save(like, randompar, simulations, chains=chains)
        if (not self.__is_list_type(like)) and self.__is_list_type(self.save_threshold):  # Compares float/list
            if (like > self.save_threshold).all:
thouska added a commit that referenced this issue May 31, 2021
pathos multiprocessing imap was resulting in a broken spotpy database. Switching to map solves the issue
@thouska
Owner

thouska commented May 31, 2021

Hi @baramousa,
thank you for the update! And indeed, the broken file was the point where I got stuck at #268. To be honest, I did not fully understand why this did not work, as the results are internally perfectly fine, but were not in the final output file.

However, I looked into this again, played around a lot and can finally come up with a fix (see commit above).
Basically, I changed line 53 in mproc.py

from:

results = self.pool.imap(self.f, jobs)

into:

results = self.pool.map(self.f, jobs)
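In context, the patched method then reads roughly like this (reconstructed from the traceback earlier in this thread; condensed, so details may differ):

    # spotpy/parallel/mproc.py (condensed): the only functional change is
    # imap -> map, so the pool finishes all jobs before the results are
    # yielded to the database writer.
    def __call__(self, jobs):
        results = self.pool.map(self.f, jobs)
        for i in results:
            yield i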

Now it works fine for me, at least in 90% of the cases. From time to time the header is broken, but the rest should be fine. @baramousa: could you test it for your case and give your feedback here?

@baramousa
Author

baramousa commented Jun 1, 2021

Hi @thouska,
indeed, when I change the dbformat to 'ram', the results seem fine. I have now tried your solution: it worked with the sceua algorithm, but dream still has the same problem, only the last runs are saved in the csv file.
On the other hand, mpi on a Linux machine seems to work.

@thouska
Owner

thouska commented Jun 15, 2021

Hi @baramousa
sorry for the late response, but at least I can come up with good news, I hope :)
I worked on the issue in #268. You were right: somehow only the dream algorithm did not work properly under the pathos multiprocessing setting. This was due to too many pools being generated for the Markov chains. I tried to fix it, but in the end I had the feeling that this is a problem in the pathos package. So I switched the package to joblib. With that, the parallelization works with dream on my computer. Could you test it too? I changed tutorial_dream_hymod.py so that it directly uses multiprocessing.
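Purely as an illustration of the idea (not the exact spotpy implementation), mapping a batch of jobs onto worker processes with joblib looks like this:

    from joblib import Parallel, delayed

    def run_jobs(f, jobs, n_workers=4):
        # Evaluate every job in its own worker process and collect all results
        # before returning, similar to pool.map.
        return Parallel(n_jobs=n_workers)(delayed(f)(job) for job in jobs)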
