-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Gibbs ensemble scripts #3903
Conversation
So far, this contains disentangling and readability work.
|
I think that the Gibbs ensemble implementation would be more helpful if (in each box) MD would be used in the step "MOVE_PART" to propose new states instead of a single MC move -- and then accept/reject the new state (after running MD) according to the metropolis criterion. This way also polymeric systems could be simulated. |
I think that the Gibbs ensemble implementation would be more helpful if (in each box) MD would be used in the step "MOVE_PART" to propose new states >instead of a single MC move -- and then accept/reject the new state (after running MD) according to the metropolis criterion. This way also polymeric systems >could be simulated.
I agree. For the tutorial, I’m going to leave the MC variant, though.
|
In general sampling the configurational space via MD steps yields "more physical" configurations (whatever that means), yet sometimes this is exactly what you want to avoid when you do MC (like e.g. if there is an energy barrier, you can do collective moves etc.). I.e. we should leave the interface as flexible as possible and let the user decide, together with a few typical samples/tutorials. I discussed with @RudolfWeeber that in a mid-termed project we could enhance ESPResSo's MC capabilities. |
@@ -53,7 +53,7 @@ | |||
parser.add_argument('-C', '--client-script', nargs=1) | |||
parser.add_argument('-n', '--number-of-particles', type=int, nargs=1) | |||
parser.add_argument('-l', '--box-length', type=float, nargs=1) | |||
parser.add_argument('-T', '--temperature', type=float, nargs=1) | |||
parser.add_argument('-T', '--temperature', type=float, nargs=1, required=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's kind of odd to declare an optional argument and make it required, why not use a positional argument?
A cool. I gave up un enums I last checked, because there were two competing enum packages…
|
it's kind of odd to declare an optional argument and make it required, why not use a positional argument?
I don’t use positional arguments, because the order can be mistaken easily. That’s what I also preach in my “managing simulation data” lecture.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general i think we want to use the multiprocessing module for the communication instead of plain sockets in the future. Besides my comments it looks okay to me.
def random_particle(self): | ||
id = np.random.choice(self._system.part[:].id) | ||
return self._system.part[id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not return np.random.choice(self._system.part)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc states it requires an 1d array, which ParticleList is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
class Client(): | ||
''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should consistently use """
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really want to do such things manually. I personally use """, whereas the previous author used '''. I don't really find it important. If autopep8 can be convinced to do this automatically, I'm fine with adding that rule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it just about consistency... just thought that python code in the samples is much more prominent to users than code in the python interface that is not read directly too often. Search and replace is also trivial in this case but if you think that code style is not that important in the samples you can leave it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, always use """triple double quotes""" around docstrings
samples/gibbs_ensemble/gibbs.py
Outdated
while msg[0] != MessageId.END: | ||
# receive command to execute next step | ||
msg = self._recv_data() | ||
if msg[0] == MessageId.END: | ||
break | ||
elif msg[0] == MessageId.MOVE_PART: | ||
self.move_particle() | ||
self._send_data(pickle.dumps( | ||
[MessageId.ENERGY, self.energy()])) | ||
elif msg[0] == MessageId.CHANGE_VOLUME: | ||
box_l = msg[1] | ||
self.change_volume_and_rescale_particles(box_l) | ||
self._send_data(pickle.dumps( | ||
[MessageId.ENERGY, self.energy()])) | ||
elif msg[0] == MessageId.CHANGE_VOLUME_REVERT: | ||
box_l = msg[1] | ||
self.change_volume_and_rescale_particles(box_l) | ||
elif msg[0] == MessageId.EXCHANGE_PART_ADD: | ||
n = msg[1] | ||
self.insert_particle(n) | ||
self._send_data(pickle.dumps( | ||
[MessageId.ENERGY, self.energy()])) | ||
elif msg[0] == MessageId.EXCHANGE_PART_REMOVE: | ||
self.remove_particle() | ||
self._send_data(pickle.dumps( | ||
[MessageId.ENERGY, self.energy()])) | ||
elif msg[0] == MessageId.MOVE_PART_REVERT: | ||
self.move_particle_revert() | ||
elif msg[0] == MessageId.EXCHANGE_PART_ADD_REVERT: | ||
self.remove_last_added_particle() | ||
elif msg[0] == MessageId.EXCHANGE_PART_REMOVE_REVERT: | ||
self.remove_particle_revert() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole block is probably something like a MessageHandler
that can react on messages of type MessageId
and is more or less just a dict from MessageId
to a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this. Tutors will have to do a bit more explaining when teching Python beginners, though.
parser = argparse.ArgumentParser() | ||
parser.add_argument('-s', '--seed', type=int) | ||
parser.add_argument('-H', '--host', type=str, default='localhost') | ||
parser.add_argument('-p', '--port', type=int, required=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, make it a positional argument
If you have so many positional arguments that you fear to mix them up, you are probably doing something wrong. Here, for examples it seems you only have one required argument which should also be positional just as with any tool you use in your console. |
I agree about using some higher-level communication infrastrucutre, but will not have time to do it in this iteration. Before this becoms an Espresso module rather than a sample/tutorial, move classes should be introduced as well. I.e. move classse should hold the detail of a specific move and the steps to apply and revert it. It might be discussed whether acceptance criteria should be member functions or free functions (which would be ugly, because one cannot overload w/r to argument type in Python). Also not in this iteration, though. |
This is as far as I want to take the refactoring of the Gibbs ensemble scripts. The task I actually intended to do was makeing a tutorial in which this is used to calculate critical temperatures... Let's see how that goes... |
boxes[0].box_l**3, boxes[1].box_l**3, | ||
(tock - tick) * 1000 / REPORT_INTERVAL)) | ||
if len(densities[0]): | ||
print("step %d, densities %.3f %.3f, volumes %.2f, %.2f %.f ms / move" % |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this snippet gets a lot more readable if you use format strings ala print(f'step {i})
. I also think that in general it's good practice to not directly print stuff to stdout but instead use the logging infrastructure of python.
import pickle | ||
import scipy.optimize | ||
import matplotlib.pyplot as plt | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docstrings are missing
# First index to consider | ||
start = np.where(np.array(indices) > WARMUP_LENGTH)[0][0] | ||
|
||
# Determin which is teh dilute phase from the last density recorded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use a spellchecker for the comments
parser = argparse.ArgumentParser() | ||
parser.add_argument('-s', '--seed', type=int) | ||
parser.add_argument('-H', '--host', type=str, default='localhost') | ||
parser.add_argument('-p', '--port', type=int, required=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for a single required argument it should be avoided to mark it as optional. If you provide proper help messages with the help
keyword argument to add_argument
the meaning of the positional argument is not ambiguous.
for a single required argument it should be avoided to mark it as optional. If you provide proper help messages with the help keyword argument to >add_argument the meaning of the positional argument is not ambiguous.
Can we please agree to disagree about positional arguments. I find their use in simulations scripts error-prone as I mentioned before.
|
> + def random_particle(self):
+ id = np.random.choice(self._system.part[:].id)
+ return self._system.part[id]
https://docs.python.org/3/library/random.html#random.choice
I am aware of that function but I don’t trust it in this case, because from the docs it is not apparent to me, wheter the implementation relies on [] (or __getitem__()) to retrieve individual elements from the popoluation. If it does, or will in the future, this would give wrong results, because system.part[] behaves somewhat like a dictionary.
So if len(system.part) is N that does not mean that all valid entries are accessible via [0] .. [N-1].
Safety fist.
|
I still cannot follow your reasoning. Correct me if I'm wrong: you are saying that the standard way of calling any command line script should not be done here because it is error prone? What makes our sample scripts so special? And how can you use a single positional argument incorrectly? |
so you are saying that our |
The most important issue with this is that it doesn't reproduce the scaling of |
Yes. It should probably be renamed to something not containing `list`.
|
or maybe make it behave like a sequence? |
it's probably a good idea to have some discussion about the particle container in the interface. the current one seems to have some surprising features. |
>it's probably a good idea to have some discussion about the particle container in the interface. the current one seems to have some surprising features.
Better think of it like a dictionary with some index slicing capabilities.
We had this discussion already. The existing behaviour was deemed the least supririnsg, since the argument of the [] always matches the particle id.
|
fair enough. i still think there is room for improvement, e.g. by totally disentangling container index and particle id. It's not clear to me why they should be connected. However, that would mean to introduce a breaking change to a lot of user scripts. |
From the first look at it it seems like the system is not very well equilibrated. The equilibration can be sped up by adjusting the probability of the different attempt types. was there any reason to set them to have equal probability? |
Running MD for the displacement moves and then accepting rejecting the resulting conformations with the MC criterion would help a lot in speeding up and would avoid inventing hand taylored MC moves for every system (think of polymers etc). |
From the first look at it it seems like the system is not very well equilibrated. The equilibration can be sped up by adjusting the probability of the different >attempt types. was there any reason to set them to have equal probability?
That’s what The Book suggested. We didn’t have more success with other ratios, though. Also, the number of cycles was increased considerably, so even if the ratio is not optimal, a reasonable result would be expected.
The results aren’t just noisy, the scaling is incorrect (fitted via aggregate.py).
|
Jonas, I agree, but there is no point making it more complex until it is correct.
|
Applying the hybrid MD/MC scheme would also help making it correct and avoid convergence issues in MC due to long correlations. |
Ok. I mean it should not matter as long as detailed balance is in place.
It is not that the data is noisy, but looking at the densities, they still drift apart after those 300000 steps (at least for the simulations i looked at). Which would lead to a smaller density difference when averaging begins at 100000 steps. |
I don't quite get the strategy here. How do you plan to find the issue? Is the only test for correctness a statistical test? |
This is an updated version of #3903 Several updates were performed: - Multiprocessing and Communication via the internal Pipes was used instead of subprocessing and communicating via socket - The sample system no longer uses correction terms but a truncation of 5 sigma instead This updated way of communication between processes is far more efficient than using TCP connections which is either a result of an increase in latency or the multiprocessing module uses internal methods to efficiently use nonblocking communication between the processes. The simulation has the following structure: - The run_sim.py script starts the clients and controls them (same functionality as the socket script in #3903) - The gibbs_client_class.py script contains the client class with the methods needed - The client_system.py script initializes the system in every box
This refactors the Gibbs ensemble scripts.
The final state, to my understanding, should be the following:
There is a server script, which, via socket communication, controls two clients.
Each client represents one of the two systems of the Gibbs ensemble.
The server knows nothing about the details of the simulation system.
It controls the Gibbs Monte Carlo protocol and issues commands to the clients for
It also receives updated energies from the clients.
A client base class implements these operations for a standard Espressso system.
Users can then derive from that for their individual cases. For the LJ case which is currently present, the only thing which needs to be overridden is the energy, where tail and shift corrections have to be added.