Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Gibbs ensemble scripts #3903

Closed
wants to merge 38 commits into from
Closed

Conversation

RudolfWeeber
Copy link
Contributor

This refactors the Gibbs ensemble scripts.
The final state, to my understanding, should be the following:
There is a server script, which, via socket communication, controls two clients.
Each client represents one of the two systems of the Gibbs ensemble.

The server knows nothing about the details of the simulation system.
It controls the Gibbs Monte Carlo protocol and issues commands to the clients for

  • particle insertion
  • particle removal
  • move a single particle
  • volume change
  • confirm/revert last operation
    It also receives updated energies from the clients.

A client base class implements these operations for a standard Espressso system.
Users can then derive from that for their individual cases. For the LJ case which is currently present, the only thing which needs to be overridden is the energy, where tail and shift corrections have to be added.

@RudolfWeeber
Copy link
Contributor Author

So far, this contains disentangling and readability work.

  • Box indices have been replaced by references to the box in most cases
  • Energy corrections are moved to the client side as they are system-dependent
  • Readability

@schlaicha

@jonaslandsgesell
Copy link
Member

I think that the Gibbs ensemble implementation would be more helpful if (in each box) MD would be used in the step "MOVE_PART" to propose new states instead of a single MC move -- and then accept/reject the new state (after running MD) according to the metropolis criterion. This way also polymeric systems could be simulated.

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Sep 22, 2020 via email

@schlaicha
Copy link
Contributor

I think that the Gibbs ensemble implementation would be more helpful if (in each box) MD would be used in the step "MOVE_PART" to propose new states >instead of a single MC move -- and then accept/reject the new state (after running MD) according to the metropolis criterion. This way also polymeric systems >could be simulated.
I agree. For the tutorial, I’m going to leave the MC variant, though.

In general sampling the configurational space via MD steps yields "more physical" configurations (whatever that means), yet sometimes this is exactly what you want to avoid when you do MC (like e.g. if there is an energy barrier, you can do collective moves etc.). I.e. we should leave the interface as flexible as possible and let the user decide, together with a few typical samples/tutorials. I discussed with @RudolfWeeber that in a mid-termed project we could enhance ESPResSo's MC capabilities.

@@ -53,7 +53,7 @@
parser.add_argument('-C', '--client-script', nargs=1)
parser.add_argument('-n', '--number-of-particles', type=int, nargs=1)
parser.add_argument('-l', '--box-length', type=float, nargs=1)
parser.add_argument('-T', '--temperature', type=float, nargs=1)
parser.add_argument('-T', '--temperature', type=float, nargs=1, required=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's kind of odd to declare an optional argument and make it required, why not use a positional argument?

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Sep 29, 2020 via email

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Sep 29, 2020 via email

Copy link
Member

@KaiSzuttor KaiSzuttor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general i think we want to use the multiprocessing module for the communication instead of plain sockets in the future. Besides my comments it looks okay to me.

samples/gibbs_ensemble/gibbs.py Outdated Show resolved Hide resolved
samples/gibbs_ensemble/gibbs.py Outdated Show resolved Hide resolved
Comment on lines +31 to +33
def random_particle(self):
id = np.random.choice(self._system.part[:].id)
return self._system.part[id]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not return np.random.choice(self._system.part)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc states it requires an 1d array, which ParticleList is not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

samples/gibbs_ensemble/gibbs.py Show resolved Hide resolved


class Client():
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should consistently use """

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really want to do such things manually. I personally use """, whereas the previous author used '''. I don't really find it important. If autopep8 can be convinced to do this automatically, I'm fine with adding that rule

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it just about consistency... just thought that python code in the samples is much more prominent to users than code in the python interface that is not read directly too often. Search and replace is also trivial in this case but if you think that code style is not that important in the samples you can leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, always use """triple double quotes""" around docstrings

pep8

Comment on lines 105 to 136
while msg[0] != MessageId.END:
# receive command to execute next step
msg = self._recv_data()
if msg[0] == MessageId.END:
break
elif msg[0] == MessageId.MOVE_PART:
self.move_particle()
self._send_data(pickle.dumps(
[MessageId.ENERGY, self.energy()]))
elif msg[0] == MessageId.CHANGE_VOLUME:
box_l = msg[1]
self.change_volume_and_rescale_particles(box_l)
self._send_data(pickle.dumps(
[MessageId.ENERGY, self.energy()]))
elif msg[0] == MessageId.CHANGE_VOLUME_REVERT:
box_l = msg[1]
self.change_volume_and_rescale_particles(box_l)
elif msg[0] == MessageId.EXCHANGE_PART_ADD:
n = msg[1]
self.insert_particle(n)
self._send_data(pickle.dumps(
[MessageId.ENERGY, self.energy()]))
elif msg[0] == MessageId.EXCHANGE_PART_REMOVE:
self.remove_particle()
self._send_data(pickle.dumps(
[MessageId.ENERGY, self.energy()]))
elif msg[0] == MessageId.MOVE_PART_REVERT:
self.move_particle_revert()
elif msg[0] == MessageId.EXCHANGE_PART_ADD_REVERT:
self.remove_last_added_particle()
elif msg[0] == MessageId.EXCHANGE_PART_REMOVE_REVERT:
self.remove_particle_revert()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole block is probably something like a MessageHandler that can react on messages of type MessageId and is more or less just a dict from MessageId to a function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented this. Tutors will have to do a bit more explaining when teching Python beginners, though.

parser = argparse.ArgumentParser()
parser.add_argument('-s', '--seed', type=int)
parser.add_argument('-H', '--host', type=str, default='localhost')
parser.add_argument('-p', '--port', type=int, required=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, make it a positional argument

samples/gibbs_ensemble/gibbs_ensemble_client.py Outdated Show resolved Hide resolved
samples/gibbs_ensemble/gibbs_ensemble_client.py Outdated Show resolved Hide resolved
samples/gibbs_ensemble/gibbs_ensemble_client.py Outdated Show resolved Hide resolved
@KaiSzuttor
Copy link
Member

KaiSzuttor commented Sep 29, 2020

it's kind of odd to declare an optional argument and make it required, why not use a positional argument?
I don’t use positional arguments, because the order can be mistaken easily. That’s what I also preach in my “managing simulation data” lecture.

If you have so many positional arguments that you fear to mix them up, you are probably doing something wrong. Here, for examples it seems you only have one required argument which should also be positional just as with any tool you use in your console.

@RudolfWeeber
Copy link
Contributor Author

I agree about using some higher-level communication infrastrucutre, but will not have time to do it in this iteration.

Before this becoms an Espresso module rather than a sample/tutorial, move classes should be introduced as well. I.e. move classse should hold the detail of a specific move and the steps to apply and revert it. It might be discussed whether acceptance criteria should be member functions or free functions (which would be ugly, because one cannot overload w/r to argument type in Python). Also not in this iteration, though.

@RudolfWeeber
Copy link
Contributor Author

This is as far as I want to take the refactoring of the Gibbs ensemble scripts.

The task I actually intended to do was makeing a tutorial in which this is used to calculate critical temperatures...

Let's see how that goes...

boxes[0].box_l**3, boxes[1].box_l**3,
(tock - tick) * 1000 / REPORT_INTERVAL))
if len(densities[0]):
print("step %d, densities %.3f %.3f, volumes %.2f, %.2f %.f ms / move" %
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this snippet gets a lot more readable if you use format strings ala print(f'step {i}). I also think that in general it's good practice to not directly print stuff to stdout but instead use the logging infrastructure of python.

import pickle
import scipy.optimize
import matplotlib.pyplot as plt

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docstrings are missing

# First index to consider
start = np.where(np.array(indices) > WARMUP_LENGTH)[0][0]

# Determin which is teh dilute phase from the last density recorded
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use a spellchecker for the comments

parser = argparse.ArgumentParser()
parser.add_argument('-s', '--seed', type=int)
parser.add_argument('-H', '--host', type=str, default='localhost')
parser.add_argument('-p', '--port', type=int, required=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for a single required argument it should be avoided to mark it as optional. If you provide proper help messages with the help keyword argument to add_argument the meaning of the positional argument is not ambiguous.

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 19, 2020 via email

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 19, 2020 via email

@KaiSzuttor
Copy link
Member

for a single required argument it should be avoided to mark it as optional. If you provide proper help messages with the help keyword argument to >add_argument the meaning of the positional argument is not ambiguous.
Can we please agree to disagree about positional arguments. I find their use in simulations scripts error-prone as I mentioned before.

I still cannot follow your reasoning. Correct me if I'm wrong: you are saying that the standard way of calling any command line script should not be done here because it is error prone? What makes our sample scripts so special? And how can you use a single positional argument incorrectly?

@KaiSzuttor
Copy link
Member

  • def random_particle(self): + id = np.random.choice(self._system.part[:].id) + return self._system.part[id] https://docs.python.org/3/library/random.html#random.choice
    I am aware of that function but I don’t trust it in this case, because from the docs it is not apparent to me, wheter the implementation relies on [] (or getitem()) to retrieve individual elements from the popoluation. If it does, or will in the future, this would give wrong results, because system.part[] behaves somewhat like a dictionary. So if len(system.part) is N that does not mean that all valid entries are accessible via [0] .. [N-1]. Safety fist.

so you are saying that our ParticleList does not behave like a python sequence? That's at least a surprising interface, right?

@RudolfWeeber
Copy link
Contributor Author

The most important issue with this is that it doesn't reproduce the scaling of
(rho_dens-rho_dilute) \sim (t-t_c)^beta with beta=0.32
reported in the litrature.

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 19, 2020 via email

@KaiSzuttor
Copy link
Member

Yes. It should probably be renamed to something not containing list.

or maybe make it behave like a sequence?

@KaiSzuttor
Copy link
Member

it's probably a good idea to have some discussion about the particle container in the interface. the current one seems to have some surprising features.

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 19, 2020 via email

@KaiSzuttor
Copy link
Member

it's probably a good idea to have some discussion about the particle container in the interface. the current one seems to have some surprising features.
Better think of it like a dictionary with some index slicing capabilities. We had this discussion already. The existing behaviour was deemed the least supririnsg, since the argument of the [] always matches the particle id.

fair enough. i still think there is room for improvement, e.g. by totally disentangling container index and particle id. It's not clear to me why they should be connected. However, that would mean to introduce a breaking change to a lot of user scripts.

@itischler
Copy link
Contributor

itischler commented Oct 22, 2020

The most important issue with this is that it doesn't reproduce the scaling of
(rho_dens-rho_dilute) \sim (t-t_c)^beta with beta=0.32
reported in the litrature.

From the first look at it it seems like the system is not very well equilibrated. The equilibration can be sped up by adjusting the probability of the different attempt types. was there any reason to set them to have equal probability?

@jonaslandsgesell
Copy link
Member

From the first look at it it seems like the system is not very well equilibrated. The equilibration can be sped up by adjusting the probability of the different attempt types. was there any reason to set them to have equal probability?

Running MD for the displacement moves and then accepting rejecting the resulting conformations with the MC criterion would help a lot in speeding up and would avoid inventing hand taylored MC moves for every system (think of polymers etc).

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 22, 2020 via email

@RudolfWeeber
Copy link
Contributor Author

RudolfWeeber commented Oct 22, 2020 via email

@jonaslandsgesell
Copy link
Member

Jonas, I agree, but there is no point making it more complex until it is correct.

Applying the hybrid MD/MC scheme would also help making it correct and avoid convergence issues in MC due to long correlations.

@itischler
Copy link
Contributor

That’s what The Book suggested. We didn’t have more success with other ratios, though.

Ok. I mean it should not matter as long as detailed balance is in place.

Also, the number of cycles was increased considerably, so even if the ratio is not optimal, a reasonable result would be expected. The results aren’t just noisy, the scaling is incorrect (fitted via aggregate.py).

It is not that the data is noisy, but looking at the densities, they still drift apart after those 300000 steps (at least for the simulations i looked at). Which would lead to a smaller density difference when averaging begins at 100000 steps.

@KaiSzuttor
Copy link
Member

Jonas, I agree, but there is no point making it more complex until it is correct.

I don't quite get the strategy here. How do you plan to find the issue? Is the only test for correctness a statistical test?

@RudolfWeeber RudolfWeeber added this to the Espresso 4.2 milestone Oct 24, 2020
@jngrad jngrad removed this from the Espresso 4.2 milestone Jun 7, 2021
kodiakhq bot added a commit that referenced this pull request Jun 7, 2021
This is an updated version of #3903 

Several updates were performed:
- Multiprocessing and Communication via the internal Pipes was used instead of subprocessing and communicating via socket
- The sample system no longer uses correction terms but a truncation of 5 sigma instead

This updated way of communication between processes is far more efficient than using TCP connections which is either a result of an increase in latency or the multiprocessing module uses internal methods to efficiently use nonblocking communication between the processes.

The simulation has the following structure:
 - The run_sim.py script starts the clients and controls them (same functionality as the socket script in #3903)
 - The gibbs_client_class.py script contains the client class with the methods needed
 - The client_system.py script initializes the system in every box
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants