-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ensemble_wrapper decorator function #199
base: develop
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #199 +/- ##
===========================================
+ Coverage 78.93% 79.20% +0.26%
===========================================
Files 12 12
Lines 1709 1731 +22
Branches 254 256 +2
===========================================
+ Hits 1349 1371 +22
Misses 276 276
Partials 84 84
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a bunch of comments inline but my major issue is that the wrapper class has nothing to do with EnsembleAnalysis
— isinstance(wrapped_analysis, EnsembleAnalysis)
will be False
. This is not only confusing and a bit of a quick hack but it also leads to code duplication. It's problematic that the run()
methods are different because as soon as you change it in EnsembleAnalysis
you also must remember changing it in the wrapper. This becomes unmaintainable quickly.
Rethink your approach. Perhaps you can use this code as a basis to rewrite EnsembleAnalysis
? Either way, there can be only one.
|
||
ensemble_wrapper Decorator | ||
__________________________ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more text here that shows a developer how to use it. Perhaps based on a real world example using some of the simpler MDAnalysis base classes?
There should be enough information for another developer (or future you) to learn how to use it. This means answers to the following questions:
- What problem is the code going to solve?
- What is the advantage of doing it this way?
- What do I need to provide?
- How do I do it?
- What are the limitations?
|
||
|
||
def ensemble_wrapper(cls): | ||
"""A decorator for :class:`MDAnalysis.Universe <MDAnalysis.analysis.base.AnalysisBase>` subclasses modifying |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a long 1-line string – shorten
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and add this in the body
pass | ||
|
||
Ens = Ensemble(dirname='mol_dir) | ||
ExRun = Example(Ens) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd call the instance ex
or something else in lower case for an instance.
class Example(AnalysisBase): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A pass class is too trivial, it does not show how you deal with AtomGroups and whatever else one might want to know. It just shows how to use a decorator. Less trivial example here and perhaps a proper example in the text outside the function.
def _prepare_ensemble(self): | ||
# Defined separately so user can modify behavior | ||
self._results_dict = {x: None for x in self._ensemble.keys()} | ||
|
||
def _conclude_system(self): | ||
# Defined separately so user can modify behavior | ||
self._results_dict[self._key] = self._SystemRun.results | ||
|
||
def _conclude_ensemble(self): | ||
self.results = self._results_dict | ||
|
||
def run(self, start=0, stop=0, step=1): | ||
self._prepare_ensemble() | ||
for self._key in self._ensemble.keys(): | ||
self._SystemRun = self._Analysis(self._ensemble[self._key], *self._args, **self._kwargs) | ||
self._SystemRun.run(start=start, step=step, stop=stop) | ||
self._conclude_system() | ||
self._conclude_ensemble() | ||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like code duplication – my bigger conceptual problem is that a wrapped AnalysisBase is not an instance of EnsembleAnalysis.
@@ -161,3 +162,29 @@ def test_value_error(self): | |||
dh4 = ens.select_atoms('name C4 or name C17 or name S2 or name N3') | |||
with pytest.raises(ValueError): | |||
dh_run = DihedralAnalysis([dh1, dh2, dh4, dh3]).run(start=0, stop=4, step=1) | |||
|
|||
def test_ensemble_wrapper1(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the "1" ?
|
||
def _single_frame(self): | ||
self._res_arr.append(len(self.system.select_atoms('not resname SOL'))) | ||
assert self._res_arr[-1] == 42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there an assert
in AnalysisBase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to say that you know that there are 42 atoms in the solute. However, this assert looks as if you wanted it to be part of the test so it's confusing. At a minimum, add a comment.
But perhaps you could just make it a more realistic analysis? People also look at tests to see how to use code so giving a good example here is useful. And you could use it for the docs as well.
class BaseTest(AnalysisBase): | ||
def __init__(self, system: mda.Universe): | ||
super(BaseTest, self).__init__(system.trajectory) | ||
self.system = system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not call it universe
instead of system
? That would be clearer.
In order to re-use MDAnalysis analysis classes, I suggest to change the EnsembleAnalysis run method so that it either just runs def _single_universe(self):
analysis = MDAnalysis.analysis.foo.BarAnalysis(....)
analysis.run(**kwargs)
# store results right away
self.result_dict[self._key[0]][self._key[1]][self._key[2]] = analysis.result (We don't even need What do you think @ALescoulie ? Would this simplify things sufficiently without breaking any of the existing code? |
I think that could work, I could also do a runtime check to see if |
Maybe have not implemented methods raise NotImplementedError and try/except? not sure if this is a good pattern… |
Thats what I was thinking then at runtime calling either one of two different run methods, either one that works on each frame, or one that works on each |
I am not sure how much work will get done on this PR. I converted it to draft so that we have something to look at when we think about extending EnsembleAnalysis. |
ensemble_wrapper decorator
To simplify the process of extending AnalysisBase sub-classes to Ensembles I wrote a class decorator function.
Example use:
It just adds new
__init__
,_prepare_ensemble
._conclude_system
,_conclude_ensemble
, andrun
methods that run the base class and collect the results. The prepare and concluded are defined so that the user can modify the data processing of results.TODO