Replies: 1 comment
-
Thanks for the explanation @muhrin, FYI I have copied some of this to a design rationale page on plumpy (aiidateam/plumpy#202). |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
After some discussions with @ltalirz and @chrisjsewell regarding their excellent debugging of some recent memory leak issues (#4699, #4603, #4698, and possibly others) I thought it might be useful to share my view on part of the reason why they were difficult to track down (and perhaps why they happend in the first place) while providing historical context.
Plumpy and its interactions with AiiDA implement a mixed functions/coroutines model where many of the of plumpy's actions are handled by coroutine calls allowing us to use
awaits
andyields
to effectively have cooperative multitasking between AiiDA processes. This is all well and good, andasyncio
is designed precisely to handle such workloads.The complexity of the system arises (at least in major part) because of an early design decision not to push asynchronous code up to the user (and in the early days I didn't even want to push this on AiiDA developers as this was a fairly new concept). This means that things like
WorkChain
steps are regular functions (that were called via a coroutine further up the call stack). The fact thatasyncio
doesn't support re-entrancy by design means that once you've 'gone function' there's no going back and that branch of the call stack is now forever synchronous (@unkcpz 's excellent reentrancy work notwithstanding).My desire not to push too much async code into AiiDA has led to additional complexity in certain places that tries to get around the, intentional, limitations of
asyncio
#4699 being a case in point. HereTransportQueue.request_transport
is a regular function (well generator) that tries to give you transport by scheduling a callback in the loop (after the next safe open interval) and yields aFuture
which will resolve to the transport when ready. This led to a difficult to identify leak involving the process stack being 'held' because it is needed when thedo_open
call finally gets reached.For me, the lesson here is that, at least for AiiDA developers it's possibly better to ask them to go through the task of getting to know
asyncio
(a well documented, now fairly widely adopted paradigm) rather than dealing with custom code that tries to hide this from them. In this caserequest_transport
would become acoroutine
thatawait
sdo_open
which would itself deal with the safe open interval.It's probably too much to ask users to change
WorkChain
s andworkfunctions
to be coroutines but it may be worthwhile changing parts of AiiDA's (and plumpy's) internals to be moreasyncio
friendly. The general approach would be to:call_later
,call_soon
Future
was returned, it may be that these can directly comeawait
s or not, some judgement may be necessary.await
s and can be dealt with usingensure_future
Finally, it's worth considering cases when coroutines shouldn't be used. Well, if I've understood it correctly Guido's reasoning for not having re-entrancy is that any asynchronicity is obvious to see (you just look for the
await
s). Therefore if there are actions that should be atomic a regular function should be used (I have to point out, though, that this very absolutist approach doesn't always work because you may want an atomic set of operations to occur that themselves require something asynchronous such as transport in which case they, too, are forced to become coroutines to highlight the fact that internally they do some async stuff).If it's useful some of this can be rolled into an AEP and be systematically adopted.
Happy coding!
Beta Was this translation helpful? Give feedback.
All reactions