AMM high level documentation #5456

crusaderky · 2021-10-22T11:59:45Z

No description provided.

jcrist · 2021-10-22T14:08:42Z

docs/source/active_memory_manager.rst

+   distributed:
+     scheduler:
+       active-memory-manager:
+         start: true


Minor nit - is it too late to change these config fields? I'd find enable: true to be a clearer config name.

IMHO enable is ambiguous: you can call run_once while the AMM is not started.
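To illustrate the distinction being drawn here, below is a toy sketch of the start / run_once split. `ToyMemoryManager` and its attributes are hypothetical names made up for this illustration; this is not the distributed implementation, only a model of the behaviour described in the reply above, where a manual iteration is possible even though the periodic loop was never started.

```python
class ToyMemoryManager:
    """Hypothetical stand-in for the AMM; not the distributed implementation."""

    def __init__(self, start: bool = False):
        # Mirrors the `start: true` config field: whether the periodic loop runs
        self.running = start
        self.iterations = 0

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def run_once(self) -> None:
        # A single manual iteration; it does not require the periodic loop
        self.iterations += 1


amm = ToyMemoryManager(start=False)
amm.run_once()  # works even though the manager was never started
```

Under this model, a flag named enable would suggest that run_once is unavailable when it is false, which is the ambiguity raised above.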

jcrist · 2021-10-22T14:11:03Z

docs/source/active_memory_manager.rst

+
+.. note::
+   This policy is incompatible with :meth:`~distributed.Client.replicate` and with the
+   ``broadcast`` parameter of :meth:`~distributed.Client.scatter`.


Can you comment more on what this incompatibility entails? What happens if AMM is enabled and scatter(..., broadcast=True) is called in user code?

jcrist · 2021-10-22T14:19:07Z

docs/source/active_memory_manager.rst

+    - Replicate a task that is not yet in memory
+    - Create more replicas of a task than there are workers
+    - Create replicas of a task on workers that already hold them
+    - Create replicas on paused or retiring workers


Is there a way to enable debug logs for issues like these? When developing a new policy, should you try to avoid generating unacceptable suggestions (in which case you'd want to be able to debug these issues), or is generating unacceptable suggestions (and relying on the AMM to ignore them) in a policy fine and intended if it makes the policy code simpler?

The latter. Policies should be simple and readable. The subtle edge case management is left to the AMM.

Can you add a note to this section clarifying that policies shouldn't worry about these cases?
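A minimal sketch of the division of labour described above, assuming a toy model rather than the real scheduler: the policy stays naive and the manager silently drops unacceptable suggestions. All names here (`ToyWorker`, `accept_replicate`, `naive_policy`, `run_once`) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorker:
    name: str
    status: str = "running"  # "running", "paused", or "retiring"
    keys: set = field(default_factory=set)  # keys this worker holds in memory


def accept_replicate(key, target, in_memory):
    """Return True if replicating `key` onto `target` is acceptable."""
    if key not in in_memory:
        return False  # the task is not yet in memory
    if key in target.keys:
        # The worker already holds a replica. Because each worker holds at
        # most one replica, this also caps replicas at the number of workers.
        return False
    if target.status != "running":
        return False  # paused or retiring worker
    return True


def naive_policy(keys, workers):
    """A deliberately simple policy: suggest every key on every worker."""
    for key in keys:
        for worker in workers:
            yield key, worker


def run_once(keys, workers, in_memory):
    """Apply the policy's suggestions, ignoring the unacceptable ones."""
    accepted = []
    for key, worker in naive_policy(keys, workers):
        if accept_replicate(key, worker, in_memory):
            worker.keys.add(key)
            accepted.append((key, worker.name))
    return accepted


workers = [
    ToyWorker("a", keys={"x"}),
    ToyWorker("b"),
    ToyWorker("c", status="paused"),
]
# "x" is in memory, "y" is not
accepted = run_once(["x", "y"], workers, in_memory={"x"})
```

In this sketch the policy emits six suggestions and only the replica of "x" on worker "b" is acted upon; the policy itself contains no edge-case handling.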

jcrist

Overall this looks good to me.

jcrist · 2021-10-22T15:46:13Z

docs/source/index.rst

   ipython
-   prometheus


Why are these rows deleted?

Those pages don't exist anywhere.

Good catch -- these are over in the dask/dask docs now (e.g. https://docs.dask.org/en/stable/setup/prometheus.html)

crusaderky added 2 commits October 22, 2021 10:47

sphinx maintenance

08f25ef

AMM high level documentation

67f9672

crusaderky self-assigned this Oct 22, 2021

crusaderky requested review from jrbourbeau, fjetter and gjoseph92 October 22, 2021 12:01

crusaderky added the documentation Improve or add to documentation label Oct 22, 2021

jcrist reviewed Oct 22, 2021

code review

5e1786e

crusaderky force-pushed the AMM/docs branch from 951f925 to 5e1786e Compare October 22, 2021 14:35

jcrist approved these changes Oct 22, 2021

crusaderky merged commit cdc68cc into dask:main Oct 22, 2021

crusaderky deleted the AMM/docs branch October 22, 2021 16:15

zanieb pushed a commit to zanieb/distributed that referenced this pull request Oct 28, 2021

AMM high level documentation (dask#5456)

9e3c828

crusaderky added the memory label Mar 25, 2022
