Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Adding and documenting configs in nx-parallel #75

Merged
merged 22 commits into from
Aug 26, 2024

Conversation

Schefflera-Arboricola
Copy link
Member

@Schefflera-Arboricola Schefflera-Arboricola commented Aug 19, 2024

This PR makes nx-parallel compatible with networkx's config system and documents how to configure nx-parallel using joblib.parallel_config.

In PR #68, I attempted to create a unified configuration system in nx-parallel by wrapping the joblib.Parallel()(inside nx-parallel algorithms) within a with joblib.parallel_config(configs) context, here configs are extracted from nx.config.backends.parallel. This approach made NetworkX’s config closely mirror joblib’s, giving the appearance of synchronization between the two systems. However, in the last meeting, Dan mentioned that this approach complicates things.

In this PR, I tried to simplify things by clearly documenting both configuration methods for nx-parallel and also added implementation to make nx-parallel compatible with NetworkX’s config. The _set_nx_config decorator wrap the nx-parallel function call within a joblib.parallel_config context only when NetworkX configs are enabled. If not, then configs set via joblib take precedence. (to make nx-parallel compatible with joblib.parallel_config I just had to remove n_jobs from the internal joblib.Parallel)

In the last meeting, Dan also questioned why we need to make nx-parallel compatible with networkx’s config and why can’t we simply use joblib to configure nx-parallel. And it is necessary to keep nx-parallel compatible with other networkx backends and backend functionalities. Like we would need this compatibility if we would want to change backend_priority and some backend configs in a context like this:

with nx.config(backend_priority=[‘cugraph`, ‘parallel’], backends[“parallel”][“n_jobs”] = 5):
	nx.square_clustering(G)

refer Config.md file in this PR for more.

In the issue #76, I have clearly outlined the challenges in creating a unified configuration system in nx-parallel.

ref. PR #68 for more context

Thank you :)

@Schefflera-Arboricola Schefflera-Arboricola added the type: Enhancement New feature or request label Aug 19, 2024
@Schefflera-Arboricola
Copy link
Member Author

Notes for reviewers(@dschult):

  • TODOs(not done yet to avoid over-crowding this PR; for ease of review; only done for square_clustering right now):
    • add the _set_nx_config decorator to all algorithms
    • rename cpu_count() to get_n_jobs()
    • rename total_cores to n_jobs
  • Please LMK if it would be possible to get this PR also reviewed and merged before GSoC ends(on August 26th) based on your schedule, so that I can accordingly mention it in my GSoC final report. I'm open to scheduling a meeting to discuss this PR if that would help you review this better and faster.

_nx_parallel/config.py Outdated Show resolved Hide resolved
_nx_parallel/__init__.py Outdated Show resolved Hide resolved
]


def _configure_if_nx_active():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed on this name in the last meeting. (ref. Notes)

@@ -86,7 +86,7 @@ line-length = 88
target-version = 'py310'

[tool.ruff.lint.per-file-ignores]
"__init__.py" = ['I', 'F403']
"__init__.py" = ['I', 'F403', 'F401']
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'F401' is added here to not get "module imported but unused." while running pre-commit on the nx_parallel/__init__.py because we just import _configure_if_nx_active in the file to include it in the public API but we don't use it.

nx_parallel/__init__.py Show resolved Hide resolved
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good -- I have a couple questions below. Probably tweaks in wording will clear it up. Thanks...

Config.md Outdated Show resolved Hide resolved
Config.md Outdated Show resolved Hide resolved
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another look at the whole PR and the files rather than the diffs. So I've got a few more suggestions to think about. Adapt or reject them as you see fit.

Config.md Outdated Show resolved Hide resolved
Config.md Outdated Show resolved Hide resolved
Config.md Outdated Show resolved Hide resolved
Config.md Outdated Show resolved Hide resolved
Config.md Outdated Show resolved Hide resolved
Config.md Outdated

In this example, if `another_nx_backend` also internally utilizes `joblib.Parallel` (without exposing it to the user) within its implementation of the `square_clustering` algorithm, then the `nx-parallel` configurations set by `joblib.parallel_config` will influence the internal `joblib.Parallel` used by `another_nx_backend`. To prevent unexpected behavior, it is advisable to configure these settings through the NetworkX configuration system.

If you're using `nx-parallel` independently, either the NetworkX or `joblib` configuration system can be used, depending on your preference.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressing the question of what "independently" means: one attempt

Suggested change
If you're using `nx-parallel` independently, either the NetworkX or `joblib` configuration system can be used, depending on your preference.
If your code using `nx-parallel` does not simultaneously use joblib (including inside another networkx backend) then either the `networkx.config` system or the `joblib.parallel_config` system can be used, depending on your preference.

Also, I think this paragraph should appear at the top of this section, as I think most users will be in this situation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would be a great idea to mention this here because there might be other conflicts too that we don't know of right now. This was just one of the cases where a conflict can occur. LMK what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we don't want to exclude other possible conflicts -- so let's not be too specific about them.

The first point I'm looking for here is a better phrase than "using nx-parallel independently". (Independently of what?)

The second point is that this seems like a nice first paragraph for this section rather than coming down here. A reader who is using nx-parallel without other joblib code will be able to read this sentence and skip the rest of this section if it doesn't apply to them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope the recent commit resolves this concern :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance of trouble when the other backend doesn't use joblib?
For example, could the config using joblib.parallel_config interfere in any way with nx-cugraph or graphblas-algorithms?

I think the only way to have a "conflict" is if the other code is using joblib too. Right?

Copy link
Member Author

@Schefflera-Arboricola Schefflera-Arboricola Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your question. I'm not familiar with the codebases of other backends, so I can't say for certain. However, the occurrence of the unexpected behaviours also largely depends on the specific joblib backend(loky, threading, multiprocessing, dask, etc.) that the user is employing.

Also, it's beneficial to use NetworkX's configuration when setting backend_priority and configurations within a context for multiple networkx backends, like this:

with nx.config(backend_priority = ['parallel', 'another_nx_backend`], backends=Config(parallel=<nx_parallel_configs>, another_nx_backend=<another_nx_backend_configs>)):
    ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it doesn't matter what the codebases of the other backends are: the only config options nx-parallel uses are related to joblib. And the only way for another backend to be influenced by those choices is if they look at or use the config options in joblib. So only backends which use joblib will have a possible conflict over config values.

In the interest of putting this into place, I am going to merge this PR. And I'll open another one to make further suggestions.

@@ -107,6 +107,13 @@ Note that for all functions inside `nx_code.py` that do not have an nx-parallel
import networkx as nx
import nx_parallel as nxp

# enabling networkx's config for nx-parallel
Copy link
Member

@dschult dschult Aug 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wonder if putting config issues into this section is a distraction. They are trying to understand backend usage, not how to configure. Maybe a comment line pointing to the config docs here is sufficient. If you want to keep this here, there should at least be a word indicating that the config stuff is optional.

Suggested change
# enabling networkx's config for nx-parallel
# optional enabling networkx's config for nx-parallel

Copy link
Member Author

@Schefflera-Arboricola Schefflera-Arboricola Aug 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is optional. We don't get any errors when we run with the default configs. But, we are also not running the algorithms on multiple cpu cores or threads because n_jobs=None by default. So, to actually run an algorithm as a parallel algorithm a user would have to set at least the n_jobs config to something non-None or an integer(excluding 0 and 1). I have updated the README.md accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait -- the default value gives no parallel computation? That doesn't seem right to me. Shouldn't the default be to use all the cpus we can get?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait -- the default value gives no parallel computation? That doesn't seem right to me.

I understand your concern. However, this behavior is consistent with joblib.parallel_config, where parallel computation isn’t enabled by default (i.e. n_jobs=None by default). Since modifying the defaults within joblib.parallel_config is outside my control(bcoz that would mean changing joblib's codebase), changing the default n_jobs in NetworkX’s configuration from None to -1 will introduce inconsistencies between the two configuration systems, and increase differences between the two config systems, leading to user confusion and potential pitfalls.

Shouldn't the default be to use all the cpus we can get?

That was indeed the previous default, but I've since updated the documentation to reflect the current behavior. Reverting this change would necessitate another extensive documentation update across the entire package, which might not be feasible before the GSoC deadline.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great

@Schefflera-Arboricola Schefflera-Arboricola added the Infrastructure Related to the general infrastructure and organisation of code in the repo label Aug 24, 2024
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve of this PR.

I will make some suggested wording changes about when to use which config system once I figure out what they should be. But there is no point in delaying this PR for those changes.

Thanks very much @Schefflera-Arboricola for putting this together and guiding it through!!

@dschult dschult merged commit 15de782 into networkx:main Aug 26, 2024
11 checks passed
@jarrodmillman jarrodmillman added this to the 0.3 milestone Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Infrastructure Related to the general infrastructure and organisation of code in the repo type: Enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

4 participants