
Respect Linux cgroup CPU number for --threads value #2083

Merged Jan 23, 2024 · 6 commits

Conversation


@angelapwen angelapwen commented Jan 12, 2024

We want to respect cgroup constraints so that when we run in a container, we respect the limits set for the container rather than use the host OS's number of cores.

We check both `/sys/fs/cgroup/cpuset.cpus.effective` (cgroup v2) and `/sys/fs/cgroup/cpuset.cpus` (cgroup v1) to find the number of cores available. We also check `/sys/fs/cgroup/cpu.max` (cgroup v2) to calculate the number of cores from the CPU quota and period set in this file.

The max threads value is set to the minimum of these values; if no valid values are found in these files, we fall back to the host OS's core count.
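The selection logic described above could be sketched as follows. This is a minimal illustration, not the PR's exact implementation; `chooseMaxThreads` and its parameters are hypothetical names.

```typescript
// Hypothetical sketch: pick the smallest usable cgroup-derived core
// count, falling back to the host OS's count when none are usable.
function chooseMaxThreads(
  hostCpuCount: number,
  cgroupCandidates: Array<number | undefined>,
): number {
  // NaN fails the `> 0` comparison, so malformed parses are dropped
  // along with undefined and non-positive values.
  const usable = cgroupCandidates.filter(
    (c): c is number => c !== undefined && c > 0,
  );
  return usable.length > 0 ? Math.min(...usable) : hostCpuCount;
}
```

For example, in a container limited to 4 cores on a 16-core host, a candidate list like `[4, undefined]` yields 4, while an empty or all-invalid list yields 16.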

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

@angelapwen angelapwen force-pushed the use-cgroup-for-threads branch 2 times, most recently from d93d38b to 75788e9 Compare January 22, 2024 12:58
We want to respect `cgroup` constraints so that when we run in a container, the default max threads value is appropriately set to the maximum number of cores in this container rather than the host OS's.

We check both `/sys/fs/cgroup/cpuset.cpus.effective` (`cgroup v2`) and `/sys/fs/cgroup/cpuset.cpus` (`cgroup v1`) to find the number of cores.
@angelapwen angelapwen marked this pull request as ready for review January 22, 2024 13:11
@angelapwen angelapwen requested a review from a team as a code owner January 22, 2024 13:11

@henrymercer henrymercer left a comment


Nice! A couple of comments:

src/util.ts Outdated
}

const cpuMaxString = fs.readFileSync(cpuMaxFile, "utf-8");
const cpuLimit = cpuMaxString.split(" ")[0];
Contributor


Optional: log a warning and return undefined if there are more than two values

Contributor Author


Done! I logged this as a debug message rather than a warning because I don't think the user necessarily needs to be warned in this case.
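The quota parsing this thread discusses could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's final code: `cpuCountFromCpuMaxString` is a hypothetical name, and the assumed file format is the cgroup v2 `cpu.max` layout of `"<quota> <period>"` (e.g. `"200000 100000"`), or `"max <period>"` when no limit is set.

```typescript
// Hypothetical sketch: derive a core count from the contents of
// /sys/fs/cgroup/cpu.max, i.e. "<quota> <period>" or "max <period>".
function cpuCountFromCpuMaxString(contents: string): number | undefined {
  const tokens = contents.trim().split(" ");
  if (tokens.length !== 2) return undefined; // unexpected format
  const [quota, period] = tokens;
  if (quota === "max") return undefined; // no CPU limit set
  const cores = parseInt(quota, 10) / parseInt(period, 10);
  if (isNaN(cores)) return undefined; // malformed numbers
  // Round up so a fractional quota (e.g. half a core) still
  // yields one usable core.
  return Math.ceil(cores);
}
```

So `"200000 100000"` yields 2 cores, and `"max 100000"` yields undefined, leaving the caller to try the other candidates.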

src/util.ts Outdated
src/util.ts Outdated
Comment on lines 458 to 459
const cpuStartIndex = parseInt(token.charAt(0));
const cpuEndIndex = parseInt(token.charAt(2));
Contributor


I would consider splitting by `-` in case we have a range like `0-15`

Contributor Author


👍 done! Thanks!
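The reviewer's suggestion, splitting each cpuset token on `-` instead of indexing fixed character positions, could be sketched like this. `coresInToken` is a hypothetical name, not the PR's actual helper; it assumes tokens from a cpuset list such as `"0-15"` or `"8"`.

```typescript
// Hypothetical sketch: count the CPUs covered by one cpuset token,
// e.g. "8" (a single CPU) or "0-15" (a range of 16 CPUs).
function coresInToken(token: string): number | undefined {
  const parts = token.split("-").map((p) => parseInt(p, 10));
  if (parts.some(isNaN)) return undefined; // malformed token
  if (parts.length === 1) return 1; // single CPU, e.g. "8"
  if (parts.length === 2) return parts[1] - parts[0] + 1; // range
  return undefined; // unexpected shape, e.g. "1-2-3"
}
```

Unlike `charAt(0)`/`charAt(2)`, this handles multi-digit endpoints such as `0-15` correctly.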

maxThreadsCandidates.push(
...["/sys/fs/cgroup/cpuset.cpus.effective", "/sys/fs/cgroup/cpuset.cpus"]
.map((file) => getCgroupCpuCountFromCpus(file, logger))
.filter((count) => count !== undefined && count > 0)
Contributor


Just to record my understanding, this will filter out any `NaN`s we get since `NaN` is not greater than 0.
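That understanding can be demonstrated in isolation; the candidate values below are made up for illustration.

```typescript
// NaN compares false with every number, so `count > 0` rejects it,
// along with undefined and non-positive counts.
const candidates: Array<number | undefined> = [4, NaN, undefined, 0];
const usable = candidates.filter((c) => c !== undefined && c > 0);
// usable is [4]
```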


@henrymercer henrymercer left a comment


Looks good, thanks!
