Consider not restricting level of parallelism in a container when user did not ask for it. #13224
Comments
We have a real-world scenario where the application (which would fall under …
It seems we can: …
I wonder if there are enough scenarios where this knob would be useful. If limiting parallelism is the goal, it seems better to limit it at the OS level - by using affinity settings or …
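The tail of that comment is cut off; as one concrete illustration of the OS-level route it mentions (an assumption of intent, not the commenter's code), here is a minimal sketch using processor affinity from .NET. Process.ProcessorAffinity is supported on Windows and Linux:

```csharp
using System;
using System.Diagnostics;

class AffinityExample
{
    static void Main()
    {
        // Pin the current process to the first two logical CPUs (bitmask 0b11).
        // This is an OS-level restriction: the scheduler will only run this
        // process's threads on those cores, regardless of any CPU quota.
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)0b11;

        // Note: Environment.ProcessorCount is typically computed once at startup,
        // so changing affinity at runtime is not reflected in this number.
        Console.WriteLine($"Environment.ProcessorCount = {Environment.ProcessorCount}");
    }
}
```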
So you are saying that we should completely revert the change we made in the past that added CPU count clipping by the quota?
Yes. We should undo CPU count clipping - always use the count of how many threads can be running at the same time. For example, reducing the number of heaps, or even switching to workstation GC, when the OS may schedule threads on a lot more cores may lead to unnecessary contention. It kind of makes sense to do that when you really can only be OS-scheduled on a reduced number of cores. Otherwise, underestimating cores may get us into suboptimal behaviors.
We should also consider exposing the CPU quota as a public API. But that is not high-pri. It is only useful when there is a way to check the current CPU load, and we do not have an API for that either. Internally we need access to the quota in a few scenarios where we compare the current CPU load against the maximum possible - for example in thread pool starvation detection. However, I think all such cases are in coreclr, so that alone is not a reason for a public API, other than "if we need it, perhaps someone else needs it too" - but that, again, equally holds for a "current CPU load" API, and we do not have that.
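No such API exists; purely to illustrate the pairing of a quota API with a "current CPU load" API that the comment describes, a hypothetical shape might look like this (both members are invented for illustration):

```csharp
using System;

// Hypothetical sketch only -- neither member exists in the BCL today.
public static class CpuResources
{
    // e.g. 0.5 when the container is limited to half a core's worth of time,
    // or null when the process is not quota-limited.
    public static double? CpuQuota => throw new NotImplementedException();

    // Recent process CPU usage, expressed against the maximum the quota allows,
    // so callers (like starvation detection) can compare load against capacity.
    public static double CurrentCpuLoad => throw new NotImplementedException();
}
```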
For worker threads, this is already configurable through …
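The name of the setting is cut off above; assuming thread pool worker-thread configuration is the kind of knob meant, a minimal sketch of the public API for it:

```csharp
using System;
using System.Threading;

class ThreadPoolLimits
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIocp);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIocp);
        Console.WriteLine($"min {minWorker}/{minIocp}, max {maxWorker}/{maxIocp}");

        // Raise the floor so bursts do not wait on gradual thread injection.
        ThreadPool.SetMinThreads(16, minIocp);

        // Lower the ceiling if desired; SetMaxThreads returns false (and changes
        // nothing) when asked for less than the current processor count.
        bool accepted = ThreadPool.SetMaxThreads(Math.Max(Environment.ProcessorCount, 32), maxIocp);
        Console.WriteLine($"SetMaxThreads accepted: {accepted}");
    }
}
```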
Please be aware that …
@mrmartan - from what I read in https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/ , limiting CPU count is possible, although it is not considered common. You would need to have …
I already knew about that feature. I did not want to bring in here a beta feature that seems to deal more with CPU exclusivity. In such a case one can assume you have the entire CPU for yourself; quotas could be considered meaningless, and I assume .NET would work well even now, wouldn't it? It also seemed to me, and I might be mistaken, that the feature is targeted more at single-threaded designs (like NodeJS or Redis) and NUMA use cases. I guess saying that "Kubernetes does not allow for setting that on purpose" was misleading. Rather, in most .NET use cases you would want the default behavior of sharing CPUs, especially in bursty workloads, and to let the orchestrator and OS do their thing.
+1. The CPU clipping change was justified by ASP.NET micro-benchmark improvements in constrained containers, but it is not architecturally sound. We should find a better way to get the improvements we got from the original change.
FWIW, we have a few APIs for that.
I stand corrected :-) The issue at hand may be of higher importance though. Perhaps even 3.0 or 3.1.
+1, this would be a useful thing to fix. Putting everything in bucket number 1 in this issue is causing problems for our scenario as well.
Would this affect heaps/memory consumption in any way? In my case, not having a CPU quota caused much higher memory consumption. Details here.
You should still be able to get those benefits by adjusting the …
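The knob being referred to is cut off; assuming GC settings such as the heap count are the kind of thing meant, a small sketch that shows where they surface at runtime (the values themselves are set at startup via runtimeconfig.json or COMPlus_ environment variables):

```csharp
using System;
using System.Runtime;

class GcKnobs
{
    static void Main()
    {
        // Configured at startup, e.g. "System.GC.Server" / "System.GC.HeapCount"
        // in runtimeconfig.json, or COMPlus_gcServer / COMPlus_GCHeapCount in the
        // environment; this program only observes the effective result.
        Console.WriteLine($"Server GC:      {GCSettings.IsServerGC}");
        Console.WriteLine($"Latency mode:   {GCSettings.LatencyMode}");
        Console.WriteLine($"ProcessorCount: {Environment.ProcessorCount}");
    }
}
```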
Not only. We experimented for a while with quotas for latency-sensitive applications, with really bad results. Low-latency applications try to process most requests with a low response time and, if overloaded, let a few of them time out. If you're throttled, then for the remainder of that time share all the requests are going to time out, which negatively impacts the response time distribution. Having dedicated cores instead of a quota makes the application's behavior more predictable.
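One way to observe that throttling from inside the container is to read the CFS statistics the kernel exposes. A minimal sketch, assuming a cgroup v1 layout (cgroup v2 exposes similar counters in a cpu.stat file under the unified hierarchy):

```csharp
using System;
using System.IO;

class ThrottleStats
{
    static void Main()
    {
        // cgroup v1 CFS statistics: nr_throttled counts enforcement periods in
        // which the container exhausted its quota and was descheduled; every
        // such period tends to show up as a latency spike.
        const string statPath = "/sys/fs/cgroup/cpu/cpu.stat";
        if (!File.Exists(statPath))
        {
            Console.WriteLine("cpu.stat not found (not cgroup v1, or not in a container).");
            return;
        }

        foreach (string line in File.ReadAllLines(statPath))
            Console.WriteLine(line); // nr_periods, nr_throttled, throttled_time (ns)
    }
}
```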
Yeah. Aren't we losing the "works great out of the box" though? I don't think …
In a container scenario, depending on orchestration used, it is generally possible for a user to limit CPU quota and level of parallelism.
One does not imply the other. That is why there are distinct knobs.
Example: on a machine with 4 cores, it is conceivable to have 8 containers each limited to 1/8 of total CPU time for fairness/over-committing, but
#1 also restricted to run on a reduced number of cores,
OR
#2 still free to run on any of the 4 cores whenever their share of time comes.

It looks like coreclr may see no distinction between these cases and treat them both as #1.
There is a common pattern where once the quota falls below the user-set core count, we reduce the available core count to match the quota.

The user could have asked for reduced parallelism, but did not. Even if the user might be doing this by accident, perhaps that should be respected? Otherwise, if there is a legit need for #2, it cannot be done.
https://github.com/dotnet/coreclr/blob/1d78ed88ff9078febdbb5c37173b313da5c023a3/src/utilcode/util.cpp#L1294
https://github.com/dotnet/coreclr/blob/31acad70cae93caf0afe753775f14ebe80db31ed/src/classlibnative/bcltype/system.cpp#L358
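For reference, the clipping the links above point at effectively turns the CFS quota into a processor count. A rough sketch of that computation from managed code, assuming cgroup v1 file locations (cgroup v2 puts both values in a single cpu.max file):

```csharp
using System;
using System.IO;

class QuotaVsCores
{
    static void Main()
    {
        // cgroup v1 CFS quota: quota_us / period_us is roughly "how many CPUs'
        // worth of time per period" the container may consume; -1 means unlimited.
        long quota  = ReadLong("/sys/fs/cgroup/cpu/cpu.cfs_quota_us");
        long period = ReadLong("/sys/fs/cgroup/cpu/cpu.cfs_period_us");

        double? quotaCpus = (quota > 0 && period > 0) ? (double)quota / period : (double?)null;

        Console.WriteLine($"Environment.ProcessorCount = {Environment.ProcessorCount}");
        Console.WriteLine($"Quota-implied CPUs         = {quotaCpus?.ToString("0.##") ?? "unlimited"}");

        // The complaint in this issue: when quotaCpus is smaller than the number
        // of cores, coreclr reports the smaller number even though the OS may
        // still schedule the container's threads on every core.
    }

    static long ReadLong(string path) =>
        File.Exists(path) && long.TryParse(File.ReadAllText(path).Trim(), out long v) ? v : -1;
}
```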