Incorrect memory utilization under cgroup2 on Amazon Linux 2022 #3323
Comments
Hi, thanks for reporting this! Looking into it.
I have instrumented the ecs-agent with logging to confirm that counting "inactive_file" instead of "cache" does in fact report a more accurate memory usage to CloudWatch, or at least one that more closely reflects what the Docker CLI reports. See the memory usage from the Docker CLI and the "FIXED" memory usage below, which uses inactive_file instead of cache. On AL2022:
FWIW we also seem to have the same issue on AL2 (with respect to Docker CLI memory usage accuracy). @ltm can you confirm whether you only see this issue on AL2022? Does your application maybe create some "cache" memory that offsets the difference on AL2? On AL2:
"cache" memory stat no longer exists in cgroupv2. docker cli subtracts "inactive_file" for the overall mem usage calculation, so do the same for cgroupv2. closes aws#3323
Summary
The memory utilization metric reported to CloudWatch doesn't match the metric reported by the Docker CLI when running ECS containers on Amazon Linux 2022. This is a regression of #280.
Description
Amazon Linux 2022 now uses cgroup2, and as such the memory stats reported by Docker have changed from statsV1 to statsV2. Notably, the statsV2 memory stats no longer include the cache property.
Since #582 the memory utilization reported to CloudWatch has been calculated as
(memory_stats.usage - memory_stats.stats.cache) / memory_stats.limit
which matched the Docker CLI at the time. However, with the cache property missing from the statsV2 memory stats, this calculation is no longer accurate.
The Docker CLI currently calculates the memory utilization as
(memory_stats.usage - memory_stats.stats.total_inactive_file) / memory_stats.limit
under cgroup1 and as
(memory_stats.usage - memory_stats.stats.inactive_file) / memory_stats.limit
under cgroup2.
Expected Behavior
The CloudWatch memory utilization metric should match the metric reported by the Docker CLI.
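For illustration, the calculation described above could be sketched roughly as follows. This is a minimal Python sketch, not the ecs-agent's or the Docker CLI's actual code: the function name memory_utilization is hypothetical, the input is assumed to be the memory_stats object from the Docker stats JSON, and detecting cgroup1 by the presence of total_inactive_file is an assumption made for this example.

```python
from typing import Any, Mapping


def memory_utilization(memory_stats: Mapping[str, Any]) -> float:
    """Return memory utilization in percent, excluding inactive file cache.

    Hypothetical helper: `memory_stats` is assumed to have the shape of the
    "memory_stats" object in the Docker stats JSON.
    """
    stats = memory_stats.get("stats", {})
    usage = memory_stats["usage"]
    limit = memory_stats["limit"]

    if "total_inactive_file" in stats:
        # Assumption: this key only appears under cgroup1 (statsV1).
        inactive_file = stats["total_inactive_file"]
    else:
        # cgroup2 (statsV2): no "cache" key, so use "inactive_file" if present.
        inactive_file = stats.get("inactive_file", 0)

    if limit == 0:
        # Avoid division by zero when no limit is reported.
        return 0.0

    # Only subtract the inactive file cache when it is smaller than usage,
    # so the result never goes negative.
    used = usage - inactive_file if inactive_file < usage else usage
    return 100.0 * used / limit


# Made-up values, for illustration only (not taken from this issue):
v2_stats = {"usage": 600, "limit": 1000, "stats": {"inactive_file": 200}}
print(memory_utilization(v2_stats))
```

With a calculation of this shape, a statsV2 payload that lacks the cache key still has its reclaimable file cache excluded, which is the behavior the expected result relies on.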
Observed Behavior
Container with 20.97% memory utilization according to the Docker CLI is reported as 54.30% memory utilization according to CloudWatch.
Environment Details
Supporting Log Snippets
Docker stats JSON: