Create NodeMemoryMajorPagesFaults.md #66

Open
wants to merge 4 commits into
base: main
36 changes: 36 additions & 0 deletions content/runbooks/node/NodeMemoryMajorPagesFaults.md
@@ -0,0 +1,36 @@
---
title: NodeMemoryMajorPagesFaults
weight: 20
---

# NodeMemoryMajorPagesFaults

## Meaning

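The `NodeMemoryMajorPagesFaults` alert fires when a Kubernetes node sustains a high rate of major page faults. A major page fault occurs when a process accesses a page that is not resident in RAM, so the kernel has to read it from disk, either from swap or from a file-backed mapping. A sustained high rate usually means the node is under memory pressure: pages are being swapped out or evicted from the page cache and then read back in.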

The alert description in the [node-exporter mixin](https://monitoring.mixins.dev/node-exporter/) reads:

> Memory major pages are occurring at very high rate at {{ $labels.instance }}, 500 major page faults per second for the last 15 minutes, is currently at {{ printf "%.2f" $value }}.
>
> Please check that there is enough memory available at this instance.
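
To confirm the rate directly on a node, the following is a minimal sketch in Python. It assumes the alert is driven by the cumulative `pgmajfault` counter in `/proc/vmstat`; the function names and the 5-second sampling interval are illustrative choices, and the threshold of 500 is taken from the alert text above.

```python
import time

# Major faults per second; value taken from the alert description.
ALERT_THRESHOLD = 500.0


def read_pgmajfault(vmstat_text: str) -> int:
    """Return the cumulative major page fault counter from /proc/vmstat content."""
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "pgmajfault":
            return int(parts[1])
    raise ValueError("pgmajfault not found")


def major_fault_rate(before: int, after: int, interval_s: float) -> float:
    """Average major faults per second between two counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (after - before) / interval_s


def sample_rate(interval_s: float = 5.0, path: str = "/proc/vmstat") -> float:
    """Read the counter twice on the local node and return the rate in between."""
    with open(path) as f:
        before = read_pgmajfault(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = read_pgmajfault(f.read())
    return major_fault_rate(before, after, interval_s)
```

Calling `sample_rate()` on the affected node and comparing the result with `ALERT_THRESHOLD` gives a quick check of whether the condition still holds. A short sample is noisier than the window the alert evaluates, so treat it as a spot check only.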

## Impact

- Possible performance degradation for applications running on the affected Kubernetes node.
- Increased latency for memory accesses, because each faulted page has to be read from disk.
- Risk of application crashes or errors if the node runs out of memory, for example through OOM kills.

## Diagnosis

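1. Check physical memory (RAM) and swap utilization on the affected node, for example with `free -m` or `vmstat 1` (the `si` and `so` columns show swap-in and swap-out activity).
2. Identify which processes or pods consume the most memory, for example with `top` on the node or, if metrics-server is installed, `kubectl top pod --all-namespaces --sort-by=memory`.
3. Review memory usage over time in your monitoring system to identify trends and peak loads that coincide with the alert.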
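
The first two steps can also be sketched in Python by reading `/proc` on the node. This is an illustrative sketch, not part of any existing tooling: the function names are hypothetical, and it relies on the `/proc/meminfo` fields `SwapTotal` and `SwapFree` and on `majflt` being field 12 of `/proc/<pid>/stat`, as described in the Linux `proc(5)` man page.

```python
import os


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo content into {field: int}.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue
        info[name.strip()] = int(rest.split()[0])
    return info


def swap_used_kb(info: dict) -> int:
    """Swap currently in use, in kB."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


def parse_majflt(stat_line: str):
    """Return (pid, comm, majflt) from the content of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()
    # tail starts at field 3 (state); majflt is field 12, hence index 9.
    return int(pid_str), comm, int(fields[9])


def top_major_faulters(proc_root: str = "/proc", limit: int = 10):
    """List processes with the highest cumulative major fault counts."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                rows.append(parse_majflt(f.read()))
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line could not be parsed
    return sorted(rows, key=lambda r: r[2], reverse=True)[:limit]
```

Note that `majflt` is cumulative since process start, so long-lived processes rank high even when they are idle. Calling `top_major_faulters()` twice and comparing the counts shows which processes are faulting right now.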


## Mitigation

1. Reduce memory consumption on the node by stopping unnecessary processes or moving workloads to other nodes.
2. Review Kubernetes memory requests and limits so that they match what the applications actually use. Requests that are set too low let the scheduler place more pods on the node than its memory can hold.
3. Add memory to the node, or add nodes to the cluster so that workloads can be spread out.
4. Review the swap configuration. If swap is enabled on the node, check whether heavy swap activity is the source of the faults and adjust or disable it accordingly.
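
As an illustration of step 2, the sketch below flags containers whose observed peak memory usage, plus some headroom, exceeds their memory request. The input format, the field names (`name`, `request`, `peak_usage`), and the 20% headroom are assumptions made for this example; the usage figures would have to come from your own monitoring data. The quantity parser deliberately handles only plain integers and the binary suffixes `Ki`, `Mi`, `Gi`, and `Ti`.

```python
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_memory_quantity(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' to bytes.

    Only plain integers and the suffixes Ki/Mi/Gi/Ti are supported;
    anything else raises ValueError.
    """
    quantity = quantity.strip()
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def underprovisioned(containers, headroom: float = 1.2):
    """Return names of containers whose peak usage * headroom exceeds the request."""
    flagged = []
    for container in containers:
        request = parse_memory_quantity(container["request"])
        peak = parse_memory_quantity(container["peak_usage"])
        if peak * headroom > request:
            flagged.append(container["name"])
    return flagged


# Hypothetical sample data, for illustration only.
SAMPLE = [
    {"name": "api", "request": "256Mi", "peak_usage": "400Mi"},
    {"name": "worker", "request": "1Gi", "peak_usage": "300Mi"},
]
```

With the sample data, `underprovisioned(SAMPLE)` flags only `api`, whose request is well below its observed peak. Containers flagged this way are candidates for a higher memory request.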