I am having a similar issue using Redis ElastiCache instances.
The difference is that the check does work, but periodically it shows "UNK Unable to get ElastiCache details and statistics", which results in Nagios alerting.
It seems to happen randomly, about twice a week, on a replica node. When this happens I run the command
and it bounces between returning a result and showing unknown for a couple of minutes, then returns to normal. I'm not sure whether the issue is with the AWS API or whether the start and end time used for getting the metric is the cause.
I noticed that the cpu metric takes into account a delay for the metric updating on CloudWatch, and I was wondering whether adding this for the memory metric would fix the issue.
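To make the delay idea concrete, here is a minimal sketch of a query window that ends a few minutes in the past, so that datapoints CloudWatch has not published yet are not requested. The function name metric_window and the parameters delay_minutes and period_minutes are hypothetical and not taken from check_elasticache.py; the 5-minute defaults are an assumption as well.

```python
from datetime import datetime, timedelta, timezone


def metric_window(now=None, delay_minutes=5, period_minutes=5):
    """Return (start, end) for a metric query, shifted back by a delay.

    Hypothetical helper: the delay and period values are assumptions,
    not the values used by the plugin for the cpu metric.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # End the window in the past so the newest, possibly missing,
    # datapoint is not part of the request.
    end = now - timedelta(minutes=delay_minutes)
    start = end - timedelta(minutes=period_minutes)
    return start, end


# Example with a fixed clock so the result is reproducible.
fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
start, end = metric_window(now=fixed_now)
```

The returned pair could then be passed as the start and end time of the memory metric request, in the same way the plugin already does for cpu.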
When executing the following, the script returns "UNK Unable...":
check_elasticache.py --region us-east-1 -i cluster1 -m memory -w 10 -c 5
I traced this to an issue with the metrics dict (used by the function get_cluster_stats):
metrics = {'status': 'ElastiCache availability',
           'cpu': 'CPUUtilization',
           'memory': 'BytesUsedForCache',  # <==== Problem
           'swap': 'SwapUsage'}
I wasn't able to find a metric named "BytesUsedForCache" in the list of available CloudWatch metrics.
I did find BytesUsedForCacheItems and FreeableMemory, however.
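One possible way to handle this is to pick the memory metric per cache engine. The sketch below assumes that BytesUsedForCache is published for Redis nodes and BytesUsedForCacheItems for Memcached nodes, which is my reading of the CloudWatch metric lists and should be checked against your own cluster. The names MEMORY_METRIC_BY_ENGINE and memory_metric are hypothetical and not part of check_elasticache.py.

```python
# Assumed mapping from cache engine to its memory usage metric.
MEMORY_METRIC_BY_ENGINE = {
    'redis': 'BytesUsedForCache',
    'memcached': 'BytesUsedForCacheItems',
}


def memory_metric(engine, default='FreeableMemory'):
    """Return the metric name to query for the given engine.

    Falls back to the host-level FreeableMemory metric when the
    engine is not in the mapping.
    """
    return MEMORY_METRIC_BY_ENGINE.get(engine.lower(), default)


# Build the metrics dict for a Memcached cluster.
metrics = {'status': 'ElastiCache availability',
           'cpu': 'CPUUtilization',
           'memory': memory_metric('memcached'),
           'swap': 'SwapUsage'}
```

Note that FreeableMemory measures free memory rather than used memory, so warning and critical thresholds would need to be interpreted the other way around if the fallback is used.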