
Log something about OOMKilled containers #69676

Open
sylr opened this issue Oct 11, 2018 · 82 comments

Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@sylr
Contributor

sylr commented Oct 11, 2018

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:

Container gets killed because it tries to use more memory than allowed.

What you expected to happen:

Have an OOMKilled event tied to the pod and logs about this

/sig node

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 11, 2018
@sylr
Contributor Author

sylr commented Oct 11, 2018

@dims
Member

dims commented Nov 14, 2018

Related discussion in the sig-scheduling Slack channel: https://kubernetes.slack.com/messages/C09TP78DV/

@sylr
Contributor Author

sylr commented Nov 14, 2018

https://kubernetes.slack.com/archives/C09TP78DV/p1539255149000100, to be exact, but it might soon fall outside Slack's free-tier message history.

@nikopen
Contributor

nikopen commented Nov 15, 2018

FYI, there are related Prometheus metrics that can capture OOMKilled:
kube_pod_container_status_terminated_reason and kube_pod_container_status_last_terminated_reason

https://github.com/kubernetes/kube-state-metrics/blob/master/Documentation/pod-metrics.md

reason=<OOMKilled|Error|Completed|ContainerCannotRun>

It might be a bit tricky to implement as an alert: an OOMKilled termination can be so brief that the status is never scraped, which is why kube_pod_container_status_last_terminated_reason was created to expose the previous termination reason.
kubernetes/kube-state-metrics#344

Still, it would be nice if Kubernetes printed a log line by default.

@nikopen
Contributor

nikopen commented Nov 15, 2018

kube_pod_container_status_last_terminated_reason is not yet released in any version; I'll ping to see if they can cut a release.

@anderson4u2

@nikopen Hi, I just tried kube_pod_container_status_last_terminated_reason in version 1.4.0 and it seems to work? i.e. use sum_over_time(kube_pod_container_status_terminated_reason{reason!="Completed"}[5m]) > 0 to detect a recent non-graceful termination.

@nikopen
Contributor

nikopen commented Nov 19, 2018

Indeed, it seems to work :)

@brancz do you know why this happens? also tried it in 1.3.1.

    - alert: OOMKilled
      expr: sum_over_time(kube_pod_container_status_terminated_reason{reason="OOMKilled"}[5m]) > 0
      for: 1m
      labels:
        severity: warning
      annotations:
        description:  Pod {{$labels.pod}} in {{$labels.namespace}} got OOMKilled

@brancz
Member

brancz commented Nov 20, 2018

Because the kube_pod_container_status_terminated_reason metric was only introduced in 1.4: https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.4.0-rc.0

@nikopen
Contributor

nikopen commented Nov 20, 2018

@brancz my test is on 1.3.1 and kube_pod_container_status_terminated_reason works; was it already there, perhaps?

@brancz
Member

brancz commented Nov 20, 2018

Strange, you're right, looks like we have a mistake in the changelog. The code seems to be there in 1.3.1: https://github.com/kubernetes/kube-state-metrics/blob/v1.3.1/collectors/pod.go#L125

@boosty

boosty commented Nov 22, 2018

@anderson4u2 I am a bit confused by your last comment. You wrote:

just tried kube_pod_container_status_last_terminated_reason in version 1.4.0

But in the example below you use kube_pod_container_status_terminated_reason, not kube_pod_container_status_last_terminated_reason.

So as far as I see, the new (very useful) metric kube_pod_container_status_last_terminated_reason is still unreleased.

@BrianChristie

BrianChristie commented Nov 28, 2018

This has been discussed in #sig-instrumentation on Slack and was brought up on the sig-node call yesterday to determine a path forward.

There are two requests:

  1. Have an OOMKilled event tied to the Pod (as noted by @sylr)
  2. Have a count of termination reason by Pod in the Kubelet (or cAdvisor?), exposed to Prometheus as a monotonically increasing counter

To summarize what's currently available in kube-state-metrics:

  • kube_pod_container_status_terminated_reason
    This is a (binary) gauge which has a value of 1 for the current reason, and 0 for all other reasons. As soon as the Pod restarts, all reasons go to 0.

  • kube_pod_container_status_last_terminated_reason
    Same as above for the prior reason, so it's available after the Pod restarts.

  • kube_pod_container_status_restarts_total
    A count of the restarts, with no detail on the reason.

The issues are:

  1. There is no way to get a count of the reasons over time (for alerting and debugging).
  2. Some termination reasons will never be recorded by Prometheus when the reason changes before the next Prometheus scrape.

For example, given a Pod that is sometimes being OOMKilled, and sometimes crashing, it's desired to be able to view the historical termination reasons over time.

As a note: it was discussed and it appears the design of kube-state-metrics prevents aggregating the reason gauge into counters, and it's preferred if this happens at the source.

Implementing both of the above requests will significantly improve the ability of cluster-users and monitoring vendors to debug when Pods are failing.
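
To make request 2 concrete, what is being asked for at the source is roughly a counter like the following. This is an illustrative sketch using client_golang; the metric name and labels are hypothetical, not an existing kubelet/cAdvisor metric:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical counter of container terminations by reason. A real
// implementation would live in the kubelet or cAdvisor and be incremented
// whenever a termination (OOMKilled, Error, Completed, ...) is observed.
var containerTerminations = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "container_terminations_total",
        Help: "Count of container terminations by pod, container and reason.",
    },
    []string{"namespace", "pod", "container", "reason"},
)

func main() {
    prometheus.MustRegister(containerTerminations)

    // Example increment at the point where a termination is observed:
    containerTerminations.WithLabelValues("default", "my-pod", "app", "OOMKilled").Inc()

    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9090", nil))
}

Such a counter keeps increasing across restarts, so rate() or increase() over it would answer "how many OOM kills over time" directly, without depending on scrape timing.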

Can @kubernetes/sig-node-feature-requests provide some guidance on the next steps here?

CC: @dchen1107

@dims
Member

dims commented Nov 28, 2018

long-term-issue (note to self)

@brancz
Member

brancz commented Nov 28, 2018

cc @kubernetes/sig-node-feature-requests

@WanLinghao
Contributor

/cc

@hypnoglow
Contributor

This query combines container restart and termination reason:

sum by (pod, container, reason) (kube_pod_container_status_last_terminated_reason{})
* on (pod,container) group_left
sum by (pod, container) (changes(kube_pod_container_status_restarts_total{}[1m]))

@boosty

boosty commented Jan 19, 2019

Our team came up with a custom controller to implement the idea of having an OOMKilled event tied to the Pod. Please find it here: https://github.com/xing/kubernetes-oom-event-generator

From the README:
The Controller listens to the Kubernetes API for "Container Started" events and searches for those claiming they were OOMKilled previously. For matching ones an Event is generated as Warning with the reason PreviousContainerWasOOMKilled.

We would be very happy to get feedback on it.
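
For anyone who wants the gist without reading the repo: the core check is "did this container's last termination say OOMKilled?". A minimal client-go sketch of just that check (an illustration of the idea, not the actual controller code, which watches events rather than listing pods):

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Assumes a kubeconfig at the default location; in-cluster config works too.
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client := kubernetes.NewForConfigOrDie(cfg)

    pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    for _, pod := range pods.Items {
        for _, cs := range pod.Status.ContainerStatuses {
            // LastTerminationState is the same field kubectl shows as lastState;
            // it only survives a single restart.
            if t := cs.LastTerminationState.Terminated; t != nil && t.Reason == "OOMKilled" {
                fmt.Printf("%s/%s container %s was OOMKilled at %s (exit code %d)\n",
                    pod.Namespace, pod.Name, cs.Name, t.FinishedAt.Time, t.ExitCode)
            }
        }
    }
}

The controller linked above goes further and turns this into a Kubernetes Event, so it shows up in kubectl describe.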

@bjhaid
Contributor

bjhaid commented Jan 22, 2019

Meta point: there have been several times when OOM events showed up in dmesg but cAdvisor missed them, so they never ended up as Kubernetes events...

@brancz
Member

brancz commented Jan 23, 2019

@bjhaid fwiw you can use mtail against dmesg to produce metrics about oomkill messages.

@bjhaid
Contributor

bjhaid commented Jan 23, 2019

@brancz thanks! I just wanted to point out that cadvisor misses OOM events and that should be considered when relying on it...

@brancz
Member

brancz commented Jan 23, 2019

I see. If it hasn't been already, that should be reported to cAdvisor :)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 12, 2022
@panicbit

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 12, 2022
@dgrisonnet
Member

Is this still relevant after #108004? It seems to me that it covers the gaps kube-state-metrics has with OOMKilled events.

@lukeschlather

lukeschlather commented Feb 5, 2023

The problem here is that a pod can disappear with no record of why. A metric is useful in that it lets you know something is wrong, but it doesn't actually tell you what is wrong. Kubernetes shouldn't be killing pods without leaving an obvious record of which pod it killed and why.

@dims
Member

dims commented Feb 5, 2023

@lukeschlather for the record, the kernel kills pods, not k8s. that's the whole problem with this issue :(

please google for ( "oom kill kernel" )

@lukeschlather

Are the memory requests and limits just cgroups under the hood?

@dims
Member

dims commented Feb 6, 2023

https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-limits-are-run

@lukeschlather ⬆️
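
To make the cgroup connection concrete: the container's memory limit ends up as a cgroup memory limit that a process inside the container can read directly. A minimal sketch; the paths below are the usual defaults for cgroup v2 and v1 but can vary by runtime:

package main

import (
    "fmt"
    "os"
    "strings"
)

// Run inside a container: prints the memory limit the kernel will enforce,
// which is what the OOM killer acts on when the container exceeds it.
func main() {
    // cgroup v2
    if b, err := os.ReadFile("/sys/fs/cgroup/memory.max"); err == nil {
        fmt.Println("cgroup v2 memory.max:", strings.TrimSpace(string(b)))
        return
    }
    // cgroup v1
    if b, err := os.ReadFile("/sys/fs/cgroup/memory/memory.limit_in_bytes"); err == nil {
        fmt.Println("cgroup v1 memory.limit_in_bytes:", strings.TrimSpace(string(b)))
        return
    }
    fmt.Println("could not read a cgroup memory limit")
}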

@michelefa1988

Following, same issue here

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2023
@pierluigilenoci

/remove-lifecycle stale

@lukeschlather

In GKE, often the only evidence I can find that there's been a problem is a log line like

Memory cgroup out of memory: Killed process 1472246 (node) total-vm:22290752kB, anon-rss:1000160kB, file-rss:47604kB, shmem-rss:0kB, UID:0 pgtables:17840kB oom_score_adj:985

Looking at the Kubernetes side I can see that the pod was restarted, and I'm sure it was restarted because the process was OOMKilled. But even in the metrics, I think it was killed before any metric could be recorded showing it was using a concerning amount of memory. Can that "Memory cgroup out of memory" message be modified to emit a log that gives the actual pod name? Or does k8s need to look for "Memory cgroup out of memory" events? Or is there some other piece of data I'm missing that makes it clear why that specific container was restarted?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2024
@pierluigilenoci

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2024
@dgrisonnet
Member

I briefly looked through cAdvisor's code responsible for reporting the metric; it reads /dev/kmsg directly to surface the OOM kill events performed by the kernel OOM killer: https://github.com/google/cadvisor/blob/master/utils/oomparser/oomparser.go

As far as I can tell it should catch all OOM kill events, so maybe you are hitting a corner case? The parser relies on a couple of regexes that only work on Linux 5.0+ (https://github.com/google/cadvisor/blob/master/utils/oomparser/oomparser.go#L30-L33), so it would be worth checking your kernel logs and verifying if and how the log message was produced there.
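
To illustrate the general approach (this is not cAdvisor's actual parser): the kernel logs a "Killed process <pid> (<name>) ..." line for each OOM kill, and on recent kernels the surrounding oom-kill report also includes the victim's cgroup path, which is what allows the kill to be attributed to a specific container. A rough sketch that scans kernel log lines piped in on stdin (e.g. dmesg -w); the regex is a simplified stand-in:

package main

import (
    "bufio"
    "fmt"
    "os"
    "regexp"
)

// Simplified stand-in for an OOM-kill log parser; a real watcher (like
// cAdvisor's oomparser) reads /dev/kmsg and also extracts the cgroup path.
var killedRe = regexp.MustCompile(`Killed process (\d+) \(([^)]+)\)`)

func main() {
    scanner := bufio.NewScanner(os.Stdin)
    for scanner.Scan() {
        if m := killedRe.FindStringSubmatch(scanner.Text()); m != nil {
            fmt.Printf("OOM kill detected: pid=%s comm=%s\n", m[1], m[2])
        }
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "read error:", err)
    }
}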

@dgrisonnet
Member

dgrisonnet commented Apr 19, 2024

A metric is useful in that it lets you know something is wrong but it doesn't actually tell you what is wrong.

A log wouldn't point you to the culprit either; the OOM killer only knows about processes' OOM scores and makes its decisions based on those when memory is overloaded.

If you want to know why the memory usage increased to the point that some processes had to be killed, there are other metrics you can look at. For example the container-level memory usage metrics can tell you which pods on a particular node had a spike in memory utilization.

The goal of the OOM killed metric is to tell you exactly which container got OOM killed in case you want to know what happened to your containers, nothing more. But there are other metrics that are meant to answer your other questions.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 18, 2024
@frittentheke

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 18, 2024
@dgrisonnet
Member

@frittentheke are you seeing any gaps in how we report OOMKill that should still be covered by Kubernetes?

Based on my previous comments, I would be inclined to close this issue as fixed by #108004.

@frittentheke

frittentheke commented Jul 25, 2024

@frittentheke are you seeing any gaps in how we report OOMKill that should still be covered by Kubernetes?
Based on my previous comments, I would be inclined to close this issue as fixed by #108004.

Thanks a bunch for asking @dgrisonnet!

I believe the OP (@sylr) was originally asking for an event to be "logged", or rather emitted, by the Kubelet:

What you expected to happen:
Have an OOMKilled event tied to the pod and logs about this

I somehow believe this is not an entirely unreasonable idea. Going OOM is quite an event for a container ;-)

But the request/idea was likely motivated by the reasons behind kubernetes/kube-state-metrics#2153 and the ephemeral nature of the last terminated reason (which is the source for kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}).

Yes, I believe #108004 does indeed fix the issue of losing OOM events in the count.

But coming back to the idea of (also) having an event "logged", I believe it's still a good idea NOT to rely only on metrics and counters that need to be scraped, but to also expose this via the API somehow so it is available for debugging via kubectl. This could be an event, as suggested, or some status field containing the timestamp of the last restart for each reason, maybe lastRestart_{REASON}.

Currently the Pod resource only contains the current state and lastState. Any information about an earlier incarnation experiencing an OOM event is lost and not available directly via the API.

  containerStatuses:
  - containerID: containerd://e667b1666416ab5a463bd0deb6cfcee36122fe2ff5ea0552cae0ac5f15338941
    image: docker.io/polinux/stress:latest
    imageID: docker.io/polinux/stress@sha256:b6144f84f9c15dac80deb48d3a646b55c7043ab1d83ea0a697c09097aaad21aa
    lastState:
      terminated:
        containerID: containerd://7fb5dd4e73d6412472f9f1db5f18a3bec289a50b675922c63fcd68b1e2959e16
        exitCode: 137
        finishedAt: "2024-07-25T16:42:47Z"
        reason: OOMKilled
        startedAt: "2024-07-25T16:42:47Z"
    name: memory-demo-ctr
    ready: false
    restartCount: 2
    started: false
    state:
      terminated:
        containerID: containerd://e667b1666416ab5a463bd0deb6cfcee36122fe2ff5ea0552cae0ac5f15338941
        exitCode: 137
        finishedAt: "2024-07-25T16:43:02Z"
        reason: OOMKilled
        startedAt: "2024-07-25T16:43:02Z"

@kevinnoel-be

Or maybe it doesn't really (I didn't actually check, I'm only following the links): google/cadvisor#3015
