Understanding the Linux OOM Killer and preventing a particular process from being killed when out of memory

What is the Out of Memory Killer?

Major distribution kernels set the default value of /proc/sys/vm/overcommit_memory to zero, which means that processes can request more memory than is currently free in the system. This relies on the heuristic that allocated memory is not used immediately and that processes, over their lifetime, do not use all of the memory they allocate. Without overcommit, a system would not fully utilize its memory, wasting some of it. Overcommitting lets the system use memory more efficiently, but at the risk of out-of-memory (OOM) situations.

Memory-hogging programs can deplete the system’s memory and bring the whole system to a grinding halt. Memory can become so low that not even a single page can be allocated: not to a user process, which would let the administrator kill an appropriate task, and not to the kernel, which needs it for important operations such as freeing memory. In such a situation the OOM killer kicks in and picks a process to be the sacrificial lamb for the benefit of the rest of the system.
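To see which overcommit policy a machine is actually using, the setting can be read and decoded. The following is a minimal sketch, not an official tool: the function name describe_overcommit is made up for illustration, and the three descriptions are our short paraphrase of the kernel's documented values (0, 1 and 2) for this setting.

```shell
#!/bin/sh
# Translate a vm.overcommit_memory value into a short description.
# describe_overcommit is a hypothetical helper name, not a standard command.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit past the commit limit" ;;
        *) echo "unknown mode: $1" ;;
    esac
}

# Report the live setting if the file is readable (Linux only).
if [ -r /proc/sys/vm/overcommit_memory ]; then
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "overcommit_memory=$mode: $(describe_overcommit "$mode")"
fi
```

On a system with the default setting described above, this should report mode 0.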

So the OOM killer (Out of Memory killer) is a piece of Linux kernel functionality (see mm/oom_kill.c in the kernel source) that is invoked only when the system is running out of memory.

How to control which processes avoid getting killed?

Users and system administrators have often asked for ways to control the behavior of the OOM killer. To facilitate control, the /proc/<pid>/oom_adj knob was introduced to save important processes in the system from being killed and to define the order in which processes are killed. The possible values of oom_adj range from -17 to +15. The higher the value, the more likely the associated process is to be killed by the OOM killer. If oom_adj is set to -17, the process is not considered for OOM killing. Note that kernels from 2.6.36 onward deprecate oom_adj in favor of /proc/<pid>/oom_score_adj, which ranges from -1000 to +1000; this post sticks to oom_adj.

Let's create a simple process:

 $ vim main.c
[c]
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Spin forever so the process stays alive for inspection. */
    while (1)
        ;
}
[/c]
 $ gcc -o simple_process main.c
 $ ./simple_process & 

This starts the process in the background of the current terminal. Now let's look up its process ID (16350 in our run), check its current oom_adj value, set it to -17, and read it back:

 $ pgrep simple_process
 $ cd /proc/16350 
 $ cat oom_adj 

 $ echo -17 | sudo tee oom_adj

 $ cat oom_adj 
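The steps above can be wrapped in a small helper that rejects out-of-range values before touching /proc. This is only a sketch: valid_oom_adj and set_oom_adj are hypothetical names introduced here, and lowering the value still requires root privileges.

```shell
#!/bin/sh
# Succeed only if $1 is an integer in the oom_adj range -17..15.
# (valid_oom_adj is a hypothetical helper, not a system command.)
valid_oom_adj() {
    case "$1" in
        ''|-|*[!0-9-]*|?*-*) return 1 ;;   # empty, non-numeric, or misplaced '-'
    esac
    [ "$1" -ge -17 ] && [ "$1" -le 15 ]
}

# Write VALUE to /proc/PID/oom_adj after validating it.
# Usage: set_oom_adj PID VALUE
set_oom_adj() {
    pid=$1
    value=$2
    if ! valid_oom_adj "$value"; then
        echo "oom_adj must be an integer between -17 and 15" >&2
        return 1
    fi
    if [ ! -w "/proc/$pid/oom_adj" ]; then
        echo "cannot write /proc/$pid/oom_adj" >&2
        return 1
    fi
    echo "$value" > "/proc/$pid/oom_adj"
}
```

Run from a root shell, `set_oom_adj 16350 -17` would then do the same as the echo command above, using the PID from our example.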

How does Linux decide which process gets killed first?

The process to be killed in an out-of-memory situation is selected based on its badness score, which is exposed in /proc/<pid>/oom_score. The score is chosen so that the system loses the minimum amount of work done, recovers a large amount of memory, doesn’t kill any process that is innocent of eating tons of memory, and kills the minimum number of processes (if possible, just one). In the kernels covered by the reference below, the badness score is computed from the original memory size of the process, its CPU time (utime + stime), its run time (uptime - start time) and its oom_adj value. The more memory the process uses, the higher the score. The longer a process has been alive in the system, the lower the score.
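As a rough illustration of how oom_adj weighs in: as far as we understand the 2.6-era kernels described in the reference, the score was shifted left by oom_adj bits for positive values and right for negative ones, so each step doubles or halves it. The helper below is a hypothetical sketch of that single step under this assumption. It is not the kernel's code, and newer kernels compute the score differently.

```shell
#!/bin/sh
# Apply an oom_adj value to a base badness score by bit shifting.
# apply_oom_adj is a made-up name; the shift rule is an assumption based on
# older 2.6-era kernels. The -17 case (process skipped entirely) is handled
# elsewhere in the kernel and is not modelled here.
# Usage: apply_oom_adj BASE_SCORE OOM_ADJ
apply_oom_adj() {
    points=$1
    adj=$2
    if [ "$adj" -gt 0 ]; then
        # A zero score is bumped to 1 so that the shift has an effect.
        if [ "$points" -eq 0 ]; then
            points=1
        fi
        echo $(( points << adj ))
    else
        echo $(( points >> (-adj) ))
    fi
}

apply_oom_adj 1000 3     # three doublings
apply_oom_adj 1000 -2    # two halvings
```

With a base score of 1000, an oom_adj of 3 gives 8000 and an oom_adj of -2 gives 250.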

We have the Chromium browser running, which consumes much more memory than our simple_process, so let's find the PID of chromium-browse:

 $ top -b -n 1 | grep chromium-browse
17720 myuser   20   0  480400 154296  91384 R  11.8  3.8   0:54.33 chromium-browse 

Let's check the oom_score of the Chromium browser:

 $ cat /proc/17720/oom_score

and the oom_score of the simple_process we created above:

 $ cat /proc/16350/oom_score

On our system the Chromium browser's oom_score was higher than that of simple_process, so the browser is more likely to be killed when the OOM killer runs.
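Instead of comparing two processes by hand, the same check can be run across every process on the machine. The script below is a small sketch: list_oom_scores and rank_by_score are names we made up, and it assumes a kernel that provides /proc/<pid>/comm for the process name.

```shell
#!/bin/sh
# Sort "PID SCORE NAME" lines by SCORE, highest first.
rank_by_score() {
    sort -k2,2nr
}

# Print "PID SCORE NAME" for every process whose oom_score is readable.
list_oom_scores() {
    for dir in /proc/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        # A process may exit while we are reading; skip it in that case.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        echo "${dir#/proc/} $score $name"
    done
}

# Show the five processes the OOM killer would currently favour.
list_oom_scores | rank_by_score | head -n 5
```

The process on the first line is the one with the highest oom_score at that moment, so it is the most likely candidate if the OOM killer were to run.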

How to invoke OOM Killer manually for understanding which process gets killed first
For this, please refer to our post "How to invoke OOM Killer manually for understanding which process gets killed first".

Reference – https://lwn.net/Articles/317814/
