Grep goes through all logs under the directory and will therefore show at least the just-run command itself from /var/log/auth.log. Actual log entries for OOM-killed processes look something like the following.
kernel: Out of memory: Kill process 9163 (mysqld) score 511 or sacrifice child

The log entry here shows that the killed process was mysqld with PID 9163 and an OOM score of 511 at the time it was killed. Your log messages may vary depending on the Linux distribution and system configuration.
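Depending on your logging setup, the same message is usually also visible in the kernel ring buffer. A quick sketch of an alternative check; the exact wording of the message can vary between kernel versions:

# Search the kernel ring buffer for out-of-memory events
sudo dmesg | grep -i "out of memory"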
If, for example, a process crucial to your web application was killed as a result of an out-of-memory situation, you have a few options: reduce the amount of memory requested by the process, disallow processes from overcommitting memory, or simply add more memory to your server configuration.
Current resource usage
Linux comes with a few handy tools for tracking processes that can help with identifying possible resource shortages. You can track memory usage, for example, with the command below.
free -h

The command prints out the current memory statistics. For example, in a 1 GB system the output is something along the lines of the example underneath.
             total       used       free     shared    buffers     cached
Mem:          993M       738M       255M       5.7M        64M       439M
-/+ buffers/cache:       234M       759M
Swap:           0B         0B         0B

Here it is important to make the distinction between application-used memory, buffers and caches. On the Mem line of the output it would appear that nearly 75% of our RAM is in use, but then again over half of the used memory is occupied by cached data.
The difference is that while applications reserve memory for their own use, the cache is simply commonly used hard drive data that the kernel stores temporarily in RAM for faster access, which on the application level is considered free memory.
Keeping that in mind, it's easier to understand why used and free memory are listed twice: the second line conveniently shows the actual memory usage once the memory occupied by buffers and cache is taken into account.
In this example, the system is using merely 234 MB of the total available 993 MB, and no process is in immediate danger of being killed to save resources.
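If you want to keep an eye on memory usage over a period of time, free can also print its statistics repeatedly at a set interval. A small sketch, assuming the -s option is available in your version of free:

# Print human-readable memory statistics every 5 seconds, stop with Ctrl+C
free -h -s 5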
Another useful tool for memory monitoring is 'top', which displays continuously updated information about processes' memory and CPU usage, runtime and other statistics. It is particularly useful for identifying resource-intensive tasks.
top

You can scroll the list using the Page Up and Page Down keys on your keyboard. The program runs in the foreground until cancelled by pressing 'q' to quit. The resource usage is shown in percentages and gives an easy overview of your system's workload.
top - 17:33:10 up 6 days,  1:22,  2 users,  load average: 0.00, 0.01, 0.05
Tasks:  72 total,   2 running,  70 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   1017800 total,   722776 used,   295024 free,    66264 buffers
KiB Swap:        0 total,        0 used,        0 free.   484748 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    1 root      20   0   33448   2784   1448 S   0.0  0.3   0:02.91 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.02 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0       0      0      0 S   0.0  0.0   0:01.92 kworker/u2:0
    7 root      20   0       0      0      0 S   0.0  0.0   0:05.48 rcu_sched

In the example output shown above, the system is idle and the memory usage is nominal.
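When a system is under load, it is often useful to see which processes are consuming the most memory. One way to get a quick snapshot, sketched here with standard procps options:

# List the ten most memory-hungry processes, sorted by resident memory usage
ps aux --sort=-%mem | head -n 10

Inside top itself you can also press Shift+M to sort the task list by memory usage instead of CPU.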
Check if your process is at risk
If your server's memory gets used up to the extent that it threatens system stability, the out-of-memory killer chooses which process to eliminate based on many variables, such as the amount of work done that would be lost and the total memory freed. Linux keeps a score for each running process, which represents how likely the process is to be killed in an OOM situation.
This score is stored in the file /proc/<pid>/oom_score, where pid is the identification number of the process you are looking into. The PID can be easily found using the following command.
ps aux | grep <process name>

The output of the command when searching for mysql, for example, would be similar to the example below.
mysql     5872  0.0  5.0 623912 51236 ?   Ssl  Jul16   2:42 /usr/sbin/mysqld

Here the process ID is the first number on the row, 5872 in this case, which can then be used to get further information on this particular task.
cat /proc/5872/oom_score

The readout gives us a single numerical value for the chance of the process getting axed by the OOM killer. The higher the number, the more likely the task is to be chosen if an out-of-memory situation should arise.
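If you already know the process name, the two steps can be combined into a one-liner. A sketch using pgrep, where the -o flag selects the oldest matching process, typically the parent:

# Look up the OOM score of the main mysqld process in one step
cat /proc/$(pgrep -o mysqld)/oom_score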
If your important process has a very high OOM score, it is possible that the process is wasting memory and should be looked into. However, a high OOM score alone, if memory usage otherwise remains nominal, is no reason for concern. The OOM killer can be disabled, but this is not recommended, as it might cause unhandled exceptions in out-of-memory situations, possibly leading to a kernel panic or even a system halt.
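Instead of disabling the OOM killer outright, a gentler option is to make a critical process a less attractive target by adjusting its oom_score_adj value. A minimal sketch, reusing the mysqld PID from the earlier example; valid values range from -1000 to 1000, where -1000 effectively exempts the process:

# Lower the likelihood of PID 5872 being chosen by the OOM killer
echo -500 | sudo tee /proc/5872/oom_score_adj

Note that the adjustment applies only to the running process and is lost when the process restarts.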
Disable overcommit
In major Linux distributions, the kernel by default allows processes to request more memory than is currently free in the system in order to improve memory utilization. This is based on the heuristic that processes never truly use all the memory they request. However, if your system is at risk of running out of memory and you wish to prevent losing tasks to the OOM killer, it is possible to disallow memory overcommit.
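Before changing anything, you can check the current overcommit mode directly from the proc filesystem; on most systems the default is 0. A quick check:

# Show the current overcommit mode (0 = heuristic overcommit, the usual default)
cat /proc/sys/vm/overcommit_memory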
To change how the system handles overcommit, Linux has an application called 'sysctl' that is used to modify kernel parameters at runtime. You can list all sysctl-controlled parameters using the following command.
sudo sysctl -a

The particular parameters that control memory are very imaginatively named vm.overcommit_memory and vm.overcommit_ratio. To change the overcommit mode, use the command below.
sudo sysctl -w vm.overcommit_memory=2

This parameter has 3 different values: