Oom

From Halfface

dmesg

Use dmesg to get the OOM killer output:

dmesg
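On a busy machine the kernel log is long, so it can help to narrow it down first. The following is a minimal sketch, assuming the kernel uses the usual phrases "invoked oom-killer", "Out of memory" and "Killed process"; the function name oom_lines is made up for this example.

```shell
# Hypothetical helper: keep only the OOM-related lines from kernel log text on stdin.
oom_lines() {
    # -i: ignore case, -E: extended regex with alternation
    grep -iE 'invoked oom-killer|out of memory|killed process'
}

# Typical use (reading the kernel log may need root on some systems):
#   dmesg -T | oom_lines
```

The -T option asks dmesg for human-readable timestamps, which makes it easier to match the event against application logs.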

analyze dmesg output

Find the original "Out of memory" line in one of the files that also contains total_vm. Thirty seconds to a minute before that line (it could be more or less) you'll find something like:

kernel: foobar invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0

You should also find a table somewhere between that line and the "Out of memory" line with headers like this:

[ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name

This may not tell you much more than you already know, but the fields are:

pid The process ID.
uid User ID.
tgid Thread group ID.
total_vm Virtual memory use (in 4 kB pages)
rss Resident memory use (in 4 kB pages)
nr_ptes Page table entries
swapents Swap entries
oom_score_adj Usually 0; the range is -1000 to 1000. A lower number makes the process less likely to die when the OOM killer is invoked, and -1000 exempts it entirely.
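Since total_vm and rss are counted in pages, the table is easier to read once converted to MiB. The sketch below assumes the column layout shown in the header above, a page size of 4 kB (check with getconf PAGESIZE) and process names without spaces; oom_rss_mib is a hypothetical name.

```shell
# Hypothetical helper: read OOM table rows on stdin and print "rss_MiB name",
# largest resident set first. Optional first argument: page size in kB (default 4).
oom_rss_mib() {
    page_kb="${1:-4}"
    awk -v page_kb="$page_kb" '
        # Count columns from the right so the "[ pid ]" brackets and any
        # timestamp prefix do not matter:
        #   ... total_vm rss nr_ptes swapents oom_score_adj name
        NF >= 9 && $(NF-5) ~ /^[0-9]+$/ && $(NF-4) ~ /^[0-9]+$/ && $(NF-1) ~ /^-?[0-9]+$/ {
            printf "%.1f %s\n", $(NF-4) * page_kb / 1024, $NF
        }
    ' | sort -rn
}

# Typical use:
#   dmesg | oom_rss_mib | head
```

The header line and unrelated kernel messages are skipped because their rss and total_vm positions are not plain numbers.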

You can mostly ignore nr_ptes and swapents although I believe these are factors in determining who gets killed. This is not necessarily the process using the most memory, but it very likely is. For more about the selection process, see here. Basically, the process that ends up with the highest oom score is killed -- that's the "score" reported on the "Out of memory" line; unfortunately the other scores aren't reported but that table provides some clues in terms of factors.
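The log only shows the winner's score, but for processes that are still running the kernel exposes the current values in /proc/<pid>/oom_score and /proc/<pid>/oom_score_adj. A minimal sketch for listing them, highest score first, follows; the function name oom_scores and its optional proc-root argument are assumptions made for this example.

```shell
# Hypothetical helper: print "oom_score oom_score_adj pid comm" for every
# process, highest score first. Optional argument: proc root (default /proc).
oom_scores() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        # A process can exit between the glob and the reads, so skip on error.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || continue
        comm=$(cat "$dir/comm" 2>/dev/null) || continue
        printf '%s %s %s %s\n' "$score" "$adj" "${dir##*/}" "$comm"
    done | sort -rn
}

# Typical use:
#   oom_scores | head
```

These are live values, so they show who would be picked now, not who was the runner-up at the time of an earlier kill.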

Again, this probably won't do much more than illuminate the obvious: the system ran out of memory and mysqld was chosen to die because killing it would release the most resources. This does not necessarily mean mysqld is doing anything wrong. You can look at the table to see if anything else went way out of line at the time, but there may not be any clear culprit: the system can run out of memory simply because you misjudged or misconfigured the running processes.

Increase resistance to oom

echo -200 | sudo tee /proc/42/oom_score_adj
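The same adjustment can be wrapped in a small function that rejects values outside the range the kernel accepts (-1000 to 1000) and reads the result back. This is only a sketch: set_oom_score_adj and its optional proc-root argument are hypothetical, and lowering the value of a real process has to be done as root.

```shell
# Hypothetical helper: set_oom_score_adj PID VALUE [PROC_ROOT]
# Validates VALUE, writes it, then prints the value now in effect.
set_oom_score_adj() {
    pid="$1"; value="$2"; proc_root="${3:-/proc}"
    # Accept only an optional leading minus followed by digits.
    case "$value" in
        ''|-|*[!0-9-]*|?*-*)
            echo "not an integer: $value" >&2; return 2 ;;
    esac
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "out of range (-1000..1000): $value" >&2; return 2
    fi
    printf '%s\n' "$value" > "$proc_root/$pid/oom_score_adj" || return 1
    cat "$proc_root/$pid/oom_score_adj"
}

# Typical use, as root:
#   set_oom_score_adj 42 -200
```

The setting lasts only for the life of the process and is inherited by the children it forks, so it has to be reapplied after a restart.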

overcommit

http://engineering.pivotal.io/post/virtual_memory_settings_in_linux_-_the_problem_with_overcommit/