[batch] Use earlyoom to terminate a process using too much memory#1402

Draft
giordano wants to merge 1 commit into UCL:master from giordano:mg/earlyoom

@giordano
Member

Linux's built-in out-of-memory (OOM) handling is unreliable and uninformative. Even when it kicks in (and it often does so too late, when the machine is already unresponsive), there is no easy way to tell that a process on a remote machine was terminated because it was using too much memory. With `earlyoom` we bypass Linux's OOM management and have better control over what's printed to screen in case of error.

Ref: #1333 (comment).

Note: opening as draft because I haven't tested this in production yet.

Comment thread src/tlo/cli.py
# Terminate the `earlyoom` process
kill "${{{{EARLYOOM_PID}}}}"
# Print line(s) in the log file referencing terminated processes, if any.
grep "^sending SIGTERM to process" "${{{{EARLYOOM_LOG}}}}"
Member Author

This also catches SIGKILL:

Suggested change
- grep "^sending SIGTERM to process" "${{{{EARLYOOM_LOG}}}}"
+ grep "^sending \w+ to process" "${{{{EARLYOOM_LOG}}}}"
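
A note on the suggested pattern: with GNU grep's default basic regular expressions, an unescaped `+` is a literal plus sign, so `\w+` only means "one or more word characters" when grep is run with `-E` (or when the pattern is written `\w\+`). The sketch below checks this, assuming GNU grep; the log excerpt is made up for illustration and the exact `earlyoom` message format is an assumption.

```shell
# Hypothetical earlyoom log excerpt (message format assumed, not taken from a real run).
log='sending SIGTERM to process 1234 uid 1000 "python": badness 900, VmRSS 15000 MiB
sending SIGKILL to process 1234 uid 1000 "python": badness 900, VmRSS 15000 MiB'

# Basic regex (grep's default): `+` is a literal character, so no line matches.
# `|| true` is needed because grep exits non-zero when nothing matches.
bre_matches=$(printf '%s\n' "$log" | grep -c "^sending \w+ to process" || true)

# Extended regex: `\w+` matches one or more word characters (SIGTERM, SIGKILL, ...).
ere_matches=$(printf '%s\n' "$log" | grep -cE "^sending \w+ to process")

echo "basic: ${bre_matches} match(es), extended: ${ere_matches} match(es)"
```

If that holds on the target machines, `grep -E "^sending \w+ to process" "${{{{EARLYOOM_LOG}}}}"` would be the variant that reports both signals.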

Comment thread src/tlo/cli.py
Comment on lines +252 to +253
# Start the `earlyoom` daemon.
nohup earlyoom -m 95 -s 100 &> "${{{{EARLYOOM_LOG}}}}" &
Member Author

I got the `-m` argument backward: its value is the percentage of memory available, not used, so it should be `-m 5`:

Suggested change
- # Start the `earlyoom` daemon.
- nohup earlyoom -m 95 -s 100 &> "${{{{EARLYOOM_LOG}}}}" &
+ # Start the `earlyoom` daemon. Ignore swap usage, terminate process if less than 5% of main memory available.
+ nohup earlyoom -m 5 -s 100 &> "${{{{EARLYOOM_LOG}}}}" &
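
To make the meaning of `-m 5` concrete, here is a rough sketch of the comparison it implies. It assumes the percentage is `MemAvailable` relative to `MemTotal` as reported in `/proc/meminfo`. The helper `mem_available_percent` and the sample values are hypothetical, not part of this PR or of `earlyoom`.

```shell
# Hypothetical helper: reads meminfo-style text on stdin and prints the
# percentage of memory available, truncated to an integer.
mem_available_percent() {
    awk '/^MemTotal:/ {total = $2}
         /^MemAvailable:/ {avail = $2}
         END {printf "%d\n", avail * 100 / total}'
}

# Made-up snapshot: 16 GiB total, 0.5 GiB available.
sample='MemTotal:       16777216 kB
MemAvailable:     524288 kB'

percent=$(printf '%s\n' "$sample" | mem_available_percent)
threshold=5   # same meaning as `-m 5`: act when at most 5% is available

if [ "$percent" -le "$threshold" ]; then
    verdict="below threshold"
else
    verdict="ok"
fi
echo "available: ${percent}% -> ${verdict}"
```

On a live machine the same helper could be fed with `mem_available_percent < /proc/meminfo`. With `-m 95` the condition would already hold whenever 95% or less of memory is available, which is why the original value was backward.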
