[04:45:06] --- jaltman/FrogsLeap has left: Disconnected
[04:53:34] --- jaltman/FrogsLeap has become available
[07:34:30] --- deason has become available
[07:56:42] --- rra has become available
[08:08:34] --- Simon Wilkinson has become available
[12:43:44] Apparently AFS gets very unhappy when the OOM killer kills afsd.
[12:43:47] That's entertaining.
[12:44:13] I'm shocked, shocked I say.
[12:44:34] we could probably make that less likely with oom_adj, or whatever it's called
[12:45:58] Yeah. IIRC, there's a syscall that a process can do that asks not to be killed.
[12:46:36] Might be a good idea, although of course by the time the system is so far gone that afsd looks like an attractive target, it's probably pretty much hosed anyway.
[12:46:56] I would have thought most afsd processes can't be killed, though... know if it was the afsdb process?
[12:47:47] Not sure -- what we have is just the log message:
[12:47:49] Apr 6 09:53:03 barley01 kernel: [294106.423207] Killed process 1630 (afsd) total-vm:6480kB, anon-rss:4kB, file-rss:0kB
[12:48:19] Shortly thereafter the system rebooted.
[12:48:19] > so far gone that afsd looks like an attractive target
well, without oom_adj, I thought the algorithm oom used to find the proc to kill was "kill the most important process in the system" :)
[12:48:56] Yeah, that's what I always tell people. In theory it kills the process that frees up the most resources; in practice, it starts by killing everything that would let you fix the problem and then continues by killing whatever you'll miss the most. :)
[12:49:33] In this particular case, it looks like it started by killing upstart and rsyslogd, which is entertaining.
[12:49:47] It had already killed portmap, getty, cron, and xinetd before it got to afsd.
[12:50:15] do you know what was using up all the memory?
[12:50:23] matlab!
[12:50:41] "I've got an idea, how 'bout we put init in an unkillable cgroup and then fork off new ones from login, cron, etc."
[12:51:07] ...which was probably the only process(es) left by the time it rebooted?
[12:51:16] That's my guess. :)
[12:51:56] Actually, turns out the syscall is experimental.
[12:52:00] well, I suppose it is more important
[12:52:14] the other programs probably wouldn't exist were it not for, uh, math
[12:52:16] We should just change the init script to echo -17 > /proc/$pid/oom_adj
[12:52:26] This is Debian, right?
[12:52:39] You have to be somewhat careful about how you do that because it can do weird things in some vserver containers.
[12:52:43] I should go figure out what sshd does.
[12:53:22] do we even know the pid for afsd from the init script? we spawn a bunch of them; afsd might have an easier time of knowing which pids to affect
[12:53:33] Oh, hm, sshd now does something internal.
[12:53:36] * Remove SSHD_OOM_ADJUST configuration. sshd now unconditionally makes itself non-OOM-killable, and doesn't require configuration to avoid log spam in virtualisation containers (closes: #555625).
[12:53:47] Oooh. I wonder how sshd does that.
[12:54:50] See oom_adjust_setup in openbsd-compat/port-linux.c
[12:55:11] It looks like it effectively writes -17 into /proc/self/oom_adj. :)
[12:55:31] It has some other goo to double-check some other stuff first, though.
[12:55:57] Is that a Debian patch, or upstream?
[12:56:06] Appears to be upstream.
[12:56:25] oom_adj is inherited, so if afsd did it before fork, it would only have to do so once.
[12:59:15] Nice. The interface changes at 2.6.36
[13:00:36] Looks like it would be easy to implement the same for afsd, though. In a vserver container, the write to /proc will just fail.
[13:01:17] Yeah, and you just silently ignore the failure.
[13:01:55] I think the sshd logic is checking whether another parameter exists first, because if it does it's safe to write the adjustment. I think what happens otherwise is that it doesn't fail, it syslogs an error, which was what that was trying to avoid.
[13:05:24] It also saves the previous value so that it can restore the OOM state once it is done forking child processes.
[13:08:25] --- deason has left
[15:39:11] --- rra has left: Disconnected
[15:58:11] --- Russ has become available
[16:45:40] --- meffie has left
[19:04:34] --- Russ has left: Disconnected
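
Editor's sketch of the approach discussed between 12:55 and 13:05: write the "never kill" value into the process's own /proc entry before forking, ignore a failed write (as in a vserver container), and remember the old value so it can be restored later. This is not the sshd or afsd code, only a minimal illustration in Python. The function names and the `proc_dir` parameter are hypothetical. It assumes the post-2.6.36 interface is `oom_score_adj` with -1000 as the exempt value, falling back to the older `oom_adj` with -17.

```python
import os

# Control files to try, newest interface first. Assumption: oom_score_adj
# (Linux 2.6.36 and later) takes -1000 to exempt a process; the older
# oom_adj takes -17, as mentioned in the discussion.
OOM_FILES = (
    ("oom_score_adj", "-1000"),
    ("oom_adj", "-17"),
)


def disable_oom_kill(proc_dir="/proc/self"):
    """Try to exempt this process from the OOM killer.

    Returns (path, previous_value) so the caller can restore the setting
    later, or None if no control file could be read and written.
    """
    for name, value in OOM_FILES:
        path = os.path.join(proc_dir, name)
        try:
            # Read the old value first so it can be restored afterwards.
            with open(path) as f:
                previous = f.read().strip()
            with open(path, "w") as f:
                f.write(value + "\n")
        except OSError:
            # Missing file or refused write (for example inside a
            # container): silently try the next candidate.
            continue
        return path, previous
    return None


def restore_oom_setting(saved):
    """Put back the value returned by disable_oom_kill(), ignoring errors."""
    if saved is None:
        return
    path, previous = saved
    try:
        with open(path, "w") as f:
            f.write(previous + "\n")
    except OSError:
        pass
```

A daemon would call `disable_oom_kill()` once before forking, since the setting is inherited by children, and call `restore_oom_setting()` in any child that should remain killable. Lowering the value normally requires root privileges, so an unprivileged run simply returns None.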