[02:07:10] --- haba has left
[03:37:19] --- haba has become available
[06:41:33] --- reuteras has become available
[08:07:05] --- deason/gmail has become available
[09:01:22] --- rra has become available
[09:02:00] --- reuteras has left
[10:01:53] do we still need syscall 65 in /etc/name_to_sysnum on solaris 10 with 1.6?
[10:03:08] yes, the ioctl-based stuff is only on for solaris 11 and beyond
[10:03:24] alright, thought so.
[10:03:54] still unsafe to stop/start the client too?
[10:05:15] as far as I know that's fixed
[10:05:54] umount /afs then unload the kernel module?
[10:09:46] yeah, modunload -i <id>, with the number from modinfo
[10:14:53] and if afsd gets started without -dynroot and there's no root.afs, is it expected that the machine will panic on shutdown?
[10:16:51] well, it's less unexpected than otherwise; I would've thought it would panic on trying to mount /afs first
[10:17:01] me too
[10:17:04] and that doesn't mean it's acceptable; can you provide details?
[10:17:10] not something i'm worried about
[10:17:51] http://pastebin.com/UAbGQdep
[10:18:05] supposedly I have the crash dump as well
[10:26:57] well, happy to report that on solaris 10 update 9 sparc, starting/stopping the 1.6.0pre7 client seems to be ok
[10:27:41] rra: you around? having some k5start issues
[10:27:53] not immediately clear why that crashes; it's not in our code, which implies we've left something in a bad state... I'll see if I can reproduce it later
[10:28:32] yeah, no worries. the only time this should ever happen in our environment is if afsd is started without that option, which should only happen if the client isn't configured correctly
[10:29:19] what happened when you started afsd, anyway? it starts up, but accessing anything in /afs returns an error?
[10:29:27] yeah
[10:29:39] connection timed out, I think
[10:31:01] my screen scrollback doesn't go back that far. it only panic'd during shutdown.
[10:32:19] -bash: /afs/bx.psu.edu/user/phalenor/.bash_logout: Connection timed out
[10:32:44] oh, hmm, we do have a root.afs volume
[10:33:21] but on top of that, we distribute a CellServDB with just the cell name in it, so without -afsdb the client would not have known about that volume's location.
[10:36:06] --- mfelliott has become available
[13:11:43] --- steven.jenkins has left
[15:42:53] --- deason/gmail has left
[15:51:50] --- haba has left
[17:02:24] --- meffie has left
[17:15:43] --- rra has left: Disconnected
[17:33:09] --- Russ has become available
[18:05:04] --- Russ has left
[18:57:05] what are the chances that there's a memory leak in pre7 on linux?
[18:58:44] greater than zero. but without details, that's all i can tell you.
[18:59:56] I'll have to pay closer attention to what processes are running, but... http://www.bx.psu.edu/~phalenor/graph.png
[19:00:52] and the console starts spitting out hung-process messages. will have to wait for it to happen again (has been happening every few days, so it's not just pre7; it happened with pre4 as well)
[19:01:21] fnord
[19:01:24] well, that's not really details so much as you are running low on memory.
[19:01:33] weird. this is showing up in my G+ chat window.
[19:01:43] awesome
[19:02:24] ran out of memory, actually. not sure what else was running on that machine. feel free to ignore for now.
[19:03:16] ah, if you click on the chat link, I get a little popup with this conversation
[19:03:27] and it has sound, oh joy
[19:22:35] say one does a find /afs/some/path; assuming reasonable values passed to afsd, how much memory will that consume?
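(A minimal sketch of the Solaris client steps discussed above, 10:01-10:33. The syscall entry shown is the conventional one and the module id is a placeholder; check modinfo on your own machine rather than trusting these values.)

    # /etc/name_to_sysnum on Solaris 10 still needs the AFS syscall entry
    # with 1.6 (the ioctl-based interface is only used on Solaris 11 and
    # later); the entry conventionally looks like:
    #   afs   65

    # stopping the client and unloading the kernel module by hand:
    umount /afs
    modinfo | grep afs        # note the module id in the first column
    modunload -i <id>         # <id> is the number reported by modinfo

(And the shape of client CellServDB described at 10:33:21: a cell-name line with no database-server lines, which is why the client needs afsd -afsdb (DNS lookups) to locate the cell's servers. The cell name is the one appearing in the log.)

    >bx.psu.edu    # cell name only; no db server entries, so run afsd with -afsdb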
[19:23:19] that's a very open-ended question. depends on things like how much rx traffic. you're going about this wrong
[19:23:54] afs will try to free everything. if you think you leaked memory, shut down afs. does the memory come back, or does afs report leaking anything? that's a good place to start. random guessing? bad place to start
[19:24:42] well, it's a guess based on a previous bug that exhibited similar symptoms, so it was worth a shot
[19:27:12] so, memory used increased by 2GB over a few minutes; restarting the client freed slightly less than 1GB
[19:29:41] well, it's conceivable we are allocating too many rx packets, but i am surprised no one else has complained
[19:30:24] i'll keep an eye on it
[19:33:35] this machine crashed after an increase from ~9GB to 32GB of memory + 6GB of swap over a period of 2 hours.
[19:34:12] well, if it's really Rx, logging on allocs can be added.
[19:39:39] rxdebug can report the number of allocated packets, in case it is a packet leak
[19:39:48] that too
[19:44:31] let me reproduce the symptoms first, then go from there
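(A hedged sketch of the rxdebug check suggested at 19:39:39, assuming the cache manager's callback service is listening on the usual port 7001; the hostname is a placeholder.)

    # the summary line reports free packets and packet reclaims for the
    # client's rx service; -rxstats adds detailed counters.  if the
    # packet allocation numbers keep climbing between runs, that points
    # toward a packet leak rather than some other allocation.
    rxdebug <client-host> 7001
    rxdebug <client-host> 7001 -rxstats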