[00:02:01] --- abo has left
[00:04:12] --- abo has become available
[00:05:58] --- Russ has left: Disconnected
[00:53:41] --- lars.malinowsky has become available
[06:38:12] --- todd has left
[06:52:58] --- reuteras has left
[07:32:07] --- deason/gmail has become available
[07:41:05] --- jaltman/FrogsLeap has left: Disconnected
[07:45:18] --- jaltman/FrogsLeap has become available
[07:51:08] --- meffie has left: Lost connection
[07:53:22] --- lars.malinowsky has left
[08:06:18] --- meffie has become available
[09:34:19] --- rra has become available
[10:21:54] Hmm, first fbsd panic in quite some time. (At shutdown, who is surprised?) panic: mtx_trylock() of destroyed mutex @ /usr/ports/net/openafs/work/openafs-1.6.0pre6/src/afs/FBSD/osi_vm.c:87 Can't look now, but preliminarily blaming (an interaction of) "always flush all vcaches" (with fbsd doing something else wrong).
[10:53:14] Something destroyed a vcache without deleting it from the global list... but we could just not FlushAllVCaches on fbsd, I guess.
[10:54:49] Well, more testing is in order, of course. But I'm at work now.
[10:58:56] The full console output is:
afs_vop_reclaim: afs_FlushVCache failed code 16
vnode vc 0xffffff800122c000 vp 0xffffff0088555000
tag afs, fid: 0.1.1.1, opens 0, writers 0
states statd readonly
afs: WARM shutting down of: vcaches...
panic: mtx_trylock() of destroyed mutex @ /usr/ports/net/openafs/work/openafs-1.6.0pre6/src/afs/FBSD/osi_vm.c:87
The reclaim failure for root.afs is not uncommon, though not fully understood.
[12:28:54] --- lars.malinowsky has become available
[15:20:50] --- deason/gmail has left
[15:21:00] --- jaltman/FrogsLeap has left: Disconnected
[15:25:04] --- jaltman/FrogsLeap has become available
[15:44:43] --- Simon Wilkinson has become available
[15:52:04] --- shadow@gmail.com/barnowl754E0B64 has left
[15:52:15] --- shadow@gmail.com/barnowl754E0B64 has become available
[16:14:35] --- lars.malinowsky has left
[16:14:40] --- lars.malinowsky has become available
[16:40:36] --- deason_gmail has become available
[17:35:25] --- rra has left: Disconnected
[17:41:32] --- lars.malinowsky has left
[17:41:38] --- lars.malinowsky has become available
[17:51:04] --- Russ has become available
[17:54:46] --- lars.malinowsky has left
[17:54:51] --- lars.malinowsky has become available
[17:55:54] --- steven.jenkins has left
[18:00:35] --- steven.jenkins has become available
[18:37:08] --- pod has left
[18:49:55] Got a chance to take a coredump and look at it -- this is in FlushAllVCaches, trying to flush the root.afs vcache, which has a VBAD vnode attached to it.
[18:51:01] So we probably do want to disable FlushAllVCaches for FBSD, though there is definitely another bug here.
[19:03:56] Well, yeah, I didn't think the fact that we're calling FlushAllVCaches for it is a bug, but if you wanna disable it while other stuff is worked out, then sure.
[19:06:30] Well, the kernel will close and [a bunch of stuff which] call reclaim on all vnodes associated with the filesystem before it calls our unmount routine, and since reclaim flushes the associated vcache, our unmount calling flushall would just be duplicating work -- unless we had an active vcache that had no vnode, or a vnode not associated with the filesystem, which would likely panic anyway. I think.
[19:15:54] --- mfelliott has left
[19:31:07] The thing is, that current panic shows that something is destroying the vnode without the vcache getting deleted from the global list.
[19:31:41] We could also make an fbsd-specific FlushAllVCaches that calls that vflush function or whatever.
[19:32:46] > something is destroying the vnode without the vcache deleted
Yes, the reclaim that the kernel did while walking the list of vnodes associated with /afs/. The FlushVCache from within reclaim failed, but reclaim is not allowed to error.
[19:33:49] (This is the "another bug here" I referenced above -- why does FlushVCache fail there?)
[19:41:41] If it's easily repeatable, you could panic on error from reclaim and see why; the only error I see that's possible is that the vcache has already been flushed.
[19:42:00] The fact that reclaim cannot fail suggests that may be desirable anyway.
[19:44:48] I did that once quite some time ago, but I don't remember the result. Not sure if I'll have time to do so again in the next week.
[19:54:46] --- deason_gmail has left
[19:55:19] --- deason_gmail has become available
[19:55:20] --- jakllsch/wormulon has left
[19:56:20] --- jakllsch/wormulon has become available
[20:29:12] --- deason_gmail has left
[20:30:02] --- deason_gmail has become available
[20:40:07] --- summatusmentis has left
[21:19:09] --- phalenor has left
[21:19:15] --- phalenor has become available
[21:34:27] --- meffie has left
[21:39:06] --- meffie has become available
[21:51:27] --- meffie has left
[22:00:37] --- meffie has become available
[22:05:24] --- ktdreyer has left
[22:05:37] --- ktdreyer has become available
[22:52:29] --- steven.jenkins has left
[22:55:06] --- steven.jenkins has become available
[22:58:37] --- deason_gmail has left
[23:31:55] --- lars.malinowsky has left
[23:41:13] --- reuteras has become available
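One suggestion above is an fbsd-specific FlushAllVCaches that leans on the kernel's vflush(9) instead of walking the afs vcache table. A minimal kernel-side sketch of that idea follows; it is not buildable standalone, the function name and placement are hypothetical, and error handling is elided:

```c
/* Hypothetical FBSD-only FlushAllVCaches replacement: ask the VFS
 * layer to reclaim every vnode on the mount.  Each VOP_RECLAIM()
 * calls afs_FlushVCache(), so the vcache table empties as a side
 * effect, and we never touch a vcache whose vnode has already been
 * destroyed. */
static int
osi_fbsd_flush_all_vcaches(struct mount *mp)
{
    /* rootrefs = 1: the root vnode holds one extra reference;
     * FORCECLOSE: reclaim vnodes even if they are still in use. */
    return vflush(mp, 1, FORCECLOSE, curthread);
}
```

The attraction of this shape is that the iteration order and locking are the VFS layer's problem, so the "stale vcache on the global list" case cannot arise from this path.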