[00:39:06] --- kaj has become available
[01:39:29] --- Simon Wilkinson has become available
[02:16:43] --- abo has become available
[02:17:59] --- kaj has left
[02:20:19] --- Rasmus Kaj has become available
[02:21:48] --- dev-zero@jabber.org has become available
[02:28:19] --- dev-zero@jabber.org has left
[02:31:38] --- dev-zero@jabber.org has become available
[02:43:11] --- Jeffrey Altman has become available
[02:44:20] --- Simon Wilkinson has left: Lost connection
[05:07:32] --- Jeffrey Altman has left
[05:15:10] --- Simon Wilkinson has become available
[05:33:41] --- haba has left
[05:49:34] --- dev-zero@jabber.org has left: Lost connection
[06:26:28] <Simon Wilkinson> Looks like we need to take another look at cda65cda6e60e76be3b546adf9096cb25a7de14e, if the report on openafs-info is anything to go by.
[06:40:17] --- abo has left
[06:40:31] --- abo has become available
[06:54:21] <jaltman> did you mean the discussion on openafs-devel "Permission bug"?
[06:54:29] <Simon Wilkinson> Yes.
[06:58:49] --- jaltman has left: Disconnected
[07:06:34] --- jaltman has become available
[07:22:15] --- deason has become available
[07:22:54] <deason> I think the client-side check for the 'li + owner' case isn't quite correct, but I'm not sure yet
[07:23:08] <deason> so our permissions check for the CStatd case is wrong
[07:25:38] --- reuteras has left
[07:29:02] <Simon Wilkinson> I've never understood li+owner ...
[07:31:09] <deason> what I believe it is (and what appears to be intended in afs_AccessOK) is that if you have li on a file and are owner, you get r and w
[07:31:25] <deason> 
       /* for files, throw in R and W if have I and A (owner).  This makes
         * insert-only dirs work properly */
[07:31:49] <deason> it looks like it assumes you get 'A' if you're the owner from some check, but I don't see how it happens
[07:32:16] --- haba has become available
[07:40:11] <Simon Wilkinson> deason: Does the fileserver not give you the A?
[07:42:28] <deason> well, it gets the rights from the parent dir; I don't have a on it, I have l and i
[07:42:48] <deason> oh, wait; sorry, in a meeting, attention is kinda split
[07:46:22] <deason> (the logic in afs_AccessOK seems odd to me, but I'll need to look later when I can pay more attention)
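The rule deason describes above (li on the directory plus ownership of the file yields r and w) could be sketched roughly as follows. This is only an illustration of the rule as stated in the quoted comment, not the afs_AccessOK code: the flag values, the function name, and in particular the step that grants 'a' to the owner are assumptions, and that last step is exactly the part deason could not find in the client.

```python
# Illustrative bit flags for AFS ACL rights; the values are assumptions
# made for this sketch, not copied from the OpenAFS headers.
READ, WRITE, INSERT, LOOKUP, DELETE, LOCK, ADMINISTER = (1 << n for n in range(7))


def effective_file_rights(dir_rights: int, is_owner: bool) -> int:
    """Return the rights on a file given the caller's rights on its parent dir.

    Hypothetical helper: the owner is assumed to be granted ADMINISTER, and a
    caller holding both INSERT and ADMINISTER also gets READ and WRITE.
    """
    rights = dir_rights
    if is_owner:
        rights |= ADMINISTER  # assumed: ownership implies 'a'
    if (rights & INSERT) and (rights & ADMINISTER):
        rights |= READ | WRITE  # "throw in R and W if have I and A (owner)"
    return rights
```

Under these assumptions, an owner with only l and i on the directory ends up with r and w on the file, while a non-owner with the same directory rights does not.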
[07:49:59] --- mho has become available
[08:30:24] --- Rasmus Kaj has left
[08:50:45] --- jaltman has left: Disconnected
[09:00:19] --- haba has left
[09:11:04] <Simon Wilkinson> Before I go read code, does anyone here know how much of the "fewer fsyncs" logic is enabled in 1.4 series fileservers?
[09:18:46] --- deason has left
[09:49:16] --- Russ has become available
[09:55:52] --- dev-zero@jabber.org has become available
[09:56:27] --- dev-zero@jabber.org has left: Lost connection
[11:06:25] <Simon Wilkinson> Actually, it looks like ext3 has the ZFS problem. If you fsync(), then everything that's pending gets sync'd, not just the thing you want to sync.
[11:06:34] <Simon Wilkinson> Russ: What are you guys running on your Linux fileservers?
[11:13:29] <Russ> 1.4.11+dfsg-6 or -5, Debian lenny.
[11:13:52] <Simon Wilkinson> Which filesystem? ext2?
[11:14:28] <Russ> Oh, sorry.  ext3.
[11:19:36] <Simon Wilkinson> With journals? In ordered mode?
[11:20:10] <shadow@gmail.com/owl81CD615B> you're asking russ i assume
[11:20:40] <Russ> With journals, yes, otherwise it's not ext3.
[11:20:48] <shadow@gmail.com/owl81CD615B> (sorry, i missed a couple i think)
[11:20:51] <Russ> How do I know whether it's in ordered mode?  We don't do any special configuration.
[11:26:00] <Simon Wilkinson> Hmmm. I wonder if Debian has ordered or writeback as the default.
[11:26:16] <Russ> I don't see anything obvious in tune2fs -l to tell me.
[11:26:19] --- haba has become available
[11:28:07] <jhutz@jis.mit.edu/owl> look in /proc/mounts
[11:28:26] <Russ> Ah, yes.  ordered.
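The /proc/mounts check suggested above could be scripted along these lines. It is a minimal sketch that assumes the journalling mode shows up as a `data=<mode>` entry in the mount options field. The function name, the device, and the `/vicepa` mount point in the sample line are made up for illustration.

```python
def data_mode(mounts_text: str, mount_point: str):
    """Return the value of the data= mount option for mount_point, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                return opt[len("data="):]
    return None


# Made-up sample line in /proc/mounts format (device and mount point are hypothetical).
sample = "/dev/sda5 /vicepa ext3 rw,data=ordered 0 0\n"
print(data_mode(sample, "/vicepa"))
```

On a live system the text would come from reading /proc/mounts itself. A result of None only means no `data=` option was listed for that mount point.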
[11:29:01] --- jaltman has become available
[11:29:30] <Simon Wilkinson> Ordered has the interesting property that fsync() syncs everything.
[11:30:21] <Simon Wilkinson> Which, in combination with firefox-3's SQLite journals, and some slow underlying storage, is killing our fileservers ...
[11:32:32] <Simon Wilkinson> (you can use ext3 without a journal, for example to take advantage of the different directory hashing)
[11:33:16] <Simon Wilkinson> jhutz: Have you pushed firefox3 out yet?
[11:34:40] <jhutz@jis.mit.edu/owl> Depends on the platform.
But our user volumes are on Solaris inode.
[11:35:08] <Simon Wilkinson> Ah. Okay.
[11:47:20] --- jaltman has left: Disconnected
[11:50:47] --- Kevin Sumner has become available
[11:52:05] --- jaltman has become available
[11:55:35] --- Jeffrey Altman has become available
[11:59:37] --- kula has left
[12:18:17] --- kaj has become available
[12:20:44] --- kula has become available
[12:58:51] --- deason has become available
[13:28:43] --- kaj has left
[13:29:23] --- haba has left
[13:52:07] <Russ> So in theory if I pull 21cbf7fee0a089d94f62baa7df2422e7bc8293f7 from Derrick into a 1.5.69 tree, should I no longer have a USE_FH mismatch on Linux?
[13:52:21] <Russ> Or is there another change still waiting for review?
[13:52:24] <shadow@gmail.com/owl81CD615B> that would be my theory. 
[13:52:57] <Russ> Okay, maybe I'll test that this afternoon, although I should probably roll 1.4.12pre1 + patches packages first.
[13:57:42] --- haba has become available
[14:18:54] --- Kevin Sumner has left
[14:45:13] --- dev-zero@jabber.org has become available
[14:49:43] --- mdionne has become available
[15:13:13] <Simon Wilkinson> So, here's something I'd welcome some feedback on.
[15:16:20] <Simon Wilkinson> In 1.4, in StoreAllSegments, if we get an error from the write operation, and an error from rx_EndCall, then the rx_EndCall error replaces the original write error. This seems wrong to me, and is the opposite of the behaviour in GetDCache. Any thoughts?
[15:18:54] <jaltman> you need to maintain the error from the rx_Write(), otherwise over quota, out of space, access denied, etc. errors are overwritten
[15:22:10] <Simon Wilkinson> Okay, so that's a bug then. But not the cause of our 1.4 problem. That one is more fun.
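The behaviour jaltman asks for above (the error from the write must survive the end of the call) amounts to a "first error wins" rule. The sketch below shows only that rule. The function name and the plain integer codes are hypothetical, and nothing here is taken from StoreAllSegments or GetDCache.

```python
def combine_errors(write_code: int, end_call_code: int) -> int:
    """Keep the first non-zero error; otherwise report the end-of-call error.

    Hypothetical helper: 0 means success, any other value is an error code.
    """
    return write_code if write_code != 0 else end_call_code
```

With this rule, an over-quota or access-denied code from the write is what the caller sees, and the end-of-call code is reported only when the write itself succeeded.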
[15:24:13] --- dev-zero@jabber.org has left: Lost connection
[15:24:20] <jaltman> what is the 1.4 problem?
[15:48:25] --- jaltman has left: Replaced by new connection
[15:48:25] --- jaltman has become available
[15:48:33] --- Jeffrey Altman has left: Replaced by new connection
[15:54:26] <Simon Wilkinson> If StoreMini gets an error, then it invalidates all of the segments of the current vcache. But, when StoreAllSegments is called through DoPartialWrite, that error gets swallowed, and userspace never knows that the store failed. I suspect that this can lead to unexpected data loss.
[15:55:17] * Russ finishes an aklog patch to re-enable DES enctypes if needed. Now, to figure out how to test.
[16:10:46] --- jaltman has left: Disconnected
[17:00:49] <Russ> Pushing changes to Gerrit is kind of pokey.  Does Gerrit's Git repository need a git gc maybe?
[17:02:59] --- jaltman has become available
[17:03:40] <Russ> aklog fix now in Gerrit.
[17:03:49] <Russ> We'll need this for 1.4.12.
[17:04:19] <Russ> Hm, I should probably do the same thing for klog.krb5, huh?
[17:05:07] <Russ> I'll do that later -- time to go play cards.
[17:17:48] <jaltman> can we guarantee that the version of krb5 that we build against will match the version installed on the machine?
[17:22:35] <jaltman> I'm concerned that we build binaries against 1.8 and then try to install on a machine with 1.4 or 1.6.
[17:23:05] <jaltman> In other news, I screwed up and 1.5.69 does not have working AFSDB lookups on Windows.
[17:56:17] <deason> you already need a correct (enough) version for linking to work anyway, right?
[18:28:07] --- mdionne has left
[18:28:50] --- mdionne has become available
[19:41:01] --- mdionne has left
[20:23:44] --- deason has left
[21:14:44] --- jaltman has left: Disconnected
[21:52:17] --- jaltman has become available
[22:06:22] --- kaj has become available
[22:25:47] --- reuteras has become available
[22:42:34] --- kaj has left
[23:50:23] --- kaj has become available