[04:57:15] --- Jeffrey Altman has become available [05:12:31] --- meffie has become available [06:18:01] --- Simon Wilkinson has become available [07:55:51] --- matt has become available [08:26:07] --- haba has become available [08:39:51] --- meffie has left [09:25:47] --- haba has left [09:45:15] --- shadow@gmail.com/owlA067D2A1 has become available [09:50:43] --- jaltman has become available [09:56:49] --- Derrick Brashear has become available [09:57:52] --- edgester has become available [10:00:12] for those of you who didn't read it in the mail, the list of items of interest is also in /afs/andrew.cmu.edu/usr/shadow/1.4.12-notes.txt [10:00:28] --- meffie has become available [10:00:59] thank you for all of the time you spent on pullups [10:01:22] sifting took as much time as doing it, sadly. oh well. [10:01:34] that was my fear [10:02:00] one hopes we are caught up, and that 1.4.13 will be easier, since i assume *at least* 1.4.13 will happen [10:02:13] --- jhutz@jis.mit.edu/owl has become available [10:02:46] Maybe we should put a tag on 1.5 at this point to mark it as 'pullups up to date' [10:02:57] when we're done [10:03:35] BTW, thanks for the compile_et fix. that bit me. [10:03:40] Out of the whole set of stuff that flew past last night, the only thing I wondered about was the rx RTT calculation patches. [10:04:48] the rx RTT calculations were wrong before, which had side effects beyond just jake's computations [10:05:15] Just as long as we're happy they're right now! [10:05:59] --- deason has become available [10:06:40] the RTT calculations are certainly less wrong. There is still a hard minimum timeout of 350ms which in many environments is too large [10:07:17] basically the RTT calculation patch was "a bugfix", at least [10:07:24] (since, well, that's true) [10:10:22] Are we waiting for anyone else? [10:10:33] Are we expecting a russ? [10:10:33] no. [10:10:59] russ: acually, maybe... [10:11:20] --- tkeiser has become available [10:11:32] not on jabber tho [10:12:23] we might as well let fly; he can read the logs [10:12:42] as a start: for those of you who read the list, do you see anything obvious missing? [10:13:03] for those of you who did not read the list: please kick yourself in the nuts until you finish reading the list. [10:14:00] I don't think there's anything obviously not there - besides everything that flew past last night! [10:14:16] You mean the list you sent this morning? I read it, but I'm not sure I could say what's missing. [10:14:22] jeff: yes. [10:14:51] simon: i assume the bugfixes aren't really a question. i could go back and pick out the ones possibly controversial i guess [10:15:23] Well, I think they're all fine. They're just the things that I've been going "hang on, shouldn't we pull up ... oh, that happened last night" [10:15:35] would the panic watchdog timer thing be open to discussion? [10:16:08] Oh, the other thing would be volume tags. Should we support the new volume tag syntax in a 1.4 release [10:16:53] of those, i would say the salvager chdir changes (so cores don't get left everywhere); the userland ENETUNREACH handling in Rx (but it already works in the kernel, so...);and the rest are all things like bugfixes or don't crash, or added stuff for macos 10.6 [10:17:10] ah right, we should talk about the panic watchdog timer [10:17:22] without the thread count update it's not usable, but that's one more line [10:18:01] By "new volume tags", you mean volume dump tags? [10:18:11] volume tag syntax: it's new functionality, so i didn't pull it. it's new functionality that doesn't change behavior, i guess, tho [10:18:17] he does [10:19:07] oh, i suppose one more possibly controversial thing: making linux file locks not broken [10:19:13] I don't quite understand why the thread count update is necessary, but it's a trivial change; and not the controversial part [10:19:41] I thought "making linux file locks not broken" went in? [10:19:43] the thread count change is needed so we allow 128 minus service threads number [10:19:45] In which case, I definitely think we should have support in a 1.4 release for correctly handling unknown tags per the new spec. That will make it much easier to move and restore volumes to 1.4 servers in a partially converted cell. [10:19:50] simon: it did go in [10:20:00] question is, should it not have? [10:20:18] well yes, but why is that necessary? [10:20:20] jhutz: that's not a big deal to do, and as you are the person i'd most expect to object, .... [10:20:22] ==jhutz - that's my thoughts. If we're preparing for an update, then we should be preparing 1.4 to live alongside 1.6. [10:20:25] I mean, for such a short-lived thread [10:20:40] I think we will want the volume tag support to make it easier for sites to test 1.6 and roll back to 1.4 [10:20:51] but we're talking about 3 different things at once at the moment. So I'll let you and deason sort that one out, then return to this and locking. [10:20:56] deason: if the goal is to collect a clean core, making things risky wrt hold bits seems unwise? [10:21:03] does the voume tag change break moving volumes back to a 1.4.11 server? [10:21:08] i'll take the volume tag as a todo and do it [10:21:09] --- haba has become available [10:21:13] edgester: no [10:21:19] ok, them I'm fine [10:21:21] it just means we accept more tags [10:21:42] hold bits? but the thread doesn't touch the host package; it sleep()s, ViceLog()s and assert()s [10:21:55] Yeah; we'd accept all but unrecognized critical tags and just do nothing with them, but we wouldn't generate any new tags in 1.4. [10:21:57] so, it's a better stepping stone for 1.6, yes? [10:22:20] For 1.6 and beyond, yes. [10:22:25] what happens if the thread gets started during startup, and other threads are started later? (i guess pthread_hinum doesn't get incremented) [10:22:28] k [10:23:27] I thought if we don't hit setThreadId, we don't affect the hold bits at all [10:23:37] meh… on the thread count issue, yeah, i guess it's not worth worrying about [10:23:47] I mean, the vparattach stuff already doesn't deal with this, I thought [10:24:23] I was just worried about losing a worker thread; if we need to lose a worker thread for it, then I'm not really for including it [10:24:50] X is broken isn't generally a good argument for continuing to be broken, however, it's probably not worth worrying about [10:24:54] pushed [10:24:56] --- tkeiser has left [10:25:25] --- tkeiser has become available [10:25:37] So, returning. Are we going to pull in the volume tags work? [10:25:46] i assume so [10:25:51] Cool. [10:26:01] I think Linux file locks is a pretty obvious bug fix. [10:26:15] i agree. (hence why i pulled it) [10:26:27] The code there was very broken, now it's not. I don't think there's anything controversial in it. [10:26:30] exactly [10:26:34] ok [10:26:40] I guess it's possible that it might break 2.2... [10:26:41] if no one's shouting, i don't care [10:26:55] 2.2 might already be broken. meredith finally has 2.4 so my last test box is gone [10:27:04] dumptags is now 993 [10:27:25] If you really want a machine with a 2.2 kernel, I can probably spin up a VM for you. [10:27:49] another thing... there's also (from awhile ago) 125270 [10:27:55] at this point if no one else is complaining we can let it go [10:27:55] I don't _want_ another platform that have to test on. I'm really inclined to see if anyone screams, then take it from there. [10:28:15] 125270: looking [10:28:59] can we defer that and discuss the rest? [10:29:03] Yeh. We should probably do something about that. [10:29:14] sure, just wanted to make sure I mentioned it [10:29:19] ok [10:30:04] so is there any controversy about salvager chdir() or rx ENETUNREACH handling? [10:30:20] if not, we can move on to the grudge match [10:30:24] i mean, the linux discussion [10:30:47] I'm happy with both of those. [10:30:52] I think it's worth doing something about 125270 [10:30:52] I was concerned if you got all of the fixes for the salvager chdir() thing, but as far as I can tell they're all there [10:31:07] i diffed 1.5.x salvager to ensure i did [10:31:16] though one didn't get pushed before i went to sleep [10:31:25] I haven't reviewed either of those two items, but I can't think of any reason I'd object [10:31:43] taken as the whole of the relevant patches, both are fairly obvious, to me [10:31:46] fine [10:31:55] so, let's move on to the list i sent out [10:32:00] fh-based linux cache [10:32:09] i don't believe this is required for the VFS changes later [10:32:16] simon can correct me if i'm wrong [10:32:23] No, it isn't. [10:32:31] so the question then is does any newer kernel stuff need it [10:32:38] and i believe the answer there is also no [10:32:41] Are we under the impression we _can_ avoid it? [10:32:46] --- haba has left [10:32:48] that's what i'm asking [10:32:54] i think if we can avoid it we should [10:32:55] I'm just looking. [10:33:26] For the time being, we can. [10:33:29] I agree, in the name of keeping stable stable. That might mean we eventually hit a kernel version where 1.4 doesn't work. I'd sort of like not to be there before 1.6 is released... [10:33:38] agreed [10:33:40] It limits the set of filesystems we can play with. [10:33:44] sure [10:33:48] to what the limit always was [10:34:03] IMHO that's OK, as long as ext* is on the table. [10:34:14] if this me vanishes, btw, it means i am changing batteries. since it looks like that' [10:34:17] s going to happen soon [10:34:20] If people want to use something exotic, they can run exotic OpenAFS, too [10:34:32] That's OK; there's two of you, right? [10:34:34] Is ext4 exotic? [10:34:37] yes [10:34:38] --- haba has become available [10:34:43] imo [10:34:56] --- tkeiser has left [10:35:05] really, to me, unexotic is ext2, maybe ext3. that's it [10:35:19] So, unless the world changes we don't need it. [10:35:22] does anyone want to advocate these go in? [10:35:23] ted tso says that ext4 is still very unstable. use it if you are ok with losing data [10:35:36] hearing none.... [10:35:39] --- tkeiser has become available [10:35:45] when did Ted say that? Distros are starting to use it by default. [10:35:58] let's skip VFS for a second, and go to keyrings, because it's smaller [10:36:06] Okay. [10:36:11] Ted said it when I met him at the Storage Developers Conference in Sept [10:36:18] 7b27217 changes to keyring-only pags, without using groups, for kernels 2.6.29 and newer [10:36:29] I don't think ext4 is actually an issue, it's not got excessively large inodes. [10:36:33] But yeah; I think it's OK to continue to recommend that machines running 1.4 use ext2 for the cache filesystem, and grudgingly support ext3. [10:36:59] I don't think we should take 7b27217 for 1.4.x [10:37:00] (i use and advocate ext3; ymmv) [10:37:43] So, we already use keyring-based pags on newer platforms, right? So 7b27217 only makes us stop maintaining groups on those platforms? [10:37:43] ext4 is the default fedora 11 & 12 fs, isn't it? [10:37:43] --- deason has left [10:37:48] I'd say Keyring PAG only will make things less confusing. [10:37:50] --- adeason has become available [10:37:54] --- adeason has left [10:37:55] No 7b27217 makes keyrings win over groups. [10:38:00] the only case i could see taking 7b27217 is if we were concerned for e.g. "In some situations such as background writeback and pre-fetching, this means that we'll now do it with the right credentials, even when in a PAG." [10:38:16] You mean, we use the keyring pag and ignore what's in groups? [10:38:19] That's not 7b27217, though. That's a different problem, which we solve differently. [10:38:21] background writeback won't matter for 1.4 [10:38:26] --- deason has become available [10:38:29] jeff, yes [10:38:37] "Don't use groups at all to determine PAG membership." [10:38:51] background writeback does matter for 1.4 - that's what mmap() ends up doing. [10:39:08] I dunno. I like moving in that direction, but in the name of stability, I have to agree with Simon. [10:39:13] The big thing that 7b27217 fixes is the race that deason reported. [10:39:15] simon: ok, then do you think it's worth separating into 2 changes? [10:39:26] I don't know if I'd call that a 'race' [10:39:40] e.g. take what is described in the log as 1 but not 2? [10:39:42] and the issue I reported was with some obscure application, and has already been fixed other ways [10:39:54] so, this isn't necessary to fix that [10:40:07] i'd rather not give people the rope to hang themselves with (a write which never gets written) [10:40:11] That log comment isn't correct. [10:40:21] ok, hang on, reading [10:40:34] Or, at least, we can solve the problem in a different way. [10:40:49] i agree with your second statement [10:41:13] But all of this is also tied in with Marc's change to get rid of an AFS specific cred structure, and use the kernels instead. [10:41:20] I think all of this is too much change for stable. [10:41:30] ok [10:41:47] reading this, i think i agree: pulling that in is too heavyweight for 1.4.12 [10:42:48] There are a fair number of other patches you'd need to pull to get that one to work. I think it's definitely what we want for 1.6, but too risky for 1.4. [10:42:55] so do we need to reimplement part of this to go with the larger VFS changes (assuming we take the larger VFS changes in some form) [10:43:35] No. We make the VFS change that stashes credentials work slightly differently. [10:44:02] Basically, you stash the UID, rather than the whole credentials structure, and use that directly. [10:44:03] fine. i'm thinking eject this then unless one of you screams otherwise? [10:44:27] and, hearing none... [10:44:54] quickly, i assume c4dfacc cm: address race condition in afs_QueueVCB is not controversial. it means we conform to our lock heirarchy [10:45:07] so when do you get rid of groups as pags and why is this tied to a race condition fix?> [10:45:16] haba: in 1.6 [10:45:49] meta-point: We should discuss, not here, whether groups disappear completely. [10:45:50] > it means we conform to our lock heirarchy my understanding is that it's a workaround since we are inconsistent with parts of the lock hierarchy [10:45:52] c4dfacc has been running in production for some time now. [10:46:18] deason: effectively yes, but we don't do 2 different things in differnt code areas with the same set of locks [10:46:22] Yeh - what it actually does is ensure we can't sleep without holding the correct lock. [10:46:56] It hugely changes the frequency with which afs_QueueVCB is called. [10:47:07] true [10:47:07] what I mean is, there is an instance where the locks are acquired in the wrong order, but fixing it was deemed too big of a change for 1.4 [10:47:25] yes, that is also true, and would be trivial to fix (in terms of stats), iirc [10:47:28] I really should actually read that patch and independently decide whether I think what it does is right. The lock heirarchy there is messy. But I doubt I will have time to do so, so no objection for now. [10:47:29] --- tkeiser has left [10:47:33] Yeh, I think this is the minimal fix that jhutz, me and you guys thrahsed out on jabber. [10:47:50] istr yes [10:47:50] yeah [10:48:04] (so if any of you object, go stab yourself in the eye) [10:48:10] So, no objection, beyond a general yuck at our locking hierarhcy. [10:48:11] --- tkeiser has become available [10:48:13] the issue is AFS_STATCNT(afs_QueueVCB); is called a lot more now. [10:48:15] well, yes [10:48:36] so that leaves us with: ca61359 viced-host-uuid-and-addr-hashing-corrections-20090530 and the linux VFS/VM mess [10:48:53] shall we just move it? just move it like five lines down and it'll be fine [10:48:58] meffie: The stat count doesn't concern me. The overhead of that, compared to the function call is negligible. [10:49:18] yeah, i'm with simon, i'm having a hard time caring about the stat count [10:49:19] It's not like we actually take locks around stat counter increments or anything ... [10:49:22] oh, I thought we were concerned on the stats being wrong; if we don't really care, then nevermind [10:49:48] the stats are already incomplete [10:49:53] (maybe not 'wrong' but 'different' from what they were before) [10:49:57] The fetchstore changes pretty much broke consistent stat counting. So, not that bothered. [10:50:09] ok [10:50:15] we also have routines which simply never got any stat added [10:50:20] or reused someone elses [10:50:35] in any case, i think we can just port it and go [10:50:40] Cool. [10:50:53] so we have the viced hashing, and the linux vfs issues [10:50:56] So, what exactly does ca61359 do? [10:51:56] And WTF do I have to do to get gitweb to show me a commit given I know the number? [10:52:15] what number? [10:52:15] ca61359 (viced hashing) hashes all combinations of host/port for a uuid, as well as the uuid, and properly tracks when host/port pairs are added or dropped [10:52:27] ca61359 i assume [10:52:52] This is a fileserver change? [10:52:57] it is [10:53:07] hence viced hashing... [10:53:16] http://git.openafs.org/?p=openafs.git;a=commit;h= will give you the commit for [10:53:25] Er, yeah. So, why is this needed? I am loathe to touch host-tracking in stable [10:53:28] http://git.openafs.org/?p=openafs.git;a=commitdiff;h=ca61359 [10:53:55] Ah, deason got there first. Why the search box can't handle SHA1s is beyond me, though. [10:53:57] Right, so, that sucks. [10:55:32] Yeh. Write a patch :) [10:55:41] That is a big patch. Too big for me to review, but I guess that's OK, if others have. [10:55:48] But, why do we need it in stable? [10:55:56] i could make it smaller by undoing all the log changes, which i would [10:55:59] is there a bug that ca61359 fixes? [10:56:07] 124634 [10:56:25] Undoing the log changes would make it significantly easier to review, I think. That should have been separated from the beginning, but too late now [10:56:42] you'd have to ask mike for the details [10:56:46] (of the bug) [10:57:18] Yeh. That bug is just "here's some code", not "this is the earth shattering problem that my customer is having" [10:57:46] I think in order to be able to evaluate the risk/benefit, we need to know what the real world benefits actually are. [10:57:57] meffie: Can you elaborate? [10:59:11] From that ticket: - optimizaton - refactoring - eliminate unnecessary assert - comment - saves a few clock cycles Those sound like things that don't need to go into 1.4. There are a couple of things mentioned that might be relevant [10:59:21] reviewing, it has been a while.... [10:59:33] Actually, given the author is listed as jaltman, rather than Mike, does the code we're discussing match the code that Mike submitted, or was it rewritten? [10:59:56] ticket shows it was altered by jaltman, possibly by derrick as well [11:00:07] it was rewritten [11:00:19] i believe both jeff and i did work on it [11:00:32] Were the hash table entries broken in a way that the viced lost clients? [11:01:08] i believe the opposite: host/port mappings not removed leading to clients being "merged" when a host/port were reused. but mike will have to comment [11:01:12] --- tkeiser has left [11:01:19] and i will have to swap batteries. 4% [11:01:22] --- Derrick Brashear has left [11:01:47] we were trying to track down a vice client traking bugs [11:01:48] --- tkeiser has become available [11:02:53] if i recall, there was one hunk that was a bug fix, the rest was optmizations [11:03:27] that was before the rewrite? [11:03:30] I think the problem is that the patch has been through so many authors, it's kind of hard to tell what's what any more. [11:03:45] I'm against taking it in its current form. [11:03:56] That, and violation of "one patch, one change" [11:04:12] I'd consider taking a change that is only the bug fix. [11:04:20] Yes. [11:04:45] I agree that the patch needs to be refactored to 1.4 as was done with prior patches of this nature. [11:04:59] It doesn't need refactored. [11:05:07] It needs to be clear what the problem is, and how the patch fixes it. [11:05:09] --- Derrick Brashear has become available [11:05:18] No, it needs to be split up, and only the bugfix part used at all [11:05:28] And ==sxw [11:05:36] The primary bug that I see being fixed in the patch is the removal of the old addr/port followed by the addition of the new addr/port. If the old addr/port was the only addr/port, the host was marked deleted [11:06:06] that is at line 1634 [11:06:23] (back, btw) [11:06:28] now, wih pretzel [11:06:43] Is the fix important enough to get into 1.4? [11:07:28] there are other fixes as well. the block at 1858 [11:07:49] i'm leaning toward no unless mike has somthing compelling? [11:08:26] sorry was reviewing [11:08:28] there are perhaps others. I would need to go back and review this in depth after extracting out all of the log changes and the purely optimizations [11:08:41] --- deason has left [11:09:18] --- tkeiser has left [11:09:25] So, before we move on to Linux VFS, can we talk about timing? [11:09:36] Like, when did you want to do a release? [11:09:41] "not over christmas" i assume, tho i'd like to have a 1.4.12 rc out rsn [11:09:54] It would be nice to do an rc by the end of the week. [11:09:59] ideally, if we can agree on this stuff, before the weekend is over [11:10:00] --- tkeiser has become available [11:10:01] And then let it bake over the holidays. [11:10:03] --- deason has become available [11:10:07] friday, if we're really ambitious [11:10:40] I guess the question is whether we can get builds done. That's the real benefit of the rcs. [11:10:48] I can do my usual Linux set. [11:10:50] Can we do an rc1 without ca61359/124634, and issue another if Jeff comes up with a more focused patch for 1.4? [11:11:17] yes, i think so. [11:11:19] well, if someone does. it doesn't need to be jeff particularly [11:11:27] Oh, true; it doesn't need to be Jeff [11:11:35] I think the 'does anyone care to do the work' test probably applies here. [11:11:37] i'm ok with that [11:12:30] ok, so... should we move onto linux vfs? [11:12:52] If we must [11:12:57] yeh :) [11:13:20] ok. so... do you want to drive this one, simon? [11:13:26] Okay. [11:13:32] well, probably not. but will you anyway? [11:13:51] 3abc87a seems trivial, low risk, and makes us work 'properly' with current kernels. [11:14:26] The rest are more dangerous. [11:14:40] We're very broken with regards to the _current_ Linux VFS. [11:14:47] In some places, it's surprising we work at all. [11:14:48] --- haba has left [11:15:06] But, we do work, and I'm afraid of breaking 2.4 (and earlier) with some of these fixes. [11:15:07] --- haba has become available [11:15:52] Many of these are hard to take in isolation - for example, we have to refactor writepage_sync in order to fix the fact that write() can get return values it really doesn't want or need. [11:16:01] So, I'm not sure. What do other people think? [11:17:59] Oh, and a number of them rely on the fact that in 1.5, we have completely separate trees for 2.6, and 2.4/2.2 [11:18:04] well, in some sense i'd rather give people an "old" and a "new" linux port, but those people could just run 1.5 [11:18:28] I don't know. What is the danger in not taking these changes? At what point do we stop working/ [11:18:46] Is 1.5.x good enough for those Linux kernels that need it? [11:18:51] The danger in not taking these changes is that we are slower, and deadlock more often. [11:19:06] None of them affect whether we work or not in normal operation. [11:19:10] Same comment as earlier -- I don't want to tell people they have to run unstable because their kernel is too new for 1.4 and we don't have 1.6 yet [11:19:13] No kernel needs 1.5 [11:19:36] "deadlock more often" sounds not stable. [11:19:54] But we're not seeing bug reports to back that up. [11:20:14] We deadlock more often, but that doesn't mean we deadlock often. We rarely see reports of this problem. [11:20:45] The only one of these changes that's bug report driven, is the one that does partial writeback with the correct credentials. [11:20:50] people will run "stable" on bleeding edge kernels and then blame AFS for being a problem. Oh, how often do people report deadlocks anyway? [11:20:52] Which makes me lean towards not pulling to 1.4, despite the performance hit [11:20:53] would people know it was us to report it to? [11:21:30] just curious, are there any kernel IMA fixes in queue for release? [11:21:37] haba, this isn't about how bleedy the kernel is [11:21:39] that's the crux of the question i have: we aren't hearing we need it, but that doesn't mean we don't. [11:21:42] haba: This isn't a bleeding edge kernel issue. Our support for bleeding edge kernels is the same between 1.4 and 1.5 [11:21:53] jason: it's "hard" to fix ima until the kernel is done evolving [11:22:04] ima will be fixed in 2.6.33 [11:22:04] Derrick Brashear: ok [11:22:12] Simon Wilkinson: thanks. [11:22:21] so formal! [11:22:28] Marc and I convinced them that their interface needed thought. [11:22:34] cool [11:22:49] Just being direct [11:23:06] you can directly call me derrick, shadow or hey you [11:23:09] So, 70c8dea is the one that fixes partial writeback, yes? [11:23:12] My fear about the VFS changes is that they are pretty big changes, in the write codepath. They're not what I would consider ideal candidates for a stable release. [11:23:15] there are very few reports from users of deadlocks on Windows. 99.9% of all deadlocks/crashes on Windows have come from the Windows Error reporting service. I do not think we can make an assumption that if we are not hearing of a problem from even one end user that it is not a problem. [11:23:30] --- tkeiser has left [11:23:42] --- tkeiser has become available [11:24:26] jhutz: Yes, [11:24:52] I think we are pretty confident that the deadlock in question is rare. [11:24:57] I agree that these changes are very significant. Our target for a 1.6 release is RCs in single digit months. I suspect that getting 1.4.12 out the door without these changes is the right thing to do. If we believe that 1.4 should get these changes, we should start now to pull them in and test them for 1.4.13 [11:25:12] I hade crashes which might have been deadlocks, but I do not have any evidence that would make a bug report to neither kernel nor AFS. So my guess is that the patch actually is needed. [11:25:24] So, looking at 70c8dea for a moment, since that _is_ bug-report-driven... How extensive is the change? What else does it depend on/ [11:25:31] The deadlock that we know about only occurs when you mmap a file larger than your cache. [11:25:46] 70c8dea is only exposed when you fix the other deadlocks, sadly. [11:26:14] Oh, well then. [11:26:14] In normal operation, you will deadlock before you manage to contact the fileserver with the wrong credentials. [11:26:26] "you've already lost" [11:26:56] I have users with files > cache. Fortunately the ones running FORTRAN are not so mmap() happy. But the others. [11:27:11] Not files > cache. *file* > cache. [11:27:43] harald, one file >> whole cache [11:27:44] files with all of them > cache [11:28:13] each [11:28:23] And even then, what you'll see won't be a crash. [11:28:36] There's probably a simpler fix for 70c8dea [11:28:38] In any case, does anyone disagree with Jeff's reasoning (target 1.6). [11:28:41] harald, is there a reason a 1.5 client would make them hate life? [11:29:00] jeff's position seems reasonable to me [11:29:03] shadow: I don't know [11:29:18] I think the problem is uncommon enough to leave the fixes out of 1.4.12. If it starts to look like 1.6 is further out than expected, we can start talking about whether to do something for 1.4.13. [11:29:20] can we assume they can and revisit if they can't? [11:29:27] i.e. ==jaltman [11:29:35] I think that seems like a good idea to me. [11:29:48] yes, it seems so. [11:29:50] alright, let me summarize, then [11:29:55] (shadow: give me a good value of X for 1.5.X) [11:30:01] 1.5.68 [11:30:05] for Linux [11:30:14] How much faster would you like things to go? [11:30:16] --- deason has left [11:30:37] to do: RT 125270 3abc87a c4dfacc everything else gets deferred. [11:30:40] --- adeason has become available [11:30:52] That sounds right, I think. [11:31:46] We should probably reconsider the VFS patches for 1.4.13, in case the world has changed. [11:31:55] Sounds like a plan. [11:32:07] i will have 3abc87a in a moment [11:32:08] --- adeason has left [11:32:22] --- deason has become available [11:32:28] haba: 1.5.68 dramatically speeds up Linux reads, and slightly speeds up writes. [11:32:54] cache reads are now in the same order of magnitude as reads from the backing cache, which didn't use to be the case. [11:33:45] 3abc87a is now 994 [11:35:16] We should talk about Andrew's bug. [11:35:21] And the other one in the same queue. [11:35:28] "Andrew's bug"? [11:35:52] Andrew has had lots of bugs in its history. [11:35:55] :-) [11:36:01] 125270 [11:36:11] and 124740 [11:36:15] i forget, is this log public? [11:36:20] Probably, yes. [11:36:29] no [11:36:41] oh, this log? [11:36:48] yes, it is [11:36:55] 125270 is not public. [11:36:56] i guess neither of these security issues is particularly a big deal [11:37:03] this log, not the bug [11:37:12] 125270 is a definite denial of service issue. [11:37:27] 124740 is pretty minor information disclosure. [11:37:31] Neither have fixes in 1.5 [11:37:34] sure. but there are plenty of other ways i could attempt such (DoS) if i cared [11:37:41] Yup. [11:37:42] --- tkeiser has left [11:38:00] so, i'm not inclined to be overcareful talking about this one [11:38:09] So, not for 1.4.12? [11:38:10] --- tkeiser has become available [11:38:15] for 125270? I'm not aware of other ways to possibly screw the client disk like that [11:38:15] why not? [11:38:30] i assumed we'd fix it in 1.5 when it also appeared in 1.4? [11:38:51] (125270) [11:38:57] Ah, sorry, misparsed your sentence. [11:39:10] 125270 should have a fix in the ticket, as I recall, but it's old and I'm not sure if that still applies to the code anymore [11:39:19] I also don't spend a lot of time in that code anyway, so... [11:39:39] the fix there also has bug simulation in it [11:39:50] I see no reason why it shouldn't be fixed [11:39:57] it being 125270 [11:40:29] 125270 i believe we should fix. i have no position on 124740, which discloses stat() information to another user of a multiuser client [11:40:30] Yeh. And the non-cache-bypass portion of the fix is less than 10 lines. [11:40:49] 124740 is trivial to fix - you just have to check for l permission before returning from getattr [11:41:14] is the failure mode in the 125270 okay? I think it just hangs forever, iirc [11:41:30] "in the 125270 fix" [11:41:54] If fixing 124740 is as trivial as that, let's do it. I'm just concerned that we don't inadvertently overrestrict information. And there's _nothing_ in the ticket about how or why the problem arises or what the fix should be. [11:41:57] It should return EIO, I think. [11:42:39] I haven't read the 125270 fix, but it seems simple enough. If the fileserver responds with a wrong length, return EIO. [11:43:09] ... and don't retry the call, either [11:43:15] what does "return -31" mean? [11:43:28] They're "magic" error codes in the Unix CM. Don't ask. [11:43:49] It was like that when we bought it. [11:43:58] -31 is ____.225 :-) [11:44:15] fs-cm-spec.h 2051 \n -31 An Rx write into the stream ended prematurely. [11:44:20] The file server is permitted to return length within the range 0.. [11:44:22] --- tkeiser has left [11:44:32] Yes. [11:44:44] Yeh - the error is when the fileserver returns too much. Andrew's patch does that. [11:44:58] but if it returns a length longer than that, we should just EIO immediately and not even read the data, IMHO. [11:45:01] The problem is that at the moment it just returns the error code from the call, so never actually errors out the connection. [11:45:15] --- tkeiser has become available [11:45:39] agreed. [11:46:12] I have to go. I will check the logs when I return. [11:46:12] 124740: Take a directory that is system:anyuser l. Do an ls -l in that directory as the anonymous user. Then, do an ls -l as a user with r permissions. Then, do an ls -l as an anonymous user again. Observe the difference. [11:47:01] didn't get to talk to the vendors. no door prizes for me. oh well. [11:47:40] mix? [11:47:44] no [11:47:48] So, is someone going to write a revised fix for 125270? deason? [11:48:07] I know what the problem _is_. There's no analysis of why it happens. [11:48:11] I can if you want, but others are more familiar with the CM fetching stuff than me [11:49:00] I think the first time I tried that fix, I was a bit confused by the overall code paths in there, and the magic error codes [11:49:30] jhutz: The why is whether the directory is stat'd or not. If it's stat'd, then we have up to date information in the vcache that we return to all comers. If it's not stat'd, then the ls does a stat request, which is failed by the fileserver (because it's from the anonymous user). [11:49:39] it happens because we don't bother to enforce whether the caller is supposed to be able to read it, and just return, in getattr [11:49:56] e.g. should fetchInit return EIO directly, or does it need to return something special such that a higher level interprets it to return EIO? [11:50:40] This patch is going to be completely different for 1.4, too,. [11:50:44] No fetchstore on 1.4, [11:50:59] yeah, there's that, too [11:52:30] The error code from fetchInit ends up in afs_Analyze [11:53:29] If you just return EIO, then it will make its way up to the top level. [11:53:30] i can look at it when i am somewhere with AC if you like. [11:53:36] oh god, in 1.4 it's just all in GetDCache? [11:53:41] Oh yes. [11:53:45] The longest function ever. [11:53:48] or if you want to write something i can review. whatever [11:54:22] do you want things to just go to the ticket, and not gerrit yet? [11:54:37] I think it can probably go to gerrit. I don't think it's that sensitive. [11:54:48] you can stick it in gerrit now [11:54:56] we talked about it here, so.... [11:57:04] okay [11:57:12] and yeah, I'll do it unless someone else wants it [11:59:22] If you're happy to do so, go for it. [12:00:07] It might be worth splitting the patch into one that fixes 'normal' fetches, and one that does cache bypass. [12:00:30] then we can "cherry pick" (ha ha) just the one [12:04:13] i lie. c0323c49 is already the pulled up c4dfacc [12:04:35] so the only action items left are either in the queue or the 124740/125270 issues [12:14:56] --- Derrick Brashear has left [12:16:24] --- edgester has left [12:24:34] --- Derrick Brashear has become available [12:31:53] --- tkeiser has left [14:35:24] --- Derrick Brashear has left [14:36:26] --- Derrick Brashear has become available [14:54:01] --- haba has left [14:54:40] --- haba has become available [15:09:40] --- Derrick Brashear has left [15:10:58] --- deason has left [15:32:17] --- Derrick Brashear has become available [15:37:13] --- matt has left [16:46:12] --- Derrick Brashear has left [17:32:04] --- haba has left [17:32:07] --- haba has become available [17:38:31] --- meffie has left