[00:45:26] --- Stephan Wiesand has become available [04:30:45] --- shadow@gmail.com/barnowlDDCA2172 has become available [05:59:58] --- Simon Wilkinson has become available [06:35:25] --- paul.smeddle has become available [06:59:34] --- Derrick Brashear has become available [06:59:43] --- ktdreyer has become available [07:00:06] Hi all [07:00:47] hello [07:00:49] hi dr nick [07:01:04] er, i guess that would be the response to "hi everybody" [07:01:33] I'm feeling dumb. [07:01:51] derrick: I'm going to have to revoke your mail-order certificate in Doctorology [07:02:09] getting worse... [07:02:30] Stephan: "Simpsons" reference [07:03:00] http://www.youtube.com/watch?v=YlmECL2ED2I [07:03:18] --- kaduk@mit.edu/barnowl has become available [07:03:19] Thanks! That explains why I have no clue what you're talking about :) [07:03:29] don't watch bad american TV. good move. [07:03:38] Sometimes having no clue what Derrick's talking about can be a good thing. [07:06:03] I was expecting Andrew? [07:06:16] i assumed he would be here. [07:06:22] He does know about the meeting -- I spoke to him yesterday [07:06:37] no Mike either, hmm [07:06:45] not currently running jabber. [07:07:42] Maybe hope for a few more minutes? [07:07:53] Is he on central time? If so, he may have just woken up. [07:08:01] I had to rejoin the muc since something wonky happened to MIT<-->OpenAFS federation. [07:08:20] mike is eastern. andrew is central. [07:08:22] Though I think the blame for that lies on MIT's side, not OpenAFS's. [07:08:36] And meanwhile, ask whether anyone can explain to me why we had to rebase gerrit 8551? [07:08:48] I was having issues joining the main openafs room yesterday [07:08:50] hang on, looking [07:09:38] rebase? it looks like it failed to compile entirely. [07:09:39] sorry, it was 8550 we had to rebase [07:10:06] 8551 failed to compile on opensuse and freebsd, but successfully compiled elsewhere. 
[07:10:13] --- deason has become available [07:10:29] hello Andrew [07:10:32] Path conflict means that one of the files touched by the patch has been changed [07:10:50] 8550 had to be rebased to deal with 8549 which changed afs_analyze.c just before [07:11:08] it should have been pushed to gerrit as a stack on top of 8549. oh well. [07:11:16] clearly nbd [07:11:58] --- meffie has become available [07:12:24] We don't let gerrit autoapply changes in that situation, in the same way as command line git does, because it proved flaky in the past [07:12:44] ah, when I was cherry-picking I missed that it didn't have 8549 formally set as a dependency [07:12:51] sorry for tardiness [07:13:01] I'm not seeing a compile log for the freebsd buildbot for gerrit 8551... [07:14:23] i'd have bet it expired, but the page looks wrong for that. i resubmutted 8551 [07:14:33] pretend i can type [07:15:24] is there an up to date list of what all we need to discuss today? [07:15:52] I'd like to add library changes for perl-AFS into the mix [07:16:09] hi, sorry I'm late as well (reading scrollback...) [07:16:30] 8585, 8604 should be added to the email sent out last week. 8604 is needed for linux 3.7. [07:17:17] there's 8548, too [07:17:56] and the open issue around 8512/13 [07:18:33] What should we talk about first? [07:18:46] let's start with 8585 [07:19:11] 8585 causes logged errors to not be lies. no side effects. [07:19:28] Seems pretty obviously an isolated bugfix. [07:19:32] +1 [07:19:35] i favor not lying especially when it means helping people debug will be easier. [07:19:59] I did build and test 8585 yesterday. My test fileserver was fine. +1 [07:20:03] yes, trivial [07:20:07] yes, should be fine [07:20:10] So, agreed. [07:20:51] next: 8548 [07:21:37] any objections? 
[07:22:07] Makes me nervous [07:22:15] Can't comment on the correctness of the behaviour [07:22:20] note that this fixes a regression introduced some time in 1.6 [07:22:50] I don't think there's a problem with it per se - but the whole BUSY behaviour in RX is a massive can of worms [07:22:53] --- meffie has left [07:23:17] --- mmeffie has become available [07:23:27] --- mmeffie is now known as meffie [07:24:33] Andrew, can you pinpoint the change introducing the regression? [07:24:41] Andrew: is the regression easy to reproduce? [07:25:47] It was introduced as part of a88e9a24 [07:26:46] yes, that's it [07:27:54] kaduk@mit.edu/barnowl: you need to initiate a call on an in-use call channel, but the extant call needs to have an error [07:28:32] it's easy enough to reproduce by causing the fileserver to timeout and abort with VNOSERVICE, but there are other ways [07:30:19] a88e9a24 seems to be part of 1.6.1? [07:30:21] that's maybe not too serious, though, since only newer clients even do anything about BUSY packets anyway [07:30:31] Anything idle deadish will trigger it [07:30:35] Would it make sense to clear call->error after saving old_error and then checking for any error (restoring old_error at the end of the block)? [07:31:11] (Well, before unlocking the call lock) [07:31:16] No, because if you clear call->error, the call stops being in an error state [07:31:25] The rxi_WaitForTQBusy() is while (!call->error && ...). How can old_error be an error and be different from call->error after the function call? [07:31:36] I thought that wait was just for when the call enters an error state from a non-error state anyway [07:31:56] WaitForTQBusy can drop the call lock [07:32:25] simon, are you responding to me? [07:33:16] Postponing 8548 will not cause a regression in 1.6.2, right? [07:33:49] from 1.6.1, no. 
[07:34:11] jeff is saying that just testing for old_error is sufficient, simon; there's no way the call error can change if there was an error before waiting [07:34:39] Yeah, because we don't drop the call->lock if we're in error. [07:34:43] it's a regression from before 1.6.1 (either 1.4 or 1.6.0) [07:35:08] And it makes Simon nervous. Defer to 1.6.3? [07:35:20] well wait a second [07:35:24] It's making me less nervous after this discussion [07:35:43] yeah, I mean, if the call is in error, the code path is pretty clear, since we don't do anything [07:35:47] i still favor pulling it in [07:35:48] (less nervous when phrased as a check on old_error?) [07:36:16] Yeah, I think that check could be better written as "if the call is in error, and it wasn't before" [07:36:53] but for now, that's not an issue with correctness, but clarity [07:36:53] Rather than as "if the call is in a different error state than before", which is how I originally read it. It's just a code clarity thing, though [07:37:44] I'm in favor of pulling it in. [07:37:50] the comment suggests such, but the code suggests the error changed [07:37:55] I was trying to restrict the change as much as possible to just the situation I was interested in; I wasn't sure if WaitforTQBusy would always behave that way [07:37:57] If the call is already in error, rxi_WaitForTQBusy is a no-op, right? So the code path could be clearer if there was an error check before calling WaitforTQBusy? [07:38:48] I don't trust the comments in RX. They're almost always wrong. [07:39:42] This would be clearer.... if (!call->error) { rxi_WaitforTQBusy(call); if (call->error) { ... } } [07:39:54] yeah [07:39:58] Actually, that bit of WaitForTQBusy may not be correct. I think the transmit queue can be busy if the call is in error. But that's a whole other can of worms. [07:39:59] +1 [07:40:47] it does sound like one for 1.6.3 the way it's going ... 
[07:41:22] I don't think so; we're just discussing code clarity/style at this point, which to me suggests the functional change is agreed upon [07:41:46] Yeah, I'm happy with the functional change, with the proviso that we want to clarify this all on master. [07:42:09] Is it important enough to delay 1.6.2? [07:42:10] so +1 on the pullup as-is, and after 1.6.2, pullup new change from master? [07:42:24] yeah yeah, I'll submit something for that (jeff's proposed code looks better, sure); I'm not completely clear on if people want that change for 1.6.2 [07:42:40] Not bothered for 1.6.2 - it's just cleanup [07:42:43] ok [07:42:48] yes, okay [07:42:51] To remove the regression introduced in a88e9a24, the right thing to do would be to remove the if (call->error) check after the rxi_WaitforTQBusy() entirely [07:43:03] can we recap the bug? how serious is the regression? [07:43:41] meffie: you don't get BUSY packets if the call you're colliding with has an error [07:44:11] There was no rxi_CallError() at that point prior to a88e9a24 [07:44:22] and most deployed clients drop the BUSY's anyway? [07:45:38] not sure on the deployment rate... everything in 1.6 I think handles them; I assume that fix wasn't in 1.4 but I'm not sure [07:45:57] but the fix here isn't complex, so... [07:46:41] Okay, it sounds like the change is agreed -- cosmetic enhancement to go into 1.6.3 [07:46:54] Paul: Ok. [07:47:02] +1 [07:47:07] should we chat about 8512/8513? [07:47:22] I'm debating whether this fix should go in or the removal of the rxi_CallError() entirely [07:47:39] (I'm taking the todo for submitting the 'cleanup' fix for the busy thing) [07:47:42] i think this fix, and removing CallError is for 1.6.3 [07:48:15] Derrick, which fix? [07:48:27] 8548. not 8512 yet [07:49:28] i am happy with both 8512 and 8513. [07:50:00] They're not in gerrit for 1_6_x yet. 
[07:50:05] I agree with 8512; that seems pretty restrictive in what we affect [07:50:12] 8512 isn't in master yet [07:50:30] I'm waiting for Simon's blessing [07:50:59] yeah. simon's is the commentary i'd like to see here. [07:51:14] i know they are working for Chaskiel. [07:52:40] I'm happy with 8512 [07:52:41] [buildbot verified 8551, so that's settled] [07:52:46] 8512 is what is fixing the problem for Chaskiel. 8513 sets the direction on aborts which has never been done before. Nothing checks the direction of aborts. I agree with Simon that 8513 should not be pulled to 1.6.2. I would only like to see 8512 pulled. [07:52:51] I don't think 8513 belongs in a stable release [07:53:01] (I'm not sure it belongs on master...) [07:53:16] Anyway, got to dash. Will look over the logs later [07:53:25] Thanks Simon. [07:53:43] So, we pull 8512 but not 8513? [07:54:01] as soon as i can submit 8512 (needed to rebase) i will push a pullup to gerrit. [07:54:06] Stephan: sounds like ... [07:54:07] +1 on the above; 8513 at first glance makes sense to me, but it has possible implications [07:54:19] Fine. [07:54:25] next: 8604 [07:54:59] 8604: if you want linux 3.7 support it's not really optional. [07:55:08] is needed, does work. [07:55:26] Marc comments that it may require 7988. [07:56:02] I kinda assumed anders tried loading the module with this, but he doesn't explicitly say so [07:56:14] i am ok with 7988 also [07:57:16] I'm not clear on what the failure case is here if marc is correct (failing to load on 3.7, or 'everywhere' or...) [07:57:16] Delay pre1 for that? [07:57:24] I can ask Anders if the question is self-contained. [07:57:26] but it just needs to be trivially tested on linux by someone [07:58:03] I think we can 'delay' just for confirming that anders loaded the module [07:58:28] That Anders had loaded the module with 8604? With 8604+7988? 
[07:58:58] 8604 as submitted; currently it doesn't have 7988 [07:59:17] and I can just build it against an older kernel as we're talking :) so I know it doesn't break existing stuff [08:00:40] I have asked Anders for confirmation. [08:01:24] There seem to be no objections to 8604 itself, with or without 7988? [08:02:02] 7988 applies over 7915 (8604) anyway. repushed 8604, no change except now 8605 (7988) depends on it [08:03:00] no objections to 8604, just not sure on what else may be required for it [08:03:46] 7988 is trivial enough that i think taking it is fine. [08:03:48] (note that errors here will not show up in a build failure, but will show up as soon as you try to run the client) [08:04:03] oh, good point. [08:04:49] If experts are reasonably sure 8604/5 won't cause regressions with older kernels, I'm fine with both. [08:04:55] yeah, and if 7988 is wrong in any way, I'd expect a build failure [08:05:05] er, 8605 I think I mean [08:05:34] so, I would say agreed on pulling, assuming the module loads [08:06:52] I'm not in the position to run 3.7 kernels right now. Anyone? [08:07:26] i assumed ben was pinging anders. i'd venture, though, that anders is not around right now [08:07:35] I can test it by next Wednesday's meeting [08:07:44] Anders may well still be asleep. [08:07:47] 8064+8065 on Kernel 3.7 [08:07:51] right [08:08:01] okay, well, it loads for me on older kernels at least [08:08:07] i'd be content to pull them, put them in pre1, and see what reports we get [08:08:14] so, if it breaks / is wrong, I assume it breaks 3.7, which was already broken [08:08:23] So, no pre1 for another week? Or pull without testing 3.7? [08:08:28] yeah. cant be any worse than today [08:08:33] Pull without testing 3.7 is fine. [08:08:34] pull without 3.7 tests [08:08:35] pull, and let prerelease testers report [08:08:35] pull it now [08:08:46] Got it ;-) [08:09:08] 8604 verified, so it can go [08:09:23] same 8551. 
[08:09:57] --- simonxwilkinson has become available [08:10:21] 8551 merged [08:10:25] anders confirmed that 8604 permitted the kernel module to load on 3.7 last night in the openafs jabber room [08:11:05] Which I didn't see because MIT's jabberd sucks. Sigh. [08:11:10] --- simonxwilkinson has left [08:11:32] --- simonxwilkinson has become available [08:12:04] --- simonxwilkinson has left [08:12:06] --- simonxwilkinson has become available [08:13:22] Linux has been advertising kernel_thread as deprecated in favour of kthread_run for as long as I've been doing kernel stuff [08:14:13] _technically_ he didn't say it worked; I mean, it seems like he tried it and it fixed his issue, but he doesn't explicitly say that (unless I'm not reading messages correctly or missing some, which is possible) [08:14:31] yeah, I think the general change is agreed on [08:14:43] 8512 fails: /home/buildbot/slave_i386/opensuse12-i386-builder/build/src/libafs/MODLOAD-3.2.10-15-default-MP/afs_util.c: In function 'print_internet_address': /home/buildbot/slave_i386/opensuse12-i386-builder/build/src/libafs/MODLOAD-3.2.10-15-default-MP/afs_util.c:229:2: error: implicit declaration of function 'rx_GetNetworkError' [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors [08:14:44] --- simonxwilkinson has left [08:14:44] --- simonxwilkinson has become available [08:14:53] that's master [08:15:08] So, agreed on both 8604 and 8605, provided they build and work on older kernels? [08:15:20] i know. but we can't pull it up til it verifies. [08:15:36] yes, 8604 and 8605 fine (and they work on older kernels, so it's fine) [08:15:51] yes [08:16:08] Ok, anything else? [08:16:18] --- simonxwilkinson has left [08:16:26] Stephan: yes [08:16:31] er, to that first question [08:16:48] ('agreed on 8604...') [08:16:54] +1 on 8604 [08:17:16] repushing 8512 with a rebased 8451 to fix it. after that, hopefully can pull up 8512 [08:17:56] since, well, i blew that one. 
[08:18:23] If no one has any other items... circling back to the rx BUSY issue with 8548 for a moment: Derrick, are you ok with merging 8548 as-is for 1.6.2? [08:18:33] yes [08:18:41] cool [08:19:31] so I have the consensus as "merge 8548 for 1.6.2, code may be restructured later on" [08:20:04] So, we pull up everything agreed, including 8512 when Derrick has come up with a 1.6 version, and then cut pre1? [08:20:25] no [08:20:26] 8471 [08:20:29] I see we also discussed 8471 in our last meeting [08:20:31] yeah [08:21:16] derrick said he'd review, he did review; not sure if we want/need anything else [08:21:23] i don't like 8471 but not so much that i object. it's the correct fix given how things are, but it's not the correct fix, imo. but we should take it and move on [08:21:52] afsd_dynamic_vcaches is synonymous with linux, right? [08:22:11] yes [08:22:13] yeah [08:22:25] it's the hack to deal with the poor inotify() exports. [08:22:47] er, and 8471 was split up per the last meeting [08:24:26] I still need to go through Windows fixes to see what is applicable for 1.6.2pre1. I will make sure that happens before pre1 is cut. [08:24:56] Sorry, what was 8471 split into? [08:24:56] 8471->8606 for 1.6 [08:25:07] 8555 [08:25:15] 8471 was split into 8471 (pull up) and 8555 (don't pull up) [08:25:18] simon may or may not have opinions on 8471; he last touched this, but he was just combining existing functions so I dunno [08:25:35] Speaking of which, will we formally tag it next week, or "as soon as everything agreed here is merged, and Jeff's changes are in place." ? [08:25:48] I prefer the latter ;) [08:26:02] sooner is better, imo [08:26:10] give people more time before holidays to try it [08:26:10] that's what I was hoping to hear [08:26:11] it's just a pre, I wouldn't worry about accidentally tagging something too early [08:26:22] excellent [08:27:40] so do we have agreement on 8606/8471 going in? 
[08:27:41] any objections to 8606 (the newly-created 1.6 backport for 8471)? Wait for Simon's feedback and re-address next week? [08:27:54] i think 8606 should just go in [08:28:27] Yeah. [08:29:41] Ok. [08:30:45] so I've got: 8548, 8512, and 8604/5/6 [08:31:13] +1 on 8606 [08:31:14] 8512 will be a pullup, probably 8607. but yeah [08:31:33] right, ok [08:32:07] And 8585. [08:32:16] sorry, missed that somehow [08:32:20] windows login sound in my ears just now freaked me out. [08:32:33] er, that wasn't for you guys. sorry [08:32:40] heh [08:32:44] The "welcome to our wonderful software" song :) [08:33:01] so last week Andrew also mentioned a DAFS fix in 8203. Still on the table for 1.6.2? [08:33:10] radio station i listen to via internet apparently had their streaming machine crash, but that's not relevant to yinz. [08:33:48] I said I would review some dafs things by this week and I did not. damn [08:34:00] check your pants for fire. [08:34:03] There's never enough time. [08:34:04] don't worry about it holding up pre1; I'll raise something if I find something [08:34:22] 8464 was discussed last time, but it seems that's not finished? [08:34:55] yeah, i'd like andrew's comments on 8464, or perhaps the right answer is to ask Markus. [08:34:56] not for 1.6.2 currently [08:35:01] but i [08:35:03] yeah [08:35:10] was my impression [08:35:30] cool, so we'll steam ahead and try and get something out asap: if there's anything flamingly obviously required for pre2 we can pull it [08:35:49] yup [08:35:56] I assume all of the relevant bozo stuff got pulled, right? [08:36:40] i pushed them to gerrit for 1.6.x [08:36:51] they're in gerrit, i think not submitted yet [08:36:52] I did test the bozo patches for 1_6_x but they have not yet been merged [08:37:17] Paul and I have started pulling them. What's Mike submitted to gerrit is agreed and will be pulled. [08:37:34] okay [08:37:51] We got stuck on the 8551 buildbot failure. 
[08:37:59] yep, we got sidelined by a few merge and buildbot kinks [08:38:16] well, I got sidelined by the merge problem -- Stephan fixed it [08:38:24] This is all new to me - making me slow. Sorry. [08:38:50] is there a time estimate on pulling windows changes? are we expecting, like... by the end of this week? [08:38:57] for windows, I'm debating a restriction in the installer to prevent its use on Windows 7 and above. The SMB server authentication is not stable on Win7 and Microsoft has never completed the fix to their SMB client to make it stable. Any opinions? [08:39:27] I think pre1 should be tagged on Friday. [08:39:33] Excellent. [08:40:04] Sounds good to me. [08:40:07] 'not stable' means.... near-unusable? or causes issues generating frustrated bug reports [08:41:42] if you are lucky to have a desktop system with a wired connection and a static ip it is usable. any world in which the ip addresses expire or network links drop periodically results in the smb client refusing to authenticate to the afs smb server and the machine appears to deadlock all afs traffic. [08:42:00] usgs is quite familiar with this problem. [08:42:19] "it works until it doesn't" [08:42:27] yes, we are! [08:42:29] and then a reboot is required [08:42:37] the thing you mentioned in edinburgh [08:42:57] I first mentioned this more than three years ago [08:43:41] ...okay [08:43:48] 1.7 is stable enough that I believe everyone can use it which is why I think it may be time to prevent 1.6.2 from installing on win7 [08:43:49] which I assume is the thing you mentioned in edinburgh [08:43:51] The redirector has been a nice experience on win7 when I've been playing with it. [08:44:15] i'm down with no 1.6 on win7 [08:44:37] 8605 verified. [08:44:40] My only question would be whether there are people already running 1.6.{0,1} on win7 who would not/could not take 1.7. [08:45:13] I'm not going to prevent them from continuing to use 1.6.0 or 1.6.1 [08:45:14] is 1.6.1 usable at all? 
(in case someone maybe needs a last-resort option if 1.7 entirely is not an option for some reason) [08:45:45] you can run 1.7.x in SMB mode to get 1.6.x behavior if you tweak it correctly. [08:46:18] I'm more concerned with an environment where corporate IT decrees that 1.7 is not acceptable. (This is an entirely hypothetical concern for me at present.) [08:46:52] I'm more concerned with unsupported end users that get stuck with a machine that is unusable [08:47:03] at our site we moved to 1.7.x as fast as possible [08:47:17] if you are a corporate IT environment, they can modify the MSI to remove the install restriction [08:47:32] Fair enough. [08:47:58] I won't have that restriction in place for pre1 but I will get it done before the final build [08:48:30] We should note it in the pre1 announcement, then. [08:48:45] Ben, was typing exactly that. [08:48:46] --- simonxwilkinson has become available [08:48:50] say it doesn't install on Win7 and when it does it's a bug to fix :) [08:49:03] note that the final release won't [08:49:42] Jeff: noted. [08:50:03] which means that I do need to update the Windows release notes before pre1 [08:50:27] --- simonxwilkinson has left [08:50:47] Are we done for today, then? [08:51:02] i think so. [08:51:11] From my POV, yes. [08:51:24] next meeting is next Wednesday? [08:51:47] or do we want Friday of this week? [08:52:07] if we're releasing pre1 friday, we better have nothing to talk about on friday [08:52:18] also, this time is a conflict for me on fridays. [08:52:47] We'll merge all the agreed changes [08:52:58] pending a few backports [08:53:11] and Jeff can signal readiness on the windows side [08:53:20] I will send mail to release-team [08:53:29] then I guess Stephan and/or I can give Derrick the go-ahead to tag pre1 [08:53:37] ok [08:53:41] Thanks, Jeff. [08:54:02] I think we are finished. [08:54:09] I think we want a few successful tests by us before pushing the tag. 
[08:54:10] time to walk the dog [08:54:30] Stephan: fair enough [08:54:40] Thanks all. [08:54:49] Thanks everyone! [08:55:06] thank you [08:55:16] have a good day [08:56:32] Ken, do you volunteer for sending around a summary again? [08:56:45] sure, can do [08:57:06] Thanks. [08:57:08] draft at http://pastebin.com/deSZMhCk [08:58:26] I'll go offline for an hour, and then look into the draft and gerrit. [08:58:46] --- Stephan Wiesand has left [08:59:42] --- Derrick Brashear has left [09:01:02] --- meffie has left [09:01:47] ktdreyer: 8064 and 8065 should be 8604 and 8605 [09:02:03] ah rats, just sent :( [09:02:14] 8604 subject being "libafs: use kthread_run when available" [09:02:14] thanks for checking that [09:05:05] we need some sort of bot to keep track of these :) [09:06:52] or a ticketing system :) [09:07:39] we tried doing it in RT once. cumbersome [09:07:42] could make a ticket for each 'thing' and somehow mark it as blocking the release, as others do, but I don't know if you wanna do that _now_ [09:08:15] yeah. that's what we tried. [09:08:58] I think that could be worth trying again, but maybe not while we're in the middle / almost at the end of the process [09:10:21] i'd prefer better RT-gerrit integration if you forced me to do it, but lucky for me ... :) [09:52:01] --- ktdreyer has left [10:25:32] --- paul.smeddle has left [10:30:13] --- stephan.wiesand has become available [12:12:07] --- stephan.wiesand has left [12:37:29] --- jhutz@jis.mit.edu/owl has become available [14:54:31] --- Simon Wilkinson has left [16:32:29] --- deason has left [22:48:10] --- Stephan Wiesand has become available
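Note on the 8548 discussion (07:37:57 to 07:39:42): the restructured check proposed there, "if the call is in error, and it wasn't before", can be sketched roughly as below. This is only an illustration, not the OpenAFS code. The struct, the `error_on_wait` test hook, and every `sketch_` name are hypothetical stand-ins. The wait function merely models the behaviour stated in the log, that `rxi_WaitforTQBusy()` does nothing for a call already in error.

```c
/* Hypothetical stand-in for the rx call structure; only the field
 * discussed in the meeting is modelled. */
struct sketch_call {
    int error;          /* nonzero once the call is in an error state */
    int error_on_wait;  /* test hook: error that "arrives" during the wait */
};

/* Stand-in for rxi_WaitforTQBusy(). Per the discussion it loops on
 * !call->error, so it is a no-op for a call already in error. Here the
 * wait is simulated by letting error_on_wait land on the call. */
static void sketch_wait_for_tq_busy(struct sketch_call *call)
{
    if (call->error)
        return;
    call->error = call->error_on_wait;
}

/* Returns 1 only if the call was not in error before the wait and is in
 * error after it; returns 0 otherwise. */
static int sketch_new_error_after_wait(struct sketch_call *call)
{
    if (!call->error) {
        sketch_wait_for_tq_busy(call);
        if (call->error)
            return 1;   /* error appeared while waiting */
    }
    return 0;           /* no error, or the error predates the wait */
}
```

With this shape there is no need to save and compare an old_error value: a call that was already in error never reaches the inner check, which is the clarity point raised in the meeting.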