release-team@conference.openafs.org
Wednesday, October 18, 2017

GMT+0
[12:39:27] meffie joins the room
[14:59:08] wiesand joins the room
[14:59:27] <meffie> good evening wiesand
[15:00:14] <kadukoafs@gmail.com/barnowl772461F2> greetings
[15:01:17] <wiesand> Good morning
[15:01:40] <wiesand> Looks like we're in trouble...
[15:02:16] <meffie> trouble?
[15:02:30] <kadukoafs@gmail.com/barnowl772461F2> Surely not too much trouble
[15:02:49] <wiesand> I guess you read the mail about problems at notre dame with clients on EL7.4 kernels?
[15:03:40] <wiesand> Add to that that we observed the getcwd problem on such a system here too, and we have very few users on that platform yet.
[15:03:43] <meffie> ah, yes, i just saw that one.
[15:04:23] <meffie> seems getcwd problems are persistent.
[15:04:23] <kadukoafs@gmail.com/barnowl772461F2> I'm not sure whether the avenue of "try to figure out what redhat
changed in that kernel that affects us" is most likely to be fruitful.
[15:04:25] <wiesand> And then, "someone" told me that the openafs client is incompatible with those EL7.4 kernels.
[15:04:42] <wiesand> It could be serious trouble.
[15:05:53] <kadukoafs@gmail.com/barnowl772461F2> I am somewhat curious if 1.8 is any different than 1.6 in this regard,
but don't really expect there to be a difference.
[15:06:21] <wiesand> Ben: Yes, given that RH no longer provides individual patches and you'd have to go through the complete diff, difficult and tedious...
[15:06:24] <meffie> didnt andrew have some approaches in gerrit for the getcwd issues?
[15:07:48] <kadukoafs@gmail.com/barnowl772461F2> owner:adeason status:open doesn't pull up anything that looks
promising on that front
[15:08:04] <meffie> ok
[15:08:06] <wiesand> hang on
[15:08:11] mvita joins the room
[15:08:27] <mvita> ah, there it is
[15:08:30] <wiesand> https://gerrit.openafs.org/#/q/status:open+project:openafs+branch:openafs-stable-1_6_x+topic:linux-mtpt-bindmount
[15:08:43] <kadukoafs@gmail.com/barnowl772461F2> "Oh, that one"
[15:09:28] <meffie> yeah, that must be the one i was thinking about.
[15:09:42] <wiesand> I believe that in the end this is the only real solution.
[15:10:08] <mvita> I saw the email in openafs-info.  I did not have time since last week to look at the getcwd issue at all.
[15:10:10] <wiesand> But obviously pretty intrusive and a lot of work to complete.
[15:10:18] <kadukoafs@gmail.com/barnowl772461F2> Indeed.
[15:10:37] <wiesand> Mark: same here.
[15:10:50] <wiesand> But I do believe we have a problem.
[15:11:34] <wiesand> Shall we ask notre dame to try 1.6.20?
[15:12:14] <meffie> i'm not sure that would be any better, since we dont know what redhat changed.
[15:12:37] <wiesand> (backing out "shake harder" again may re-mask the issue :-/ )
[15:12:49] <meffie> ah, maybe.
[15:13:08] <mvita> I would want to try to reproduce this myself first
[15:13:33] <mvita> it might also help to see a config.log from the failing sites
[15:13:39] <wiesand> Sure, that's my plan too. If I just had the time.
[15:14:29] <wiesand> Mark: I'll get you one.
[15:14:44] <meffie> in parallel, i'll see if we can make any progress on the linux-mtpt-bindmount patches
[15:14:57] <meffie> (for the future)
[15:15:37] <wiesand> Wow, I think that's a serious project. And I doubt we have the resources required for it.
[15:16:19] <meffie> i'll find out if we can get some resources for it.
[15:16:33] <kadukoafs@gmail.com/barnowl772461F2> meffie: thank you for looking into that approach
[15:17:07] <wiesand> But maybe you can acquire them anyway… after all, I think if we have no reasonable client for EL7.4+ we can just as well give up.
[15:18:13] <mvita> it's a little early in the game to be thinking about giving up
[15:18:32] <meffie> yeah, after all, it was reported only one day ago :)
[15:19:04] <meffie> and it's not the first time we had getcwd issues.
[15:19:12] <mvita> most likely it is a broken autoconfig test in our stuff
[15:19:45] <mvita> that is, a combination of results that was unforeseen.
[15:19:53] <meffie> that is possible.
[15:20:28] <kadukoafs@gmail.com/barnowl772461F2> Who is going to respond to the email from ND?
[15:20:33] <wiesand> OK, let's see. I'm glad we seem to agree that we should take this seriously.
[15:21:06] <meffie> yes indeed.
[15:22:13] <meffie> it would be better to come from a neutral party i think.
[15:22:47] <kadukoafs@gmail.com/barnowl772461F2> Is that a coded way of asking me to do so?
[15:23:01] <mvita> you broke the code.
[15:23:10] <mvita> either you or Stephan
[15:23:13] <meffie> i need to find a stronger code.
[15:23:15] <wiesand> Ben: I tend to agree with Mark that we should try to reproduce the issue first. But we shouldn't wait too long either.
[15:24:01] <kadukoafs@gmail.com/barnowl772461F2> I mean, I could send a bland "we're looking into it" mail later today
without much trouble.
[15:24:59] <wiesand> Seems reasonable, oh fearless guardian
[15:25:58] <meffie> "fearless engineering"!
[15:26:17] <mvita> "We are working to reproduce and identify the problem"
[15:26:34] <mvita> we=release team
[15:27:21] <kadukoafs@gmail.com/barnowl772461F2> Right
[15:27:56] <wiesand> This "minor" issue aside, I think it's clear that the next 1.6 release will be a 1.6.21.2?
[15:28:17] <wiesand> With basically the Linux 4.14 fix?
[15:28:19] <kadukoafs@gmail.com/barnowl772461F2> Seems likely.
[15:28:36] <kadukoafs@gmail.com/barnowl772461F2> Since that is timely and we're unlikely to have a bind-mounts patch
ready in the desired timescale.
[15:28:51] <meffie> correct
[15:29:06] <mvita> bind-mounts is a long term strategy
[15:29:10] <meffie> yes
[15:29:17] <mvita> on the order of months
[15:29:47] <wiesand> You're very optimistic...
[15:30:29] <mvita> I try to be.  My dad taught me "Vitales never quit."
[15:30:30] <wiesand> Anyone in for a bet on what will be first, bind-mount or BER?
[15:30:41] <meffie> BER?
[15:30:41] <mvita> BER?
[15:30:47] <mvita> echo?
[15:30:55] <wiesand> that airport thingy...
[15:31:30] <mvita> sorry, you lost me
[15:31:42] <kadukoafs@gmail.com/barnowl772461F2> Ah, "Berlin Brandenburg Airport is an international airport under
construction"
[15:31:47] <wiesand> Really?
[15:32:23] <kadukoafs@gmail.com/barnowl772461F2> I have no data on how bad a "project that will never end" BER is.
[15:33:53] <wiesand> It was supposed to be finished more than 5 years ago. The most precise statement about when it's actually going to start is "not this year, probably not next year either".
[15:34:20] <kadukoafs@gmail.com/barnowl772461F2> Anyway, for 1.6.21.2, do you want to pull in the curses.m4 and
stdint.h fixes (12740 and 12724)?
[15:34:25] <wiesand> It's *really* bad. Probably unprecedented.
[15:34:48] <mvita> oh, we call that "Real Soon Now."
[15:34:56] <kadukoafs@gmail.com/barnowl772461F2> I'm sure there are some american things that could be precedent for
such delays ... though I guess german engineering has a different sort
of reputation.
[15:35:06] <wiesand> Re 1.6.21.2, no objections to those (after due review).
[15:36:20] <wiesand> These days, you enter a taxi in a far far away country, the driver asks you where you're from, you say "near Berlin", and the response is "ah, BER, hahaha".
[15:37:01] <wiesand> It's quite embarrassing, really.
[15:37:13] <kadukoafs@gmail.com/barnowl772461F2> My condolences.
[15:37:33] <meffie> heh
[15:37:45] <wiesand> I'm not convinced it will *ever* open.
[15:38:26] <kadukoafs@gmail.com/barnowl772461F2> Will London ever get an additional runway? (No bets on where.)
[15:39:16] <wiesand> Back on topic, I was really wondering why the Linux-next builders succeed for 1.8.x before the Linux-4.14 changes landed?
[15:40:12] <mvita> hmm
[15:40:19] <wiesand> BER has two runways. One is in use (by the old Schoenefeld airport), the other one is being sanitized ;-)
[15:42:08] <mvita> looking
[15:45:26] <meffie> wiesand: did you see the mail i sent about the macos client?
[15:48:07] <wiesand> er, no, I'm not aware of that one… was it to -info?
[15:48:23] <kadukoafs@gmail.com/barnowl772461F2> private mail
[15:48:38] <meffie> hmm, no it was a direct message. maybe i used the wrong address?
[15:48:40] <wiesand> let me check...
[15:49:04] <wiesand> Ah, "looking for testers"
[15:49:20] <mvita> well, it looks like the daily 1.8x build is actually building master, not 1.8.x
[15:49:39] <kadukoafs@gmail.com/barnowl772461F2> whoops.  Is that my fault?
[15:49:49] <mvita> don't know
[15:49:54] <meffie> oh, rats.
[15:49:59] <wiesand> Mark: Ah, that would explain it ;-)
[15:50:25] <meffie> ok, i'll take a look at the buildbot config.
[15:50:53] <mvita> version : BP-1.8.x-10-ge0c5a
[15:51:01] <mvita> that's the top master commit
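The version string quoted above looks like `git describe` output, which has the form `<tag>-<commits since tag>-g<abbreviated commit>`. That reading is an assumption; the log does not say how the builder produces it. Under that assumption, a minimal sketch of splitting such a string apart, so the abbreviated commit can be compared with the head of the branch a builder is supposed to track, might look like this. The function name `parse_describe` is hypothetical.

```python
import re

# Assumed layout: <tag>-<commits since tag>-g<abbreviated commit hash>
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(version):
    """Split a describe-style version string into (tag, count, sha).

    Returns None when the string does not have the assumed layout,
    for example when it is a bare tag with no commit suffix.
    """
    match = DESCRIBE_RE.match(version.strip())
    if match is None:
        return None
    return match.group("tag"), int(match.group("count")), match.group("sha")


if __name__ == "__main__":
    # The string reported in the meeting.
    print(parse_describe("BP-1.8.x-10-ge0c5a"))
```

The tag part only names the nearest reachable tag, so it does not by itself say which branch was built; the commit part is what has to be compared with the expected branch head.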
[15:51:10] <meffie> ok
[15:51:15] <wiesand> Mike: Sorry, yes I read it and I may have users willing to test, but haven't approached them yet.
[15:51:25] <meffie> great!
[15:52:28] <meffie> marcio is running it now, in a vm.
[15:52:55] <mvita> confirmed: Daily and Daily-1.8 are both building master
[15:53:13] <meffie> ok, that's on me. i'll take a look later today.
[15:53:48] <meffie> must be an incorrect branch option in the buildbot config.
[15:53:54] <wiesand> Relief :)
[15:55:06] <mvita> excellent catch, Stephan, thank you!
[15:55:08] <wiesand> Sorry I wasted so much time on OT items today! I think we're finished with 1.6.x for today.
[15:55:26] <kadukoafs@gmail.com/barnowl772461F2> I don't know that there's a whole lot of 1.8/master to talk about.
[15:55:29] <wiesand> Are there other topics? Master? 1.8.x?
[15:55:32] <meffie> yes, nice catch on the buildbot (/me hangs head in shame)
[15:55:38] <kadukoafs@gmail.com/barnowl772461F2> (But I do have some)
[15:56:07] <wiesand> Go ahead please
[15:56:08] <meffie> (i have one minor thing)
[15:56:19] <kadukoafs@gmail.com/barnowl772461F2> I got to look more at the rx events -- there's 9 in total, of which 7
have a reference on the connection/call to correspond to the event
tree/event handler.
[15:56:44] <mvita> I've often wished the build status screen showed the SHA1 of HEAD.
[15:57:02] <kadukoafs@gmail.com/barnowl772461F2> The outliers are conn->delayedAbortEvent and conn->challengeEvent
[15:57:27] <mvita> and/or the commit summary
[15:57:56] <meffie> mvita: i can add a build step to do a git log -n1 (like our internal buildbot)
[15:58:12] <mvita> how are they outliers, Ben?  No reference count?
[15:58:55] <kadukoafs@gmail.com/barnowl772461F2> They both seem unlikely to have raced in practice, but in theory
could.
So, it seems that we should be able to add such connection references
for those events, and then go over the events again to ensure that all
usage complies with the locking and reference-counting requirements.
(The latter includes not trying to release a reference corresponding
to a cancelled event when the rxevent_Cancel() call did not cancel the
event.)
[15:59:06] <kadukoafs@gmail.com/barnowl772461F2> Outliers for not holding a reference on the connection, right.
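The rule described above (a pending event holds a reference on its connection, and the canceller may drop that reference only if the cancel call really removed the event) can be illustrated with a toy model. This is a sketch in Python, not OpenAFS code: the classes `Conn` and `Event` and the functions `event_cancel`, `event_fire` and `cancel_challenge` are hypothetical stand-ins and do not reproduce the real rx signatures.

```python
class Conn:
    """Toy stand-in for a connection; it only tracks a reference count."""

    def __init__(self):
        self.refcount = 1            # the owner's own reference
        self.challenge_event = None  # pending event, if any


class Event:
    """Toy pending event that holds one reference on its connection."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = True
        conn.refcount += 1           # reference taken on behalf of the event


def event_cancel(event):
    """Return True only if this call removed a still-pending event."""
    if event is None or not event.pending:
        return False                 # never scheduled, or already fired
    event.pending = False
    return True


def event_fire(event):
    """The handler runs: it consumes the event and drops the event's reference."""
    if event.pending:
        event.pending = False
        event.conn.refcount -= 1


def cancel_challenge(conn):
    """Cancel the event; release its reference only if the cancel succeeded."""
    event = conn.challenge_event
    conn.challenge_event = None
    if event_cancel(event):          # the return value decides who releases
        conn.refcount -= 1


if __name__ == "__main__":
    # Normal case: the event is still pending when it is cancelled.
    conn = Conn()
    conn.challenge_event = Event(conn)
    cancel_challenge(conn)
    print("after cancel:", conn.refcount)

    # Raced case: the handler already ran and released the reference.
    raced = Conn()
    raced.challenge_event = Event(raced)
    event_fire(raced.challenge_event)
    cancel_challenge(raced)
    print("after fire, then cancel:", raced.refcount)
```

In the raced case, ignoring the return value of the cancel would release the same reference twice and drop the count below the owner's own reference.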
[15:59:26] <mvita> ok, tx
[15:59:27] <kadukoafs@gmail.com/barnowl772461F2> So hopefully there will be some stuff in gerrit this week
[15:59:48] <kadukoafs@gmail.com/barnowl772461F2> But I would probably want two non-me reviewers as a sanity check,
since this is somewhat invasive.
[16:00:23] <mvita> I'm familiar with the event plumbing, I'll review it when it's ready
[16:00:28] <kadukoafs@gmail.com/barnowl772461F2> This remains the only thing blocking 1.8.0, I think.  (Am I forgetting
something?)
[16:01:16] <kadukoafs@gmail.com/barnowl772461F2> Have we heard anything from Andrew about the rxgk patches?
[16:01:54] <mvita> he intends to get to them after he finishes up the project he and I are jointly working on at the moment
[16:02:01] <mvita> it seems to be winding down.
[16:02:06] <kadukoafs@gmail.com/barnowl772461F2> *nods*, good to know
[16:02:32] <kadukoafs@gmail.com/barnowl772461F2> mvita: thanks in advance for the rx event review :)
[16:02:44] <kadukoafs@gmail.com/barnowl772461F2> meffie: what's your minor thing?
[16:03:29] <meffie> ah, i dusted off my patch set to convert xstat/gtx/fsprobe/afsmonitor/scout to pthreads.
[16:03:35] <kadukoafs@gmail.com/barnowl772461F2> Yay!
[16:03:42] <meffie> https://github.com/meffie/openafs/commits/meffie/pthreaded-xstat/4
[16:03:49] <meffie> it's just about ready to push to gerrit.
[16:03:56] kadukoafs@gmail.com/barnowl772461F2 has an itchy trigger finger to 'git rm src/lwp'
[16:04:05] <meffie> amen
[16:04:59] <kadukoafs@gmail.com/barnowl772461F2> (tangent: I've been talking to a guy wanting to build openafs for
netbsd arm BE, or something like that, and the lwp stuff is a bit
problematic)
[16:05:22] <meffie> lwp stuff must die.
[16:05:55] <meffie> anyway, just want to report progress.
[16:06:00] <kadukoafs@gmail.com/barnowl772461F2> Thanks!
[16:06:10] <kadukoafs@gmail.com/barnowl772461F2> Any other topics for this meeting?
[16:06:33] <meffie> none from me.
[16:07:23] <mvita> I'm out.
[16:07:44] <kadukoafs@gmail.com/barnowl772461F2> Alright, let's call it, then.
Thanks for being here!
[16:07:57] <wiesand> Let's adjourn then. Thanks a lot everyone!
[16:08:06] <meffie> thanks, have a good one. hope BER is done soon.
[16:09:25] <wiesand> I hope BER will be killed for good ;-)
[16:09:28] wiesand leaves the room
[16:12:37] <mvita> good grief - NO ONE checks the return code from rxevent_Cancel!
[16:14:24] meffie leaves the room
[16:16:57] <kadukoafs@gmail.com/barnowl772461F2> Yup!
[16:17:48] <kadukoafs@gmail.com/barnowl772461F2> Several of the event handlers also don't check if the
conn/call->myevent is NULL before rxevent_Put()ing it.
[17:01:45] meffie joins the room
[20:23:39] meffie leaves the room
[21:20:09] meffie joins the room
[21:21:01] meffie leaves the room