release-team@conference.openafs.org
Friday, February 14, 2020
wiesand has set the subject to: weekly openafs release team meeting

GMT+0
[07:01:37] kaduk@jabber.openafs.org/barnowl joins the room
[14:54:55] cwills joins the room
[16:34:13] yadayada joins the room
[16:34:17] meffie joins the room
[16:35:32] <yadayada> Hello all, sorry I missed last month's meetings. I was mostly on vacation due to some personal work.
[16:39:08] <meffie> welcome back
[16:56:08] wiesand joins the room
[16:56:44] wiesand has set the subject to: wow, what a crowd today
[16:57:39] <wiesand> good morning USA, good night India
[16:59:26] <meffie> good day
[17:00:59] <yadayada> good day Stephan
[17:01:15] <wiesand> Ok, thanks to your reviews I was able to merge away most of the 1.8.x backlog :)
[17:02:43] <yadayada> I have tried 14062 and it works perfectly fine on Catalina. Gave +1 on review.
[17:03:08] <wiesand> What's left is the fix in 14062. Once that has landed on master, I can merge the "Catalina" stack.
[17:04:01] <wiesand> And I'm currently working on the 1.8.6 release notes. Expect a NEWS push soon.
[17:06:03] <wiesand> The other thing I'd like to have in 1.8.6pre1, but I won't block for it, is 14063 (Linux 5.6). Thanks a lot for putting that up so swiftly!
[17:06:59] <meffie> excellent! thank you
[17:07:51] <yadayada> I can work on reviewing 14063.
[17:08:20] <wiesand> 14063 may be deferred to pre2 if everything else is ready for pre1, but I think it should be part of the 1.8.6 release
[17:08:36] <wiesand> yadayada: thanks!
[17:09:05] <meffie> thank you yadav
[17:09:53] <wiesand> And less than 10 minutes into today's meeting, I'm out of topics to discuss :)
[17:10:44] <meffie> and i had a snack today so i could sit through a long meeting!
[17:10:56] <wiesand> ;-)
[17:11:37] <wiesand> NB buildbot has been very reliable and the windows builders reasonably fast this week
[17:11:51] <meffie> good news, thanks.
[17:11:59] <wiesand> Whatever magic you conjured after my whining last week, it sure helped a lot!
[17:13:10] <yadayada> just an update from my side. Recently in our production env we hit lots of meltdown scenarios with 1.8 DAFS servers. Sent some analysis to Mike and Andrew. Still digging into the core, but it looks like some refactoring surely needs to be done in the way we currently have MultiBreakCallBack. I will send some more thoughts on it in the coming weeks
[17:14:16] <meffie> i made a change to limit which ones are gerrit builders (suggested by ben) and Alejandro gave me a pull request to limit builds on the same hypervisor
[17:15:06] <wiesand> yadav: thanks, even though that's not entirely good news
[17:15:07] <meffie> thanks yadayada
[17:15:31] <wiesand> BTW: we had to relocate a couple of servers lately
[17:16:22] <wiesand> So I moved all volumes off them before that happened to provide our users uninterrupted access to their data
[17:17:13] <wiesand> This generally went smoothly, but I made a slightly disturbing observation again:
[17:17:52] <wiesand> If you keep an evacuated file server running "long enough", everything is fine.
[17:18:36] <meffie> i think 4 hours is the "long enough" if i recall.
[17:18:39] <wiesand> But if you shut it down too quickly, clients will fail with "connection timeouts".
[17:19:15] <wiesand> 4h fits my observation that 1d is "long enough"
[17:19:25] <wiesand> My question is: how come?
[17:19:51] <wiesand> Or rather: could that be improved?
[17:20:18] <meffie> if the server is still up, the clients get a VMOVED error when the client's vl cache still shows the volume on the old server
[17:21:11] <wiesand> yes, makes sense
[17:22:03] <meffie> after "a long time" the vl caches can expire, so the clients know to ask the vlserver for the new location before trying the old server
[17:22:09] <wiesand> if a client rechecked the volume location before giving up with a connection timeout, what would be the possible drawbacks?
[17:23:14] <meffie> i think the drawback would be excessive load on the vlserver. if you DDOS your vlserver it can take the whole cell out.
[17:23:45] <meffie> the afsd --volume-ttl is meant to help improve the situation.
[17:24:23] <meffie> you can lower the time after which the vl cache entries expire
[17:24:46] <meffie> regardless of volume releases or fileserver connectivity
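The expiry behaviour meffie describes can be pictured with a toy client-side cache. This is only a sketch under stated assumptions, not OpenAFS code: `VolumeLocationCache`, `lookup`, and `ttl_seconds` are hypothetical names, and `lookup` stands in for whatever query the client sends to the vlserver. Entries are reused until they are older than the TTL, regardless of whether the file server is reachable.

```python
import time


class VolumeLocationCache:
    """Toy cache of volume name -> file server with a fixed time-to-live."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # hypothetical stand-in for a vlserver query
        self._ttl = ttl_seconds    # how long a cached location is trusted
        self._clock = clock        # injectable so the expiry can be tested
        self._entries = {}         # volume name -> (server, time cached)

    def server_for(self, volume):
        now = self._clock()
        entry = self._entries.get(volume)
        if entry is not None and now - entry[1] < self._ttl:
            # Entry is still fresh: answer from the cache, no vlserver traffic.
            return entry[0]
        # Entry is missing or expired: ask again and remember the answer.
        server = self._lookup(volume)
        self._entries[volume] = (server, now)
        return server
```

With a shorter TTL the client notices a moved volume sooner, at the cost of more lookups; that is the trade-off against vlserver load raised above.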
[17:25:09] <wiesand> guessed that "DDOS on vlserver" reason :-(
[17:25:25] <wiesand> but I'll have a look at --volume-ttl
[17:26:41] <meffie> the vlserver does not have callbacks like the fileserver, so it takes more time for changes to propagate.
[17:27:12] <wiesand> Thanks a lot for the insight!
[17:27:49] <meffie> no problem. if you want, i can ask mark and andrew to send some info. they could explain better than i.
[17:29:07] <kaduk@jabber.openafs.org/barnowl> If each client rate-limits its vldb "re-lookup"s to, say, 1 per 5
minutes, that would probably be enough to avoid vlserver meltdown
[17:29:37] <meffie> welcome ben!
[17:29:43] <wiesand> Don't bother them, but thanks. I expected something along these lines - if this problem was trivial to solve, it probably would have been solved long ago.
[17:29:44] <kaduk@jabber.openafs.org/barnowl> (We'd also have to make sure we don't hit an infinite loop in the
"fileserver down" case, as we timeout, re-lookup, and try again to
connect, though there are several ways to do that.)
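Ben's two points (a per-client rate limit on re-lookups, and no endless timeout/re-lookup/retry cycle when the file server really is down) can be sketched as follows. This is a minimal illustration under assumptions, not how the OpenAFS cache manager works: `RelookupLimiter`, `fetch`, `connect`, `lookup`, and `ConnectionTimeout` are all hypothetical names, and the 300-second interval is simply the "1 per 5 minutes" figure from the discussion.

```python
import time


class ConnectionTimeout(Exception):
    """Raised by the hypothetical connect() when a file server does not answer."""


class RelookupLimiter:
    """Allows at most one location re-lookup per min_interval seconds."""

    def __init__(self, min_interval=300.0, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._last = None  # time of the last permitted re-lookup

    def allow(self):
        now = self._clock()
        if self._last is not None and now - self._last < self._min_interval:
            return False
        self._last = now
        return True


def fetch(volume, cached_server, connect, lookup, limiter):
    """Try the cached server; on timeout, re-look-up once and retry once."""
    try:
        return connect(cached_server, volume)
    except ConnectionTimeout:
        if not limiter.allow():
            raise  # re-looked-up too recently: report the timeout
        new_server = lookup(volume)
        if new_server == cached_server:
            raise  # location unchanged: the server is down, do not loop
        # Exactly one retry against the new location; a second timeout
        # propagates to the caller instead of triggering another lookup.
        return connect(new_server, volume)
```

The single retry is one of the "several ways" to avoid the loop; a retry counter or a per-volume backoff would be alternatives.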
[17:29:51] <wiesand> Hi Ben :)
[17:29:57] <kaduk@jabber.openafs.org/barnowl> Hi :)
[17:30:23] <kaduk@jabber.openafs.org/barnowl> I will have to duck out again in a few minutes, though
[17:31:00] <wiesand> Any chance you can just merge 14062 and 14063 in the meantime? ;-)
[17:31:08] <meffie> :)
[17:31:31] <wiesand> "always take your chances…"
[17:31:38] <cwills> As soon as 14063 is merged, I'll push a 1.8.x version of it
[17:31:50] <kaduk@jabber.openafs.org/barnowl> I will probably get to do so within the hour
[17:31:57] <wiesand> seriously though: no pressure!!!
[17:32:15] <meffie> welcome cwills
[17:32:17] <kaduk@jabber.openafs.org/barnowl> (within "an" hour?  I guess the former construction could imply
hour-alignment.)
[17:32:54] <wiesand> relax, enjoy your vacation…
[17:33:37] <kaduk@jabber.openafs.org/barnowl> okay :)
[17:35:17] <wiesand> cheyenne: will a 1.8.x version of 14063 look substantially different from 14063 itself? I hoped for it to be a clean cherry-pick?
[17:35:45] <cwills> I did a quick check and it cleanly cherry-picks into 1.8.x
[17:35:52] <wiesand> :)
[17:36:35] <wiesand> Are there more dark clouds on the Linux 5.6 horizon yet?
[17:37:03] <cwills> I don't think so.  I try to do a weekly build against the latest kernel
[17:37:35] <meffie> this fix was for 5.6rc1, correct?
[17:37:43] <cwills> yes
[17:37:44] <wiesand> Great. Thanks a lot!
[17:38:41] <cwills> The build I do is a "vanilla" copy of the current tag in the Linux git repo
[17:39:29] <wiesand> So, to wrap up, we're homing in on 1.8.6pre1, probably even with Linux 5.6 support.
[17:39:57] <meffie> nice!
[17:40:27] <wiesand> Next steps: get the remaining 1 or 2 changes merged, finalize the release notes, then get out 1.8.6pre1.
[17:41:10] <wiesand> I have a little bit of hope that I can even look at 1.6.x again in between.
[17:41:24] <wiesand> But my priority is on 1.8.x.
[17:41:57] <meffie> just to mention here, the 2020 workshop site is up, and is now properly hosted at grand.central.org.  CFP ends tomorrow! https://workshop.openafs.org/
[17:42:24] <meffie> wiesand: any chance you could merge 14067 for the web update?
[17:44:28] <meffie> (more talks will be posted after the cfp ends)
[17:45:49] <wiesand> At this point, yes, probably. Note we declined similar changes by others in the past… I always hoped we could get a list of all those workshops and conferences together. Moving obsolete links there rather than just deleting them would be a lot easier.
[17:47:03] <meffie> the link actually is the same. it's just text that changed. but yes, we could do something like that in another commit.
[17:47:34] <wiesand> I actually had such a list, fairly complete, at some point. As always: if you don't get it done completely, you get to do it all again from scratch :-(
[17:49:24] <wiesand> I'll look at 14067 tomorrow.
[17:49:30] <meffie> ok, thank you!
[17:50:29] <wiesand> Will remote participation be possible? (worked great last year)
[17:51:34] <meffie> I can put a list of passed workshops on workshop.openafs.org/ that's a good idea.  i dont know if remote participation is planned.
[17:51:39] <wiesand> (though the time shift takes its toll, whether through jet lag or not)
[17:51:59] <meffie> s/passed/pass/
[17:52:15] <wiesand> past?
[17:52:23] <meffie> er, yes past :)
[17:52:39] <meffie> workshops gone by.
[17:52:58] <wiesand> time for lunch or another snack I guess ;-)
[17:53:16] <meffie> indeed!
[17:53:31] <wiesand> Anything on 1.9/master? Or should we adjourn for today?
[17:53:44] <yadayada> nothing from my side
[17:54:22] <meffie> just gerrits to update and review review review
[17:54:43] <wiesand> 14062&3 ;-)
[17:55:03] <wiesand> Let's adjourn then. Thanks a lot everybody!
[17:55:27] <yadayada> Thanks !!
[17:55:40] <meffie> have a good weekend
[17:55:48] meffie leaves the room
[17:56:11] wiesand leaves the room
[18:33:03] yadayada joins the room
[18:49:05] yadayada leaves the room
[19:47:39] meffie joins the room
[20:03:26] yadayada leaves the room
[20:30:58] meffie leaves the room
[22:14:04] yadayada joins the room
[22:49:58] yadayada joins the room
[23:05:55] yadayada leaves the room
[23:21:51] yadayada leaves the room