release-team@conference.openafs.org
Wednesday, November 5, 2014

GMT+0
[01:08:26] Jeffrey Altman leaves the room
[03:16:06] Jeffrey Altman joins the room
[13:03:17] wiesand joins the room
[14:33:09] shadow@gmail.com/barnowlE5B64A04 leaves the room
[14:33:16] shadow@gmail.com/barnowlE5B64A04 joins the room
[14:37:35] meffie joins the room
[14:54:24] <wiesand> Tried a test build of 1.6.10 + these: Simple fix: 11551
11358 regression: 11558 11568
Linux 3.17: 11549 11550
aklog/524: 11538
Linux 3.18: 11569 11570
Yosemite: 11571 11572
[14:54:43] <wiesand> Works on SL6. Reliably panics on SL5.
[14:57:46] <shadow@gmail.com/barnowlE5B64A04> woot
[14:58:04] <shadow@gmail.com/barnowlE5B64A04> i assume you get a backtrace
[14:58:28] <wiesand> Kind of... give me a second
[14:58:38] kaduk joins the room
[14:59:59] <wiesand> http://www.zeuthen.desy.de/~wiesand/1611.jpg
[15:02:16] <wiesand> I'll bisect it. I bet it's one of the Linux 3.18 changes or 11558/11568 . I think I had the Linux 3.17 changes running on EL5 without problems a while ago.
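The bisection wiesand proposes here can be sketched as a binary search over the stack of changes in the test build. This is only an illustration: `first_bad` and `fake_panics` are hypothetical names, the ordering simply follows the list posted above, and the stand-in predicate replaces the real (slow) step of building, booting the SL5 machine and logging in over ssh. It also assumes the changes can be applied as a prefix of that list and that the panic persists once the offending change is in.

```python
from typing import Callable, Sequence

# Change numbers in the order they were listed for the test build.
CHANGES = ["11551", "11558", "11568", "11549", "11550",
           "11538", "11569", "11570", "11571", "11572"]

def first_bad(changes: Sequence[str],
              panics: Callable[[Sequence[str]], bool]) -> str:
    """Return the first change whose inclusion makes the build panic.

    Assumes the base with no changes applied is good and the full
    stack is bad.
    """
    lo, hi = 0, len(changes)   # prefix of length lo is good, length hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if panics(changes[:mid]):
            hi = mid
        else:
            lo = mid
    return changes[hi - 1]

# Hypothetical stand-in for "build, boot, ssh in, see whether it panics".
def fake_panics(applied: Sequence[str]) -> bool:
    return "11570" in applied

culprit = first_bad(CHANGES, fake_panics)
```

With ten changes this needs three or four test boots rather than ten, which matters when each boot of the test machine takes a long time.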
[15:03:09] <wiesand> Hello. I'm glad some of you are here despite the late invitation. Sorry.
[15:03:28] <kaduk> In the absence of data, it's easiest to assume that there will be people here to talk with.
[15:04:07] <wiesand> Rituals are good, sometimes.
[15:05:01] <wiesand> I guess we already know that some more work is needed on the more interesting changes foreseen to go into 1.6.11.
[15:06:14] <wiesand> As I said, I'll try to pinpoint the change causing the panics. Alas, I'm pretty busy these days...
[15:06:56] <wiesand> Ben, any chance you could push an updated 11538 yourself?
[15:07:29] <kaduk> I probably could.
It is more likely if someone sends me an email reminder to do so.
[15:08:51] <wiesand> Gerrit should just have done this ;-)
[15:08:52] <kaduk> Hmm, that panic might plausibly be related to some of the stack-reduction work?  Hm, but you said that you had been running things with that which worked.
[15:09:45] <meffie> i was just wondering the same. in init-req?
[15:09:56] <wiesand> It was running plain 1.6.10 before, and there were no crashes. It's not heavily used though.
[15:10:11] <wiesand> But I can check.
[15:11:02] <wiesand> With the current test build, it survives 'ls -l /afs/my.cell' w/o token, but crashes as soon as I log in with ssh/gssapi (~ is in AFS).
[15:11:42] <kaduk> Hmm.
[15:11:57] <wiesand> Not much data, I know.
[15:12:23] deason joins the room
[15:15:01] <kaduk> I think I have a couple other 1.6.11 candidates, let me pull them up...
[15:16:17] <kaduk> (Things that I pulled in as patches for debian, or that Anders pulled in as patches for the Ubuntu PPA)
[15:17:01] <wiesand> 1.6.11 should really have a limited scope. Fix the regression, support recent linux and osx, fix *critical* bugs that haven't been around forever, and zero-risk stuff.
[15:17:05] <kaduk> blerg, gerrit is being slow
[15:17:11] <wiesand> Anything else I'd like to defer to 1.6.12
[15:18:03] <kaduk> Fair enough.  "Maybe these are 1.6.12 candidates, then"
[15:18:36] <wiesand> Pulling them up is fine though.
[15:18:40] <meffie> that sounds good. the two commits i pushed to the 1.6.x branch last night were things i just don't want to miss again.
[15:18:53] <kaduk> Ugh, I think I'm going to have to bounce gerrit.
[15:19:40] <wiesand> It seems so...
[15:20:37] <meffie> the internal server failed, according to the error message.
[15:20:53] <kaduk> restart is in progress...
[15:21:43] <kaduk> Well, one of my three debian patches was to work around RT 131943, for which I haven't concocted a decent patch yet, so I'll probably just have to keep that one for now.
[15:22:30] <kaduk> 11436 is a pullup of 9986, which should fix an actual crash which was reported as a debian bug.
[15:22:55] <kaduk> but I guess that has been a bug in our code "forever" by your metric, and is only exposed by ~recent glibc updates.
[15:24:05] <kaduk> Oh, actually the crash fix is both 11436 and a pullup of 11453
[15:24:20] <kaduk> Apparently I haven't submitted that pullup to gerrit, though :(
[15:24:56] <wiesand> It's never too late :-)
[15:25:14] <deason> that crash that wiesand posted earlier; what code is that running? what changes on top of 1.6.10?
[15:25:16] <kaduk> Yeah.  Still kind of in a flurry, though (woke up a little late).
[15:25:33] <wiesand> Andrew: that should be in the scrollback
[15:25:59] <wiesand> Simple fix: 11551
11358 regression: 11558 11568
Linux 3.17: 11549 11550
aklog/524: 11538
Linux 3.18: 11569 11570
Yosemite: 11571 11572
[15:26:04] <kaduk> Anders' changes were the debian ones and also 11558, 11559, 11549, 11550, 11562, and 11563, which have (all? mostly?) been mentioned already.
[15:30:01] <wiesand> If one of them crashes EL5 but not EL6, it's likely that it won't crash debian either.
[15:31:31] <wiesand> I took a shortcut in 11571. Since this one has to be edited anyway after cherry-picking, I added the whitespace fixes from 11556.
[15:31:48] <wiesand> Is that acceptable, or should I redo it the clean way and wait for 11566 to be merged?
[15:32:38] <kaduk> If it contains the logical contents of two changes, it should probably get the "cherry-picked" lines from both of them, so our tooling can try to track that.
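A minimal sketch of what tracking those lines could look like, assuming they are the "(cherry picked from commit ...)" lines that `git cherry-pick -x` appends to a commit message. The helper name `cherry_pick_sources`, the sample message and both commit IDs are hypothetical placeholders, not taken from the actual changes or from the project's tooling.

```python
import re
from typing import List

# Matches the line that "git cherry-pick -x" appends to a commit message.
CHERRY_RE = re.compile(
    r"^\(cherry picked from commit ([0-9a-f]{7,40})\)$", re.MULTILINE
)

def cherry_pick_sources(message: str) -> List[str]:
    """Return every upstream commit ID recorded in a commit message."""
    return CHERRY_RE.findall(message)

# Hypothetical message for a change that combines two upstream commits.
message = """Combined pullup (example only)

(cherry picked from commit 1111111111111111111111111111111111111111)
(cherry picked from commit 2222222222222222222222222222222222222222)
"""

sources = cherry_pick_sources(message)
```

A change that folds in two upstream commits but carries only one such line would look, to a script like this, as if the second commit had never been pulled up.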
[15:33:36] <wiesand> I can redo it. Now that 1.6.11pre1 is delayed for other reasons, no problem.
[15:34:01] <deason> if you mean for actually pulling into 1.6.x, I don't see why we wouldn't just want to keep them as separate changes
[15:34:30] <wiesand> It wasn't merged on master, and I hoped we could have a test pre1 today.
[15:34:55] <deason> and sorry, I don't have enough scrollback above, but you mention bisecting that panic, so I assume you can reproduce it?
[15:35:31] <wiesand> 2 out of 2 so far.
[15:35:40] <wiesand> Let me reset the system once more...
[15:37:19] <wiesand> Jeffrey pushed an updated 11528. Thanks.
[15:37:26] <wiesand> 538
[15:38:12] <kaduk> (My missing cherry-pick is 11588)
[15:39:51] <wiesand> Thanks.
[15:40:04] <wiesand> [building with the 3.18 changes reverted]
[15:41:20] <wiesand> 3 out of 3.
[15:42:21] <deason> is there a more complete backtrace or a coredump for it?
[15:42:25] kaduk leaves the room
[15:42:25] <wiesand> I think it's getcwd related... let me try again...
[15:42:30] kaduk joins the room
[15:42:32] <wiesand> Not easily
[15:44:54] <wiesand> Unfortunately, the test system is real server hardware. Booting takes ages.
[15:45:17] <deason> I don't suppose you have a convenient place to push the git tree you're using? just to save a few minutes if I don't have to pick each patch
[15:45:39] <wiesand> np, hold on
[15:47:19] <wiesand> http://www.zeuthen.desy.de/~wiesand/openafs-1.6.11pre0-src.tar.bz2
[15:47:46] <deason> ah, I meant like 'git push'ing it somewhere, but that works too :)
[15:47:47] <kaduk> I think Andrew wanted a git repo, not a tarball.
[15:48:07] <wiesand> It's not the git tree, but should suffice to verify the problem
[15:49:00] <kaduk> 'tis true
[15:50:48] <wiesand> oa1105.tar.bz2 in the same location is the complete tree. 150 MB.
[15:51:24] <wiesand> use 1e6e2ad
[15:53:14] <wiesand> 4 out of 4 :)
[15:56:13] <wiesand> It's a bit strange. I got a token for that ~ as root, could ls -l it, cd into it, pwd, no problem. But as soon as I log in with that account by ssh, boom.
[15:57:39] <kaduk> different PAM modules running, probably, but still worrisome.
[15:59:19] shadow@gmail.com/barnowlE5B64A04 leaves the room
[15:59:30] shadow@gmail.com/barnowlE5B64A04 joins the room
[16:02:27] <wiesand> Reverting the 3.18 changes fixes it.
[16:03:36] <wiesand> 11570?
[16:04:23] <deason> damn, I was just about to guess that
[16:04:41] <deason> check your config.log for checking the 'match' op?
[16:06:13] <deason> in config.log "checking for match in struct key_type"
[16:06:44] <wiesand> I have to rerun configure
[16:09:32] <wiesand> configure:11650: checking for match in struct key_type
make -C /usr/src/kernels/2.6.18-371.12.1.el5-x86_64 M=/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir modules KBUILD_VERBOSE=1
make: Entering directory `/usr/src/kernels/2.6.18-371.12.1.el5-x86_64'
test -e include/linux/autoconf.h -a -e include/config/auto.conf || (            \
        echo;                                                           \
        echo "  ERROR: Kernel configuration is invalid.";               \
        echo "         include/linux/autoconf.h or include/config/auto.conf are missing.";      \
        echo "         Run 'make oldconfig && make prepare' on kernel src to fix it.";  \
        echo;                                                           \
        /bin/false)
mkdir -p /tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/.tmp_versions
rm -f /tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/.tmp_versions/*
make -f scripts/Makefile.build obj=/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir
  gcc -Wp,-MD,/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/.conftest.o.d  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include -D__KERNEL__ -Iinclude  -include include/linux/autoconf.h  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -fno-delete-null-pointer-checks -fwrapv -Os  -mtune=generic -m64 -mno-red-zone -mcmodel=kernel -pipe -fno-reorder-blocks -Wno-sign-compare -fno-asynchronous-unwind-tables -funit-at-a-time -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fomit-frame-pointer -g  -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign    -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(conftest)"  -D"KBUILD_MODNAME=KBUILD_STR(conftest)" -c -o /tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/.tmp_conftest.o /tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.c
make: Leaving directory `/usr/src/kernels/2.6.18-371.12.1.el5-x86_64'
/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.c:42:28: error: linux/key-type.h: No such file or directory
/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.c: In function 'conftest':
/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.c:46: error: storage size of '_test' isn't known
/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.c:46: warning: unused variable '_test'
make[1]: *** [/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir/conftest.o] Error 1
make: *** [_module_/tmp/rpmb.sw/BUILD/openafs-1.6.11pre0/conftest.dir] Error 2
[16:11:21] <wiesand> Is it the missing key-type.h ?
[16:11:27] <wiesand> EL6 has it.
[16:11:42] <wiesand> EL5 doesn't.
[16:11:57] <deason> well, first, the config test for the 'match' op is not written "correctly"
[16:12:34] <deason> since the 'new' behavior should be indicated by success, and not failure; I think I can fix that (so this problem appears as a build issue instead of as a panic)
[16:15:05] <deason> but yeah, I assume this is due to some difference in how rhel5 pulled up some kernel change; but I'm checking
[16:17:15] <deason> okay, yeah, key.h; that would be annoying to deal with in the config test, but I don't think we'll need to if we make the test look for key_match_data.cmp instead
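deason's point about the direction of the test can be modeled in a few lines. This is a toy model, not the project's configure code: `KernelTree`, `compiles` and the three tree descriptions are simplified assumptions, reduced to the facts in the log above (EL5 has no linux/key-type.h, EL6 does) plus the idea that only the newer keyring API provides key_match_data.cmp.

```python
from typing import FrozenSet, NamedTuple, Tuple

class KernelTree(NamedTuple):
    """Simplified stand-in for a kernel source tree (illustrative only)."""
    headers: FrozenSet[str]
    members: FrozenSet[Tuple[str, str]]   # (struct, member) pairs

def compiles(tree: KernelTree, header: str, struct: str, member: str) -> bool:
    """Model a configure compile test: it passes only if the header exists
    and the struct has the member. Anything else counts as a failure."""
    return header in tree.headers and (struct, member) in tree.members

def new_api_if_match_missing(tree: KernelTree) -> bool:
    # Direction criticized in the chat: a failed probe for key_type.match
    # is read as "new keyring API".
    return not compiles(tree, "linux/key-type.h", "key_type", "match")

def new_api_if_cmp_present(tree: KernelTree) -> bool:
    # Direction deason suggests: probe for key_match_data.cmp and treat
    # success as "new keyring API".
    return compiles(tree, "linux/key-type.h", "key_match_data", "cmp")

EL5 = KernelTree(frozenset({"linux/key.h"}),
                 frozenset({("key_type", "match")}))
EL6 = KernelTree(frozenset({"linux/key.h", "linux/key-type.h"}),
                 frozenset({("key_type", "match")}))
LINUX_3_18 = KernelTree(frozenset({"linux/key.h", "linux/key-type.h"}),
                        frozenset({("key_match_data", "cmp")}))

negative = [new_api_if_match_missing(t) for t in (EL5, EL6, LINUX_3_18)]
positive = [new_api_if_cmp_present(t) for t in (EL5, EL6, LINUX_3_18)]
```

In this model the negative test classifies EL5 as "new" only because the header is missing, which is the kind of misdetection that would show up at run time as a panic instead of at build time. The positive test cannot be fooled by an unrelated compile failure, since any failure falls back to the old behavior.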
[16:19:19] <kaduk> Do we think we know what 1.6.11 will look like, then?
[16:21:15] <wiesand> I think so. Like my test build, plus anything you succeed in talking me into accepting ;-)
[16:23:21] <kaduk> I guess I will tolerate the AFSDIR_PATH_MAX crashes until 1.6.12, so don't look for anything from me there.
[16:23:35] <kaduk> I did fix a couple of typos in Jeff's 11538 update and +1'd it.
[16:23:47] <wiesand> Thanks.
[16:26:29] shadow@gmail.com/barnowlE5B64A04 leaves the room
[16:26:31] <wiesand> So I'll redo 11571, hope for a fix for 11570, test again. And if no panics, we'll try to call it pre1?
[16:26:31] shadow@gmail.com/barnowlE5B64A04 leaves the room
[16:26:40] shadow@gmail.com/barnowlE5B64A04 joins the room
[16:26:46] <kaduk> While working on the Unix QuickStart Guide, I noticed that master's asetkey list doesn't know about the new key types (i.e., rxkad_krb5); I added that as a todo item for the 1.8 branch
[16:27:24] <wiesand> Required before branching?
[16:27:55] <kaduk> It's impossible to know if you have the proper keys in the proper places without, like, hexdump, at present.
[16:28:10] <kaduk> That's a pretty bad upgrade experience to give to people.
[16:28:16] <wiesand> Right.
[16:28:55] <kaduk> Also, Daria merged a bunch of bits that had been sitting around for a while, and now I'm wondering if we want to use -1 and/or -2 review in gerrit to mark things submitted against master as "not for 1.8", as is sometimes done for 1.6 things "not for 1.6.[latest]".
[16:30:10] <wiesand> You mean things not yet merged on master?
[16:30:23] <kaduk> right
[16:30:48] <wiesand> Sounds reasonable, but master is not my turf ;-)
[16:30:48] <Jeffrey Altman> hello. if anyone has something they think should not be in 1.8, then send to gatekeepers and one of us will -2 it
[16:31:13] <Jeffrey Altman> and of course if there is something that should be, comment that as well
[16:31:35] <wiesand> NB this project seems to be running low on gatekeepers...
[16:32:02] <kaduk> I'm thinking along the lines of both Daria and I are probably going to try and triage master changes, and it would be nice to avoid duplication of work.
[16:32:33] <kaduk> We seem to still be sending mail to Ken when new changes are uploaded...
[16:32:58] <Jeffrey Altman> Ken can unsubscribe himself in Gerrit if he wishes
[16:34:56] <deason> wiesand: 11589 if you want to try, but it's more likely to break the other case (newer linux, 3.18)
[16:35:36] <wiesand> Anyway, thanks. Let's see what Marc and Anders think too. But I'll test it on top of 11570.
[16:36:09] <wiesand> Ben, anything else to discuss regarding 1.8?
[16:37:08] <kaduk> I don't think there's anything new.
[16:37:27] <wiesand> Ok. Anything else to discuss today?
[16:37:36] <kaduk> Next week is the IETF meeting, so I may be otherwise occupied.
[16:37:51] <wiesand> Thanks for the heads up.
[16:38:39] <wiesand> Fine. Thanks a lot everyone!
[16:38:52] <wiesand> Bye.
[16:38:56] wiesand leaves the room
[16:39:45] kaduk leaves the room
[16:44:19] deason leaves the room
[16:58:55] meffie leaves the room
[17:22:49] kaduk joins the room
[18:31:04] <kaduk> 11546 should get merged once the buildbot chimes in; its absence is probably what's breaking other builds.
[19:54:59] Jeffrey Altman leaves the room
[19:58:46] Jeffrey Altman joins the room
[23:40:05] kaduk leaves the room