[00:03:39] --- abo has left
[00:11:37] --- abo has become available
[00:23:39] --- jaltman has left: Disconnected
[00:38:48] --- lars.malinowsky has become available
[00:43:31] --- mfelliott has left
[01:04:02] --- reuteras has left
[02:10:04] --- jaltman/FrogsLeap has left: Replaced by new connection
[02:10:05] --- jaltman/FrogsLeap has become available
[03:19:51] --- lars.malinowsky has left
[05:09:34] --- reuteras has become available
[05:53:06] --- jaltman/FrogsLeap has left: Disconnected
[05:53:09] --- steven.jenkins has become available
[05:53:15] --- jaltman/FrogsLeap has become available
[05:53:28] --- steven.jenkins has left
[07:03:03] --- phalenor has left
[07:06:00] --- jaltman/FrogsLeap has left: Replaced by new connection
[07:06:02] --- jaltman/FrogsLeap has become available
[07:06:30] --- phalenor has become available
[07:14:10] --- jaltman/FrogsLeap has left: Disconnected
[07:14:21] --- jaltman/FrogsLeap has become available
[07:25:57] --- reuteras has left
[07:38:15] --- reuteras has become available
[07:54:09] --- mfelliott has become available
[07:58:10] --- jaltman/FrogsLeap has left: Disconnected
[08:01:56] --- deason has become available
[08:13:48] --- reuteras has left
[09:03:50] --- Simon Wilkinson has become available
[09:10:29] --- rra has become available
[09:52:49] --- ksumner has become available
[10:40:51] Hm, Heimdal libkafs seems unhappy with a 1.6.0pre4 client. It returns false for k_hasafs().
[10:41:10] strace it?
[10:41:20] open("/proc/fs/openafs/afs_ioctl", O_RDWR|O_LARGEFILE) = 3
ioctl(3, CAPI_REGISTER or SNDCTL_COPR_LOAD, 0xbfdd7a9c) = -1 EINVAL (Invalid argument)
close(3) = 0
[10:41:24] And then it falls back on the other paths.
[10:43:42] i wonder what question it asks that it gets that back
[10:45:24] Yeah, this program prints out 0:
[10:45:26] #include
#include
#include
int main(void)
{
    printf("%d\n", k_hasafs());
    return 0;
}
[10:45:47] sure, but that says nothing of what k_hasafs does.
i am looking at that now
[10:45:53] I know, just confirming.
[10:45:58] Heimdal 1.4.0-5 in Debian.
[10:46:01] I'm taking a look.
[10:46:19] VIOCGETTOK
[10:47:11] Yeah, with all nulls in the input parameters.
[10:48:44] i bet the checks to verify input parameters screwed it.
[10:50:22] that happens if we have between 1 and 3 bytes of input, inclusive, looks like; I think the code expects to get 0 in that case, but for some reason it's not?
[10:50:24] I'm doing a VIOCSETTOK in my kafs library, and that seems to still work.
[10:50:48] VIOCSETTOK with all 0/NULL.
[10:51:09] The only difference seems to be that Heimdal is doing VIOCGETTOK instead.
[10:51:22] Does VIOCGETTOK barf if the output buffer is NULL?
[10:53:58] do you have tokens?
[10:55:36] apparently heimdal *wants* EFAULT and that's not what we return given there's no place to write. we can fix that
[10:56:22] ah, but it's the getXXX which return EINVAL, not putXXX
[10:59:08] I do have tokens.
[10:59:57] destroy them and try?
[11:00:31] No change.
[11:00:47] --- a/src/afs/afs_pioctl.c
+++ b/src/afs/afs_pioctl.c
@@ -2247,7 +2247,7 @@ DECL_PIOCTL(PGetTokens)
     newStyle = (afs_pd_remaining(ain) > 0);
     if (newStyle) {
         if (afs_pd_getInt(ain, &iterator) != 0)
-            return EINVAL;
+            return EFAULT;
     }
     i = UHash(areq->uid);
     ObtainReadLock(&afs_xuser);
[11:02:19] Ah, and my k_hasafs in my kafs library explicitly expects EINVAL in response to the SETTOK call.
[11:02:51] I wonder if the Heimdal approach is because some failures due to AFS not being present can return EINVAL. I see that it returns EINVAL if the ioctl path doesn't exist.
[11:03:12] EINVAL is the more correct errno to return, not that that helps.
[11:03:16] right, so we have to help heimdal i suspect. the patch proposed should work.
[11:14:23] why is afs_pd_remaining(ain) > 0, though? if we didn't pass in any data, shouldn't it be 0?
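The input check being discussed can be modelled in user space. The following is a minimal sketch, not the OpenAFS code: `pd_cursor`, `pd_remaining`, `pd_get_int`, and `get_tokens_input_check` are hypothetical names that only imitate the `afs_pd_remaining()` / `afs_pd_getInt()` calls visible in the patch, and the assumption that a short read fails when fewer than `sizeof(int)` bytes remain is taken from the "between 1 and 3 bytes" remark above. It shows only which errno the check would hand back for a given input length; as the later messages in this log report, changing that errno did not by itself fix k_hasafs().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's pioctl input cursor. */
struct pd_cursor {
    const unsigned char *ptr;   /* next unread byte */
    size_t remaining;           /* bytes left in the input buffer */
};

/* Number of unread bytes, in the spirit of afs_pd_remaining(). */
static size_t pd_remaining(const struct pd_cursor *pd)
{
    assert(pd != NULL);
    return pd->remaining;
}

/* Read one int from the cursor; returns nonzero if the buffer is too short. */
static int pd_get_int(struct pd_cursor *pd, int *out)
{
    assert(pd != NULL && out != NULL);
    if (pd->remaining < sizeof(*out))
        return -1;
    memcpy(out, pd->ptr, sizeof(*out));
    pd->ptr += sizeof(*out);
    pd->remaining -= sizeof(*out);
    return 0;
}

/*
 * Model of just the check at the top of the patched hunk.
 * short_input_errno is EINVAL for the code before the patch and
 * EFAULT for the code after it.  Returns 0 when the check passes.
 */
static int get_tokens_input_check(const void *in, size_t in_len,
                                  int short_input_errno, int *iterator)
{
    struct pd_cursor pd = { in, in_len };
    int new_style = (pd_remaining(&pd) > 0);

    *iterator = 0;
    if (new_style && pd_get_int(&pd, iterator) != 0)
        return short_input_errno;
    return 0;
}
```

In this model an empty input skips the read entirely, so the errno in the patch can only be reached when the caller passes a nonzero but too-short length. That is why a positive `afs_pd_remaining(ain)` is surprising if the probe really passes no input data.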
[11:19:50] --- phalenor has left
[11:19:52] --- phalenor has become available
[11:54:53] --- mdionne has become available
[11:59:27] --- jaltman/FrogsLeap has become available
[12:01:30] that's the only place i can see we'd be able to get EINVAL in that function tho
[12:01:55] i should be able to build heimdal now and find out
[12:04:56] yeah I know, but I wonder if something with ain is wrong or something
[12:05:48] and afs_HandlePioctl has a few other places that could return EINVAL first, so I dunno
[12:06:45] maybe HandlePioctl.
[12:13:54] --- mdionne has left
[13:11:22] Let's see if I can figure out how to apply a patch to a DKMS tree and have it do the right thing and then I'll test.
[13:14:46] I think that worked. Now as soon as the build finishes I'll be able to tell you if that fixes it.
[13:59:33] Hm. This still didn't work. Now is that because the change didn't fix it, or because I failed to actually build a new kernel module?
[13:59:41] i assume the former
[13:59:54] i will look into it for real when i finish the thing i am bashing at
[14:56:52] --- deason has left
[15:06:45] --- ksumner has left
[15:54:03] Is:
make[2]: *** No rule to make target `/home/eagle/dvl/openafs/lib/libcom_err.a', needed by `command-t'. Stop.
a known failure at the moment?
[15:54:49] uh. no?
[15:54:56] but i can guess. hang on
[15:55:12] I can push a patch for it.
[15:55:25] I don't think my tree has desynced. The problem looks simple, but I don't know why buildbot didn't catch it before.
[15:55:37] libcom_err->libafscom_err
[15:55:44] Yeah, it looks like that's all it is.
[15:58:07] Fixing that now gives me:
error_msg.c:(.text+0x90): undefined reference to `rk_strlcat'
error_msg.c:(.text+0xac): undefined reference to `rk_strlcat'
[15:58:11] Looking now.
[15:58:34] ${LIB_roken}
[16:02:17] make[2]: *** No rule to make target `-lrokenafs', needed by `command-t'. Stop.
[16:02:21] Now looking at that. :)
[16:03:40] Ah, can't put that there.
[16:06:42] All the tests need LD_LIBRARY_PATH set or need to be built with libtool or something equivalent so that they can run without installed shared libraries.
[16:06:49] * rra sets LD_LIBRARY_PATH for the time being.
[16:08:06] Which unfortunately means that runtests -o from the command line isn't going to work unless someone sets LD_LIBRARY_PATH.
[16:09:59] Okay, down to the command-t test dying with heap corruption, which is not my problem. I'll push the other changes.
[16:43:35] --- jaltman/FrogsLeap has left: Disconnected
[17:00:16] --- jaltman/FrogsLeap has become available
[18:39:02] --- rra has left: Disconnected
[18:57:51] --- Russ has become available
[19:41:31] --- jaltman/FrogsLeap has left: Replaced by new connection
[19:41:40] --- jaltman/FrogsLeap has become available
[19:46:52] --- jaltman/FrogsLeap has left: Replaced by new connection
[19:46:53] --- jaltman/FrogsLeap has become available
[20:08:13] --- rra has become available
[20:10:25] Yeah, 1.6.0pre4 is definitely not stable on 2.6.38, at least the Debian version.
[20:10:27] --- jaltman/FrogsLeap has left: Replaced by new connection
[20:10:28] --- jaltman/FrogsLeap has become available
[20:10:30] It just crashed hard for me as well.
[20:10:56] Similar symptoms: hard kernel panic so that the system goes entirely non-responsive apart from low-level kernel functions. It still responds to pings and accepts TCP connections, but nothing else.
[20:11:07] It won't undo the console power saver, so I can't get the backtrace.
[20:11:42] * rra grabs Simon's patch while I'm at it and builds a new kernel module.
[20:12:07] Just in case the other person with this problem somehow didn't build a new kernel module properly.
[20:12:35] --- jaltman/FrogsLeap has left: Replaced by new connection
[20:12:38] --- jaltman/FrogsLeap has become available
[20:15:32] Ouch.
[20:16:27] I'll keep running with it in the hope that I can get it to explode while I'm doing stuff on console; otherwise, someone will need to try to reproduce in a VM so that one can get at the console logs.
[20:16:51] Although it sounded like when it does this it goes into an infinite panic loop, which means that being at console won't help since the first one is the important one and it's going to just scroll off the screen.
[20:16:59] Though, we seem to be running 1.6.0pre4 on lola-granola (natty), which has a nominally 2.6.38 kernel, and the only instability I remember is seemingly attributable to unity ...
[20:17:15] Maybe there's something in one of the stable patch sets that causes a problem?
[20:18:54] * rra reloads AFS with a patched kernel. Now we see if it will stay up.
[20:19:55] * rra will avoid using AFS until I get into work tomorrow.
[20:19:58] --- rra has left: Disconnected
[20:20:45] 2.6.38? uh, hang on.
[20:24:01] nothing of note. so, i dunno
[20:29:24] Yeah, it's a weird one.
[20:29:27] I'll follow up to the bug.
[20:39:55] --- jaltman/FrogsLeap has left: Disconnected
[20:40:03] --- jaltman/FrogsLeap has become available
[23:01:56] --- reuteras has become available
[23:16:44] --- Russ has left: Disconnected