[06:39:18] --- shadow@gmail.com/owlB13F2F3E has become available
[06:43:56] --- lama has become available
[06:48:04] --- sxw has become available
[06:49:25] --- sxw has left
[07:49:28] --- deason has become available
[08:25:02] --- lama has left
[09:09:38] --- reuteras has left
[09:28:09] --- Simon has become available
[09:41:42] --- Simon has left
[11:40:20] --- jaltman/FrogsLeap has left: Disconnected
[13:38:17] --- jaltman/FrogsLeap has become available
[14:12:08] --- jaltman/FrogsLeap has left: Disconnected
[14:12:45] --- jaltman/FrogsLeap has become available
[14:12:45] --- jaltman/FrogsLeap has left: Disconnected
[14:12:57] --- jaltman/FrogsLeap has become available
[14:13:04] --- jaltman/FrogsLeap has left: Disconnected
[14:13:29] --- jaltman/FrogsLeap has become available
[14:52:29] --- mfelliott has left
[15:08:31] --- Simon has become available
[15:25:43] --- Simon has left
[15:32:52] --- deason has left
[16:03:51] From zephyr:
-> scripts / 1507915.d / geofft 18:35 (and to dust you shall return)
> fs:'/afs/athena.mit.edu/dept/cron/project/slab/web_scripts/vn':
> server or network not responding
is how OpenAFS 1.6 says you don't have bits to do that
I can't reproduce this on my fbsd client; is geofft correct?
[16:04:18] --- Simon Wilkinson has become available
[16:05:04] Which fs command is he trying to run?
[16:27:07] la, I believe.
[16:27:55] Yeah,
dr-wily ~> fs la ~slab/web_scripts/vn
fs:'/afs/athena.mit.edu/dept/cron/project/slab/web_scripts/vn': server or network not responding
[16:29:12] I see the same thing with 1.6.0pre2
[16:33:18] Interesting...
[16:34:09] Are you both trying the same path?
[16:34:38] I'm trying a completely different path in my own cell
[16:34:53] Yeah. I can't reproduce this against a path in a local cell here.
[16:37:07] I can reproduce this in 2 different cells on two different 1.6.0pre2 machines
[16:38:31] Oh, I don't doubt it.
[16:39:19] So, that error means that we're getting a '-1' back from the pioctl, or from the code that parses it.
[16:39:45] While I try and get something up and running that can actually reproduce it, could you strace the fs la, and pastebin it?
[16:40:33] http://pastebin.com/FLdzVFKD
[16:41:11] Okay, so the problem seems to be that we're using the ioctl return value (-1) and not the error number (EACCES) when we report back
[16:43:31] I'm also seeing "fs: server or network not responding" in the output of fs listcells, but I bet that's a separate issue
[16:43:41] I suspect it's not.
[16:44:00] I'd imagine we're returning the error in the error code there, but something's replacing it with the output from ioctl.
[16:44:11] Thing is, I can remember a patch from Marc that was supposed to sort this a while back.
[16:45:50] Actually, I think that patch is the problem.
[16:46:55] 0bc837f68a72ba1f75d940cc5dd057774d9f36bb changed things so that our ioctl() emulation would return -1 on error and set errno. I suspect that the command line tools haven't been updated for that new behaviour.
[17:04:34] phalenor: Have you built from git, or are you just on RPMs?
[17:05:27] rpms
[17:05:40] Okay, I shall just have to fire up a VM.
[17:05:47] Think I know what the problem is, and it's Linux only.
[17:06:17] I could probably build from git, high probability that I have newer autoconf on this machine
[17:11:43] So, reverting 0bc837f68a72ba1f75d940cc5dd057774d9f36bb fixes this problem.
[17:11:44] --- shadow@gmail.com/owlB13F2F3E has left
[17:11:55] I just need to figure out how to fix the problem that patch was supposed to fix.
[17:28:02] So, it's yet another positive vs negative error code problem. If we return a negative value to the ioctl handler in the kernel, it assumes that it's an error code, and so sets errno = -retval, and retval to -1.
[17:28:43] Problem is that we're not always getting this negation right, so sometimes we return a positive value to the ioctl handler, hoping that it will take it as an error. It doesn't, returns it as a positive result, and setpag doesn't know what to do with it. Yuck.
[17:36:27] --- shadow has left
[17:52:16] Gerrit 4222 and 4223
[17:53:16] want me to test that?
[17:53:39] If you don't mind. I've verified them locally but more eyes wouldn't hurt.
[17:53:51] 4222 should be sufficient to fix the fs bug.
[17:55:36] need to make sure that gets applied to 1.6 too
[17:56:02] Yeah. I can't do that until it's on master, though.
[17:56:09] Thanks for tracking it down, Simon.
[17:56:24] no problem
[17:57:06] What I really, really want for Christmas is a test suite.
[17:58:22] if (code) will match any nonzero code, positive or negative, right?
[17:58:34] Yes
[17:59:30] yes
[17:59:45] (I'm not going to say anything else without having read the rest of the function.)
[18:19:26] --- shadow@gmail.com/owlB3684B9A has become available
[18:26:21] -1 and set errno is correct, so if tools fail to deal we should fix that. and they may
[18:27:03] --- mdionne has become available
[18:29:41] but ioctl already sets errno and returns -1. the case I was trying to fix was broken for different reasons - see Simon's explanations in the commits.
[18:33:14] --- shadow@gmail.com/owlB3684B9A has left
[18:34:24] --- shadow@gmail.com/owlBDE40F78 has become available
[18:50:19] so 4222 and 4223 look ok, I tested that errors from install_session_keyring are passed back up. my bad on the original commit.
[18:54:19] --- mdionne has left
[20:43:20] --- summatusmentis has left
[21:26:50] --- phalenor has left
[21:26:56] --- phalenor has become available
[21:41:20] --- rra has left: Disconnected
[21:52:49] --- Russ has become available
[23:49:58] --- steven.jenkins has left
[23:50:55] --- steven.jenkins has become available
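Note on the sign convention discussed between 17:28:02 and 17:28:43: the C sketch below is a simplified model of it, not OpenAFS or kernel code. Every name in it (model_syscall_return, handler_negated, handler_unnegated, describe_by_retval, describe_by_errno, self_check) is hypothetical and exists only for illustration. It assumes, as the log describes, that a negative handler result is turned into errno plus a -1 return, that a positive result is passed through as if it were success, and that a tool reporting the raw -1 ends up printing "server or network not responding".

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Simplified model of the return path described in the log: a negative
 * handler result is taken as -errno; anything else is handed back to the
 * caller unchanged, as if it were a successful result. */
static long model_syscall_return(long handler_result)
{
    if (handler_result < 0) {
        errno = (int)-handler_result;
        return -1;
    }
    return handler_result;
}

/* Hypothetical handlers: one negates the error code, one forgets to. */
static long handler_negated(void)   { return -EACCES; }
static long handler_unnegated(void) { return EACCES; }

/* Hypothetical reporting helper that treats the raw return value as the
 * error code. Every failure collapses to -1, so the real cause is lost. */
static const char *describe_by_retval(long rc)
{
    return rc == -1 ? "server or network not responding" : "unknown";
}

/* Reading the saved errno after a -1 return recovers the real cause. */
static const char *describe_by_errno(long rc, int saved_errno)
{
    return rc == -1 ? strerror(saved_errno) : "success";
}

/* Walks through both handlers and prints what each reporting style says. */
static void self_check(void)
{
    long rc;
    int saved;

    errno = 0;
    rc = model_syscall_return(handler_negated());
    saved = errno;
    assert(rc == -1 && saved == EACCES);
    printf("by return value: %s\n", describe_by_retval(rc));
    printf("by errno:        %s\n", describe_by_errno(rc, saved));

    /* The un-negated error slips through as a positive "result". */
    errno = 0;
    rc = model_syscall_return(handler_unnegated());
    assert(rc == EACCES && errno == 0);
    printf("un-negated handler returned %ld with errno %d\n", rc, errno);
}
```

Calling self_check() from a main function exercises both cases: the negated handler yields -1 with errno set to EACCES, while the un-negated one comes back as a positive value with errno untouched, which is the situation the log says setpag could not interpret.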