[00:15:17] --- dev-zero@jabber.org has become available
[00:55:04] --- Simon Wilkinson has become available
[01:11:47] --- Russ has left: Disconnected
[01:29:05] Did we really change the dump format in 1.5.x? If so, whoooops. That pretty much stops anyone from testing ...
[01:33:46] --- Simon Wilkinson has left
[01:34:08] --- Simon Wilkinson has become available
[01:39:36] uh, not that I know of.
[01:39:44] We did.
[01:39:51] http://rt.central.org/rt/Ticket/Display.html?id=18349
[01:40:22] Also known as add-volupdate-counter-20051015
[01:42:40] There are a lot of big patches in that ticket. Do you see something that actually changes the dump format?
[01:43:10] Well, the report in openafs-devel suggests a 'V' tag appearing in dumps. That patch appears to add a 'V' tag.
[01:43:29] I haven't looked in a huge amount of detail yet, though.
[01:44:31] Oh, I do see code that appears to add a 'V' tag on volumes. Which is interesting, because I don't recall any discussion of such a protocol change. We did at one point discuss defining some extensibility in the dump format, but I don't recall it ever getting implemented.
[01:45:41] From the ticket, I suspect that this patch was committed, but then not pulled up to allow for further discussion. Further discussion did not occur, and then HEAD became 1.5, and then we started talking about releasing 1.5...
[01:46:07] Unfortunately, it means that there isn't a downgrade path that I can so for people testing 1.5 fileservers.
[01:46:13] s/so/see/
[01:46:40] If that's the case, then this should show up as something that was a difference between 1.5 and HEAD before the merge.
[01:47:18] And it'd only affect very recent 1.5, too. But you're right, there is likely nothing that can parse or ignore that tag but does not emit it.
[01:47:40] Oh, well, I suppose one could restore a volume, downgrade, then dump, but that's not really an acceptable path.
[01:49:32] Assuming it happened recently, I think we just need to remove that for now.
We also should get right on implementing unknown tag handling -- right now we do very badly with that.
[01:49:55] This is pre-merge.
[01:50:23] That patch was pulled up to 1.5? When?
[01:50:47] My suspicion is that 1.5 was cut from HEAD after that patch was applied.
[01:51:36] IIRC, the spec we wrote a few years ago designates certain ranges of tags as int32, others as strings, and others as TLV, all optional. Plus it adds a range of tags which have unspecified format but are critical -- if you see them in a dump you have to fail.
[01:51:52] Oh, hm. Yeah, I could believe that.
[01:52:36] In which case we probably should do unknown-tag handling for 1.4. I think there's a good argument for treating at least part of it as a bug fix, since right now the behavior for unknown tags is to treat them as empty.
[01:52:47] when really we should either skip them or fail.
[01:54:17] Presumably, from the mail on openafs-devel, 1.4 is failing when it sees 'V'.
[01:54:37] We really need to make sure Kris Webb is involved in these discussions, too.
[01:55:34] He is welcome to join the standardization list. The really bad thing is that this 'V' seems to break the rules.
[01:58:38] I wrote a message to afs3-standardization on Nov 1, 2007, replying to a message jaltman sent with the topic "Compression support for AFS vos dump - Specification", which refers to RT #17947. My message summarizes the tag classification rules, laid out earlier in that thread and/or ticket.
[01:59:39] --- Simon Wilkinson has left
[02:00:24] --- Simon Wilkinson has become available
[02:04:03] The super-short form:
- All tags already existing at that time are grandfathered
- Tags 0x05 .. 0x60 are TLV with length following the tag
- Tags 0x61 .. 0x7a are 32-bit integers
- Tags 0x7b .. 0x7f are empty
- Length 0x00 .. 0x7f is the actual length
- Length 0x80 is indefinite, with representation as yet unspecified
- Length 0x81 ..
0xfd is 0x80 + the length of the length
- Length 0xfe is a single bit with value 0
- Length 0xff is a single bit with value 1
- Tag 0x7e (which is empty) is a prefix marking the next thing critical.
[02:05:19] The problem is, under those rules 'V' is supposed to be TLV, but the code you pointed at emits it as a 32-bit value.
[02:05:35] Yuck.
[02:06:10] what?
[02:06:27] The fact that we've got a 'shipping' fileserver that breaks that scheme.
[02:06:39] Oh, yeah.
[02:07:57] The only sane things to do are either grandfather 'V' and make 1.4 understand it, or fix 1.5 to do something consistent with that, which would mean people have to upgrade before they can downgrade.
[02:13:41] --- Simon Wilkinson has left
[02:26:01] --- RedBear has left
[02:37:58] --- haba has become available
[02:42:25] * haba back from vacation
[03:26:46] --- reuteras has left
[03:27:28] --- reuteras has become available
[03:37:13] --- Simon Wilkinson has become available
[03:41:58] --- reuteras has left
[03:42:11] --- reuteras has become available
[05:29:05] --- Simon Wilkinson has left
[05:31:44] --- Jeffrey Altman has left: Replaced by new connection
[05:42:50] --- reuteras has left
[06:20:49] --- Jeffrey Altman has become available
[06:30:53] Simon, that change could be the cause of http://rt.central.org/rt/Ticket/Display.html?id=76728
[06:48:02] --- Simon Wilkinson has become available
[07:00:08] --- deason has become available
[07:03:52] as soon as i saw the mail, that was my guess
[07:04:16] i'm working on it. the key is to unroll the patch while leaving the "restore" path accepting of "V"
[07:29:42] --- RedBear has become available
[07:32:25] Yes, but I think you should also add to 1.4.11 a patch that accepts and ignores the 'V', so people don't have to upgrade to the latest 1.5 before they can downgrade to 1.4 (they'll have to upgrade/downgrade to the latest 1.4, but that's not quite so bad).
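The tag and length rules summarized at [02:04:03] can be sketched roughly as follows. This is only an illustration of the rules as stated in the log, not code from OpenAFS: the names `TagKind`, `classify_tag` and `decode_length` are hypothetical, the set of grandfathered tags is not listed in the log and is therefore left as a caller-supplied parameter, and big-endian byte order for a multi-byte length is an assumption.

```python
from enum import Enum


class TagKind(Enum):
    GRANDFATHERED = "grandfathered"  # tag predates the rules; format defined elsewhere
    TLV = "tlv"                      # tag, then length, then value
    INT32 = "int32"                  # tag followed by a 32-bit integer
    EMPTY = "empty"                  # tag carries no value
    UNKNOWN = "unknown"              # outside every range named in the log


CRITICAL_PREFIX = 0x7E  # empty tag marking the next item as critical


def classify_tag(tag, grandfathered=frozenset()):
    """Classify a one-byte dump tag by the ranges given at [02:04:03].

    `grandfathered` is a hypothetical parameter: the log does not say
    which tags already existed, so the caller has to supply them.
    """
    if tag in grandfathered:
        return TagKind.GRANDFATHERED
    if 0x05 <= tag <= 0x60:
        return TagKind.TLV
    if 0x61 <= tag <= 0x7A:
        return TagKind.INT32
    if 0x7B <= tag <= 0x7F:
        return TagKind.EMPTY
    return TagKind.UNKNOWN


def decode_length(buf, pos):
    """Decode a length field starting at buf[pos].

    Returns (kind, value, next_pos), where kind is "length",
    "indefinite" or "bit".
    """
    first = buf[pos]
    if first <= 0x7F:
        return ("length", first, pos + 1)       # the actual length
    if first == 0x80:
        return ("indefinite", None, pos + 1)    # representation unspecified
    if first <= 0xFD:
        n = first - 0x80                        # length of the length
        raw = buf[pos + 1:pos + 1 + n]
        if len(raw) != n:
            raise ValueError("truncated length field")
        # Assumption: multi-byte lengths are big-endian.
        return ("length", int.from_bytes(raw, "big"), pos + 1 + n)
    return ("bit", first - 0xFE, pos + 1)       # 0xfe -> 0, 0xff -> 1
```

With no grandfathered tags supplied, `classify_tag(ord('V'))` gives `TagKind.TLV`, since 'V' is 0x56. That is the mismatch noted at [02:05:19]: the rules call for TLV, while the patch emits a 32-bit value.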
[07:39:29] that's not unreasonable
[07:46:10] Jeff: the lack of VLF_DFSFILESET handling in the Windows client is annoying but since Arla doesn't have it either it really doesn't make a difference.
[07:47:39] I still like the idea of permitting either file ACLs or directory ACLs on new fileservers, but not both. It seems to have the major advantage of displaying understandable behaviour.
[07:48:06] It would make a large difference in the number of clients that don't do it.
[07:49:51] --- dwbotsch has left
[07:50:13] --- Rrrrred has become available
[07:50:43] I don't see how that helps anything, unless you simply deny all access to "legacy" clients (read: 100% of clients deployed today) if per-file ACL's are in use, and that is simply unacceptable.
[07:51:43] I think you have to, if you're going to be able to allow per-file ACLs to be as expressive as we would like them to be. All of the other alternatives I've seen suggested either fail in scaling, or in usability.
[07:52:48] then maybe we just need to abandon the feature.
[07:52:59] but now I have to get back to real work.
[07:53:08] No, I think we need the feature. It's one of the most requested items at my site, and I suspect many others.
[07:54:06] --- mmeffie has become available
[07:54:52] --- haba has left
[07:54:56] it certainly is a highly requested feature.
[07:55:10] Then we need to find a way to make it work without breaking backward compatibility and requiring everyone to upgrade their clients in order to be able to continue accessing their data.
[07:55:12] I think we're getting tied in knots about what we're discussing, too.
[07:55:31] It's not a question of access to the initial object, but of subsequent accesses to the object in the local cache.
[07:55:32] Simon: what about jhutz's proposal of presenting legacy clients with altered unix mode bits?
[07:55:36] --- haba has become available
[07:55:37] i should have the option to lock you out of my data.
[07:56:06] I suspect that altered mode bits will break on Linux. The VFS layer does things with the mode bits before we can play with them.
[07:56:12] agree
[07:57:03] The thing to remember is that root@client has access to all this data, anyway. So, we don't need to protect against anything that root@client can do.
[07:57:32] Yes, I think there's some conflation going on. There are two completely separate points to consider: 1) What model does the fileserver use for deciding who has access to what? 2) Given the existing interfaces, how do we provide access rights data to legacy clients in a way that prevents or minimizes information leakage without unnecessarily breaking accesses that the ACL would allow.
[07:57:37] Which I think means that any arguments against using capability bits are wrong.
[07:58:09] Yes, and I think what we do for 1) is far clearer than what we can achieve for 2)
[07:58:32] aside: someone want to give /afs/andrew.cmu.edu/usr/shadow/unroll-v-tag.diff a quick glance so i can get it in the tree?
[07:58:36] > I suspect that altered mode bits will break on Linux.
Irrelevant. The only copy we care about is that provided by the server to a legacy cache manager in a FetchStatus structure. Linux doesn't get to touch that.
[07:59:09] Never mind (1); it has no impact on (2).
[07:59:30] what about having an optional magic acl entry specifically for legacy clients?
[07:59:53] That's kind of my idea - where the 'directory ACL' becomes something that only legacy clients get.
[08:00:04] Except, I want to just make that ACL empty :)
[08:00:08] are the unixModeBits tracked per PAG on Unix? they aren't on Windows.
[08:00:15] Nope.
[08:00:25] We only get to have one value for those, across all users.
[08:00:25] how do you set it to avoid making it effectively either leak or excessively restrictive?
[08:00:26] jaltman: no, I think jhutz was saying you'd calculate the bits across all users
[08:00:49] then I'm not sure how that would help us?
[08:01:04] I think the theory is that you make the directory ACL some kind of compromise, and then use the Unix permission bits to narrow down that compromise per file.
[08:01:29] jaltman: it's sorta the same as preventing access entirely, except you could still retain access for individual files readable by s:anyuser
[08:01:33] which prevents users from accessing data on client A that they would have access to on client B.
[08:01:33] --- haba has left
[08:01:56] but is much more confusing to understand and support
[08:02:12] You leave the directory ACL what it is, and use the mode bits (really, only the 'r' bit) to restrict access to individual files that have ACL's which are more restrictive for 'r' than the directory ACL is.
[08:02:44] shadow: the legacy acl? I meant to have it user-settable
[08:02:47] --- stevenjenkins has left
[08:03:04] i assume you mean that in the presence of somewhere where a fetchdata would also fail for that user and you're only talking about mediating cache access
[08:03:07] It's the same as denying access to everything in the directory to legacy clients, except it only denies access to files which have ACL's. So setting an ACL on a file breaks that file, but not everything in the directory. Which turns out to be fairly useful.
[08:03:21] w does matter.
[08:03:45] write-on-close means that the cache manager gets to decide whether it's "likely" that a write will succeed.
[08:04:20] The only behavior I am proposing is having the fileserver suppress the u+r mode bit on files where the file has an ACL that restricts read more than the directory ACL. I wouldn't touch the u+w bit, because you already can't write unless the server thinks you have write.
[08:04:34] Oh, well, I suppose you could do the same for u+w, then.
[08:04:52] But you can write to the cache. And then when the server fails you, it's too late, because the POS application doesn't do anything with errors from close()
[08:04:53] that seems reasonable.
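A rough sketch of the proposal at [08:04:20], assuming the conservative entry-by-entry comparison spelled out later in the log at [08:10:53]. Everything here is hypothetical: the function names, the representation of an ACL as a dict from vice ID to a set of right letters, and the use of "r" for the read right. It shows only the shape of the idea (mask u+r in the reported mode, keep the stored mode untouched), not fileserver code.

```python
U_R = 0o400  # owner-read bit in the Unix mode


def more_restrictive(right, dir_pos, dir_neg, file_pos, file_neg):
    """Conservative test: is the file's effective ACL more restrictive
    than the directory's for `right`?

    Each ACL is a dict mapping a vice ID to a set of rights
    (a hypothetical representation chosen for this sketch).
    """
    # Granted on the directory, but absent or not granted on the file.
    for vice_id, rights in dir_pos.items():
        if right in rights and right not in file_pos.get(vice_id, set()):
            return True
    # Denied in the file's negative ACL, but absent or not denied
    # in the directory's negative ACL.
    for vice_id, rights in file_neg.items():
        if right in rights and right not in dir_neg.get(vice_id, set()):
            return True
    return False


def reported_mode(stored_mode, dir_pos, dir_neg, file_pos, file_neg):
    """Mode bits to report to a legacy client; the stored bits stay as-is."""
    if more_restrictive("r", dir_pos, dir_neg, file_pos, file_neg):
        return stored_mode & ~U_R
    return stored_mode


# Directory grants read to vice IDs 1 and 2; the file only to 1.
dir_acl = {1: {"r"}, 2: {"r"}}
file_acl = {1: {"r"}}
print(oct(reported_mode(0o644, dir_acl, {}, file_acl, {})))  # 0o244
```

The same masking could be applied to u+w, as suggested at [08:04:34]; the sketch leaves that out.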
[08:05:32] It does seem reasonable.
[08:05:57] In any case, it seems there are two fairly common use cases:
- Directory with broad access, but some files are restricted. My proposal prevents legacy clients from accessing those files; if you are one of the people who does get access, you need to use a new client to have it.
[08:06:14] --- stevenjenkins has become available
[08:06:56] - Directory with narrow access, but some files are more open. My proposal doesn't touch the modes on those files, since they don't have ACL's that are more restrictive than the directory ACL(*). If you have an old client, you get what access the directory ACL gives; if you want the broader access, you need a new client.
[08:10:30] are we simply never storing the actual u mode bits, then?
[08:10:53] (*) We need to use a somewhat conservative definition of "more restrictive" to avoid having to do lots of complex ACL evaluation. I think I'd compare on an entry-by-entry basis, and call it more restrictive if there is any vice ID for which
- an entry is present and grants the right on the directory, but is not present or does not grant the right on the file's effective ACL
- an entry is present and denies the right in the file's effective negative ACL, but is not present or does not deny the right on the directory's negative ACL.
[08:11:02] No, we store the actual u mode bits.
[08:12:03] right, just fake ones to legacy clients, okay
[08:12:12] But in some cases, the UnixModeBits reported in the FetchStatus has some bits turned off even though those bits are turned on in the stored mode.
[08:15:14] BTW, Simon, I don't think my arguments about capabilities are bogus. What we are talking about is delegating some amount of access control to a client, based on our expectation that the client will obey certain rules in how it controls access.
The problem here is not that the client might lie to us, but that a user might modify the client's claim in flight in order to gain access he would not otherwise have.
[08:17:12] For example... You and I share a multi-user timesharing machine, which I know reboots every day at a designated time. Around the time of the reboot, I arrange to spoof that client's TellMeAboutYourself response to the server, claiming it supports per-file ACL's when it really does not. I also spoof callback breaks to that client on things in your home directory. -more-
[08:18:36] Then I wait for you to log in and fetch all sorts of interesting things from your home directory, some of which should not be accessible to me. Once those things are in the cache, I access that same directory very carefully, making sure in each case that I first touch an object with a permissive ACL before accessing one in the same directory which has a more restrictive ACL that I wish to defeat.
[08:20:06] The only even vaguely hard part in all of this is spoofing the reply to RXAFSCB_TellMeAboutYourself. You can claim the likelihood of doing that successfully is low, but it really isn't, and we don't really have any countermeasures for that on unauthenticated connections.
[08:20:46] --- cclausen has left
[08:31:10] > Yes, but I think you should also add to 1.4.11 a patch that accepts 'V'
done. revert-voldump-v-tag-generation-20090629 makes 1.5 and head not generate it; on 1.4.x, it adds only the portion of that patch which accepts and discards it
[08:53:13] --- cclausen has become available
[09:14:38] Not only does afs_pag_destroy potentially sleep whilst its caller has a spinlock, in the other codepath that invokes it, it sleeps whilst on the kernel work queue. Neither of those is particularly friendly.
[09:25:43] ow
[09:29:21] --- Russ has become available
[09:42:36] mmm...
sleep
[10:14:48] --- dev-zero@jabber.org has left
[11:00:30] --- Russ has left: Disconnected
[11:24:31] --- dev-zero@jabber.org has become available
[13:51:01] --- deason has left
[13:51:10] --- deason has become available
[14:30:33] --- Rrrrred has left
[14:30:53] --- dwbotsch has become available
[14:37:45] --- stevenjenkins has left
[14:45:50] --- stevenjenkins has become available
[15:02:57] --- dev-zero@jabber.org has left
[15:32:41] --- deason has left
[15:43:27] --- Russ has become available
[15:46:27] Hm, my workaround for getting Zephyr to work seems to have finally stopped functioning.
[15:49:14] --- deason has become available
[16:42:56] --- asedeno has left
[16:43:47] --- asedeno has become available
[17:04:46] --- cclausen has left
[17:16:59] --- mdionne has become available
[17:47:50] --- Russ has left: Disconnected
[18:08:28] How so?
[18:12:47] --- Russ has become available
[18:21:36] I used to be able to point owl at some old Kerberos and Zephyr libraries I had sitting around, but it doesn't seem to be working any more. I now get the same results as if I had no tickets.
[18:22:17] * Russ should download and rebuild it from source.
[18:25:51] Building it myself just produces a "Zephr not available" error.
[18:26:08] Ah, I see.
[18:26:10] I can fix that.
[18:29:56] Okay, there we go. Back on Zephyr, at least until we delete those old Kerberos libraries.
[18:30:57] --- mmeffie has left
[19:13:24] --- mdionne has left
[20:52:44] --- deason has left
[22:12:21] --- dev-zero@jabber.org has become available
[22:48:28] --- dev-zero@jabber.org has left: Replaced by new connection
[22:48:29] --- dev-zero@jabber.org has become available
[23:18:21] --- stevenjenkins has left
[23:47:10] --- stevenjenkins has become available