[00:10:15] --- Russ has left: Disconnected
[01:20:43] --- Simon Wilkinson has left
[03:04:03] --- sxw has become available
[03:05:58] deason: One way to figure out the best way of handling this would be to build Heimdal on Solaris and see what libtool does.
[03:21:21] --- sxw has left
[03:51:47] --- meffie has become available
[04:09:28] --- sxw has become available
[04:19:45] --- sxw has left
[04:32:03] 2932 made it only into 1.6. 2864 made it into master
[06:06:43] --- deason has become available
[06:15:23] --- jaltman has left: Disconnected
[06:39:21] sxw: afaict, building heimdal's libroken.so on solaris doesn't use a mapfile
[06:41:07] oh, i missed simon's suggestion. yeah, i'm pretty sure there are no mapfiles in play in heimdal at all, and that's where this issue derives from
[06:41:34] um, so why do we use one?
[06:42:23] we were trying to do things right.
[06:43:34] I guess, but in my limited foresight, I mean, rokenafs is an internal-only library... well, it's not like this is hard to fix, so whatever
[06:44:07] the issue is it's "sort of" internal only. if it replaces util, eventually you need it to link afs, but what happens if you also link heimdal?
[06:44:18] and what if afs's roken is not a match for the heimdal one?
[06:44:45] symbol remapping *could* be done to avoid the issue, i suppose
[06:45:24] the issue doubtless needs further thought
[08:22:47] --- mho has become available
[08:50:01] --- jaltman has become available
[08:53:28] simon has submitted a patch for openafs that uses the heimdal roken if it is present and afsroken only if it isn't
[08:54:48] realize it's still more complex than that. i build openafs. some time later i build heimdal. some program which wants to use both is built yet after. maybe we don't care much about it, but it merits at least deciding we don't care
[09:01:55] --- jaltman has left: Disconnected
[09:02:31] --- jaltman has become available
[09:20:42] There's no good answer to this that openafs can do unilaterally.
You simply need to ensure that you have a coherent set of libraries at any given time. Under the present model, that means that if you have built openafs and then build a heimdal that needs newer roken, you get to build roken with heimdal and then rebuild openafs against the new roken.
[09:20:55] Does openafs install its libroken? It should.
[09:21:28] it does
[09:21:36] Well, really, nothing should install it, or even ship with it; the only truly sane solution is to distribute roken as a completely separate package on which both openafs and heimdal (and whatever else) depend.
[09:21:51] good luck with that
[09:22:07] congratulations. you have just invented glib.
[09:22:47] In my environment, I end up with a 'kth' collection that has roken, sl, com_right, and editline from whichever of krb4 or heimdal happened to have the newer one at the time. It's a bit kludgy, but it works OK.
[09:23:24] sure. unless distros ship such a thing, it's just a different pain in the ass we've created.
[09:23:32] > you have just invented glib. No, roken was already glib. And libiberty was before that. But those had the right distribution model.
[09:24:39] As long as distros ship with heimdal, and we can use it, then we can simply depend on the distro's libroken. And I bet most of them will be willing to split that library out into a separate binary package (not source package), if they haven't already, so users can install it without installing heimdal.
[09:30:57] just so you are aware of what is happening on windows, the libroken that is built as part of Heimdal is a static library and is going to remain so. Creating side-by-side assemblies that have external dependencies on other assemblies is a management nightmare for organizations that execute their applications out of AFS. For OpenAFS the afsroken.dll is a dynamic library that is being installed in the OpenAFS\Common directory.
in any case, I believe this is the wrong forum to hold this discussion as the critical party that must be part of it, Love, is not in this room.
[09:33:06] --- jaltman has left: Disconnected
[09:33:12] --- jaltman has become available
[09:38:25] --- rra has become available
[10:36:57] --- meffie has left
[10:39:06] --- meffie has become available
[10:59:58] solaris mapfile foo in 3365
[11:00:20] also, it appears that our syscall number has been taken by unlinkat in solaris 11 express
[11:00:59] which is somewhat annoying, because we've already been using 65 for opensolaris, which has the same sysname
[11:01:31] "yay"
[11:03:17] have we mostly just been picking these numbers randomly, or do we get them from somewhere?
[11:04:04] 65 was unofficially ours, we did not pick it at random.
[11:04:35] I mean in general, not for solaris
[11:04:44] "do I pick a new one, or do I ask someone for one"
[11:05:10] It seems like it would be polite to ask someone for one ... the question is 'who?'.
[11:05:17] didn't we create a solaris list to discuss issues like this?
[11:05:26] yeah yeah, I'll do that
[11:05:27] we did.
[11:05:46] I'm just wondering in general, if other OSes got them from the respective kernel developers and such
[11:06:04] when possible
[11:06:20] some, like linux, reserved one for us then screwed us.
[11:06:43] yes, I was about to say... I guess I'm a little surprised linux hasn't been stomping all over it
[11:06:57] we do other magic on linux to avoid the issue
[11:08:23] In addition to having one (well, two) reserved for us, FBSD also has six reserved for general third-party-module use.
[11:28:13] --- jaltman has left: Disconnected
[11:59:59] i figured that was going to be the answer :\
[12:01:38] Er, the answer to what?
[12:02:01] solaris
[12:02:10] "what syscall number should I use on solaris?" "don't use syscalls, grr, you are a bad person"
[12:02:32] andrew mailed port-solaris. frank batschulat answered, as i guessed he might.
his answer was basically "be like you are on linux and macos"
[12:02:39] tho he did not realize that was his answer
[12:04:38] er, is that the ioctl thing on afs_ioctl ?
[12:06:01] /dev/openafs_ioctl on macos; /proc/fs/openafs/afs_ioctl on linux
[12:06:59] and he says "The only supported / documented way is to use a driver's ioctl, not a syscall.". so, that...
[12:07:53] well, he's also saying that like it's always been the case, and the past 10 years haven't really been a problem....
[12:08:28] the political situation has changed, imo...
[12:08:59] well yes, of course there's that
[12:09:38] well, i suspect this change predated oracle and was in the pipe, but still...
[12:10:10] hmm, well a char device still involves grabbing a number, doesn't it?
[12:10:32] or do we have a namespace like linux /proc to play with?
[12:10:39] (I mean, I know /proc exists, but it's not really the same thing)
[12:10:57] at least, from a kernel dev perspective I expect its not, due to the amount of stuff I'm used to in linux /proc
[12:11:41] "it's"
[12:13:26] well, ignoring that we don't care about a particular device number... it should be that we can use um... mumble. devopsp. i think. hang on.
[12:15:36] oh, I see, there's a system to handle this, how sane
[12:16:43] said system is "new" (like, i think it dates from solaris 10)
[12:29:31] > be like you are on linux and macos That's a really crap answer, and not at all what I would have expected from Sun
[12:29:44] you're not dealing with sun
[12:36:08] Nico suggests reading the syscall number from a file at runtime.
[12:36:29] hey, like, a file like /etc/name_to_sysnum?
[12:36:47] Yeah, see, I was happy about not dealing with AT&T
[12:37:04] ah, i see you remember that ad campaign as well
[12:41:51] --- shadow@gmail.com/owlCDFD9865 has become available
[12:41:51] --- shadow@gmail.com/owlDFDA9887 has left
[12:55:13] --- mlane has become available
[12:55:43] We could, you know, file a bug report.
[12:56:37] that may be the best option (of the crappy ones available)
[12:57:08] Well, maybe we get it back in FCS that way.
[12:57:58] I suppose I should go see what I have to do to make my sunsolve etc accounts migrate over to the oracle crap
[13:01:38] I can talk to some people to open cases with oracle... anything that should be said besides what was in that email?
[13:03:25] I didn't see the email. But, 65 was listed as "reserved" in syscall.h, and then got reallocated as part of PSARC/2009/657, the mail for which made no mention of reusing existing "reserved" syscall numbers.
[13:04:37] We'd like to see 65 reverted to its "reserved" state, preferably with a note that it's reserved for OpenAFS so this doesn't happen again. We should also recommend reverting 64 and 66..69 to "reserved", since it seems likely they were that way for a reason.
[13:05:00] --- jaltman has become available
[13:06:13] Oh, and there is a bugid relevant to the original PSARC case, too: 6906485 delete obsolete system call traps
[13:06:17] you may want to read frank's email, as the argument against that is probably something along the lines that syscall numbers are not part of any interface, and so "anything goes"
[13:06:20] https://www.openafs.org/pipermail/port-solaris/2010-November/000009.html
[13:08:12] In my experience, Sun in general and PSARC in particular have always been fairly conservative, preferring to avoid unnecessary incompatibilities even when something is uncommitted or Not An Interface, if something would break.
[13:08:29] not that I'm necessarily saying that's valid, or even the motivation behind it, but just going from that email, it seems to be the public position or whatever
[13:09:56] Oh, also, note that while the system call dispatch table is not loaded from name_to_sysnum and libc doesn't use it to do mappings, either, at one time it certainly was the case that if a syscall was not properly listed in /etc/name_to_sysnum, it couldn't be registered.
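(Editorial sketch, not part of the log.) The idea floated above of reading the syscall number from a file such as /etc/name_to_sysnum at runtime could look roughly like the following. This is a minimal sketch under stated assumptions: the file is assumed to hold one whitespace-separated "name number" pair per line, `#` comment handling is an assumption, the sample contents are illustrative rather than copied from a real system, and the entry name `afs_example` and the helper names are hypothetical.

```python
def parse_name_to_sysnum(text):
    """Parse 'name number' lines into a dict.

    Assumed format: one whitespace-separated pair per line. Blank lines,
    '#' comments, and malformed lines are skipped (an assumption, not a
    documented guarantee of the real file).
    """
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue
        name, number = fields
        try:
            table[name] = int(number)
        except ValueError:
            continue
    return table


def lookup_syscall(table, name, fallback=None):
    """Return the number registered for `name`, or `fallback` if absent."""
    return table.get(name, fallback)


def names_for_number(table, number):
    """Return every name mapped to `number`, to spot a slot that is taken."""
    return sorted(n for n, v in table.items() if v == number)


# Illustrative contents only; 'afs_example' and 73 are hypothetical.
SAMPLE = """\
# illustrative contents only
read        3
write       4
unlinkat    65
afs_example 73
"""

if __name__ == "__main__":
    table = parse_name_to_sysnum(SAMPLE)
    print(lookup_syscall(table, "afs_example"))
    print(names_for_number(table, 65))
```

Checking `names_for_number(table, 65)` before use is the kind of guard that would notice the slot now belongs to something else, instead of assuming a compiled-in number.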
[13:10:56] It is the _public_ position, and rightfully so. FWIW, I don't think there is a formal contract for this, but my recollection is that, back in the IBM days, someone supposedly made some arrangement for this sort of thing not to happen.
[13:11:20] sorry, network glitch. my understanding was IBM
[13:11:24] yeah, what jeff said
[13:24:07] --- jaltman has left: Disconnected
[13:24:11] --- jaltman has become available
[13:42:51] --- jaltman has left: Disconnected
[13:52:52] --- mlane has left
[13:54:41] --- jaltman has become available
[13:56:06] --- jaltman has left: Replaced by new connection
[13:56:06] --- jaltman has become available
[14:17:44] random thought: since the syscall we're conflicting with is unlinkat, is it thus possible that if they keep things the way they are and someone runs an old afs binary on sol11, some random file gets unlinked?
[14:18:27] I think the answer is that it's very unlikely.
[14:20:07] well, the first arg we give the afs syscall is a small integer, isn't it? which would be an fd
[14:20:26] The first argument to unlinkat is a file descriptor, or the special constant AT_FDCWD, which is a bizarre large number. the first argument to afs_syscall is an AFSCALL_* number.
[14:21:41] er "which could be an fd"
[14:21:46] So yes, they're both small integers, but AFSCALL_PIOCTL (20), AFSCALL_SETPAG (21), and even AFSCALL_CALL (28) are unlikely to be open file descriptors at all in an AFS binary, and even less likely to be open _directories_
[14:21:58] --- jaltman has left: Disconnected
[14:22:08] --- jaltman has become available
[14:24:06] perhaps code in _our_ tree does not (I'd want to check, though), but other stuff can call haspag/setpag &co for some afs integration...
but I'm not sure how common that would be on such a system
[14:24:54] Also, the second argument to unlinkat is a path, and for all but pioctl, the second argument to afs_syscall is not going to be one
[14:25:06] (also PAM, hmm)
[14:25:48] I don't think I'd be comfortable with relying on random binary data not colliding with a filename
[14:26:39] For AFSCALL_CALL, the second argument is not going to be a valid pointer. For AFSCALL_PIOCTL, it might be a path, or it might be NULL, so that's an issue. Hm.
[14:27:03] I will certainly grant "unlikely" at the very least, but if it actually happened in such a situation, I wouldn't want to be oracle :)
[14:28:12] Actually, it may not be as unlikely as I thought. unlinkat doesn't use the fd if the path is absolute. I don't know if it does the fd validity check before or after the absolute path check.
[14:29:31] So yes, I could see someone doing 'fs la' with an old binary and ending up removing something instead
[14:36:02] --- deason has left
[15:46:43] had to drive back from moronville. anyway, yeah, uh, i did wonder what unlinkat's args were but didn't remember to look before i started driving
[15:49:12] They are what you'd expect - a file descriptor, path, and flags.
[15:49:45] It's worth wondering how aggressive Solaris is about checking the flags. The pioctl function parameter will never be valid for that argument.
[15:50:38] well, solaris express is now in the regime of "you can't see that", right?
[15:51:00] Sure, but the changes in question went into OpenSolaris.
[15:51:09] ok. fair enough
[15:51:15] And unlinkat existed before this; it was just a subcode of fsat
[15:51:38] doesn't mean... well didn't mean the args stayed the same.
[15:56:43] True. However, the user-mode interface didn't change, so the user-kernel interface is unlikely to have changed much, in that regard.
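(Editorial sketch, not part of the log.) The collision reasoning above can be written down as a small model: given the first two arguments an old binary would pass to afs_syscall, would they name a real path if the kernel read them as unlinkat(fd, path, flags)? This is only a sketch of the argument made in the discussion, not of Solaris behaviour. The AFSCALL_* values are the ones quoted in the log; the function name and the `open_dir_fds` parameter are hypothetical. It assumes the fd is ignored for absolute paths, and it deliberately does not model the two points left open above: whether the fd is validated before the absolute-path check, and how strictly the flags argument is checked.

```python
# Values as quoted in the discussion above.
AFSCALL_PIOCTL = 20
AFSCALL_SETPAG = 21
AFSCALL_CALL = 28


def unlinkat_would_resolve(first_arg, second_arg, open_dir_fds=frozenset()):
    """Model whether (first_arg, second_arg) names a path when read as
    unlinkat(fd, path, ...).

    second_arg is a str when the caller passed a path, or None when it
    passed NULL or non-path data. open_dir_fds is the set of descriptors
    that happen to be open directories in the calling process.
    """
    if not isinstance(second_arg, str) or not second_arg:
        return False  # NULL or non-path data: nothing to resolve
    if second_arg.startswith("/"):
        return True  # absolute path: the fd argument is not consulted
    # Relative path: only resolves if the "fd" really is an open directory.
    return first_arg in open_dir_fds


if __name__ == "__main__":
    # pioctl with an absolute path is the risky case raised in the log.
    print(unlinkat_would_resolve(AFSCALL_PIOCTL, "/afs/example/dir"))
    # setpag passes no path, so there is nothing to unlink.
    print(unlinkat_would_resolve(AFSCALL_SETPAG, None))
```

Under these assumptions the only case that resolves without any coincidence is a pioctl on an absolute path, which matches the 'fs la' worry raised at 14:29:31.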
[16:40:32] --- geekosaur has left
[16:40:37] --- geekosaur has become available
[18:06:39] --- deason has become available
[18:30:46] --- rra has left: Disconnected
[18:47:58] --- Russ has become available
[20:19:11] --- deason has left
[21:11:18] --- geekosaur has left
[21:11:40] --- geekosaur has become available
[22:23:36] --- jaltman has left: Replaced by new connection
[22:23:37] --- jaltman has become available