[00:47:23] --- haba has left [01:46:21] --- haba has become available [04:20:34] --- reuteras has left [04:21:28] --- reuteras has become available [05:03:58] --- Mickey has become available [05:07:02] Would there be any interest in a buildbot setup for testing stuff that appears in refs/changes in openafs.git? [05:07:17] Is there anyone who would be prepared to offer machines (or time on machines) as build slaves? [05:15:11] KTH will have a RHEL machine for builds ready soon. [05:15:58] Would you be prepared to automatically build changes that get submitted to gerrit on that machine? [05:16:22] If someone sets it up. [05:16:39] Cool. [05:32:47] --- Jeffrey Altman has left: Replaced by new connection [05:33:06] --- sxw has left [05:34:25] to the extent SCS's Irix box is "mine" (like, it seems to be up for just me) i'd kick it in. but it has a strange set of compiler tools, so it shows a make bug in libafs that other machines don't [05:34:56] --- Jeffrey Altman has become available [05:39:02] --- Simon Wilkinson has become available [05:40:04] Reading through the buildbot documentation, it looks like we should be able to kick off a build every time a new object appears in refs/changes from a committer that we trust. [05:47:54] a lesson from tinderbox [05:48:02] just because you can doesn't mean you should [05:48:16] In terms of launching builds? [05:48:21] yes [05:48:32] basically you end up firing a lot of builds [05:48:39] irix, for instance, builds very slowly [05:48:51] Yes. But how else can we get verification changes for gerrit. [05:49:04] we could probably cheat and for tests, build just one static and one modload kmod on irix [05:49:15] We're not tracking a single branch, where you can merge multiple change sets into a single build. [05:49:49] We can do that for the 'master' branch if we want, but that will only catch errors which occur after the submission has been pushed to the master tree. 
[05:51:40] some changes are "safer" than others [05:51:48] either [05:52:06] 1) only build every trusted change on linux, have a button which submits a change to all platforms [05:52:07] o [05:52:08] or [05:52:29] actually unless we get faster machines for this, there is no 2 [05:52:59] There could be 'build every trusted change on Linux, only build on other architectures if Linux succeeds'. [05:53:09] even that will be too much [05:53:26] How long does an IRIX build take? [05:53:29] consider the number of changes in gerrit, and that an irix build probably takes 4-5 hours [05:53:34] with the hardware i have [05:53:41] Okay. Yeh. That sucks. [05:54:57] my Irix is not better/faster [05:55:56] so we probably should delay Irix, AIX, *Itanium* until a fast build is OK [05:56:16] aix6 build host i have access to is much better but i may not be allowed to do such with it [05:56:40] But if we're taking 4 hours per build, that means we can only check 3 changes per day. And I suspect we'll end up with a much higher rate of patches than that. [05:56:50] yes [05:56:54] My suspicion with gerrit is that we'll start to see significantly more, smaller, patches. [05:56:59] --- reuteras has left [05:57:14] --- stevenjenkins has left [05:57:17] actually, how about checking patches on fast machines, and tracking the committed tree on slow ones [05:57:20] for now anyway [05:57:47] --- reuteras has become available [05:57:50] I think we would be able to do that. [05:58:42] I'd be really interested in seeing if we can get some Windows hardware into the pool. Because that's what I tend to break most often :) [05:59:08] that'd be nice [06:03:36] --- stevenjenkins has become available [06:58:40] --- deason has become available [07:40:22] --- reuteras has left [08:10:55] --- Russ has become available [08:12:24] Is there anything that prevents people from submitting a patch to Gerrit that opens a back door on the build system? 
[08:12:54] That's why we can only verify patches from 'trusted' committers. [08:13:01] We may want to add a bit somewhere to accounts-- right. [08:13:11] Shawn, Derrick, Jeffrey and I talked about this at length while we were at Google. [08:13:17] Oh, okay. [08:13:52] AIUI, Google's plan is to only auto-test changes that come from @google.com or @android.com, at least initially [08:14:45] Crikey, drawing diagrams to explain merge commits and git rebase gets dull after a while ... [08:15:36] --- stevenjenkins has left [08:18:49] --- Rrrrred has become available [08:18:57] simon: did you happen to see any of the discussion yesterday about working with multiple git working trees? [08:19:27] Hang on, scrolling back. [08:21:46] --- stevenjenkins has become available [08:22:14] Read it. [08:24:21] If you're on filesystem which supports hard links, then any clone from a local git repository is very low cost. [08:24:42] It's not clear to me how things like --reference will behave on a filesystem that has no hard-link support. [08:26:00] I think --reference uses a pointer inside the object store rather than hard links. [08:26:18] It sounded like they were using the same mechanism they used to implement merging multiple Git repositories. [08:27:40] Yes. From the manpage, I think --reference and --shared must use the same mechanism, and have the same risks. [08:29:07] --- Derrick Brashear has left [08:31:38] you can specify to not use hard-links even on fses that use hard links if we want to see [08:32:47] I just don't think Jeff's workflow is a particularly common one, especially in a tree that only has two active branches. [08:36:39] (and if you just want to look at a file on a different branch, then git cat-file : is your fiend) [08:37:33] I suspect it is more common than you might like to think [08:37:58] It's useful to be able to compare the contents of two branches with pagers and the like from time to time. [08:37:58] Well, I think it probably _was_ common. 
[08:38:17] I think that was jhutz's strongest argument, that and keeping multiple builds around. [08:38:20] But if you approach git going "I did X in CVS, how do I do it in git", you're going to be very, very, very sad. [08:38:46] The trick is to look at why you were doing X, and how you might do that with git. [08:39:10] Yeah, that's generally true. [08:39:44] Pretty much, git was written with the assumption that if CVS does something one way, then doing the exact opposite is likely to be a good design choice. [08:40:20] I suspect that there are several sites / support providers that are maintaining their own repositories with modified builds for their users / clients [08:40:35] Well, then that's a branch. [08:40:44] Possibly a branch per client. [08:40:51] Yeah, they're going to want to find a way to turn those into branches. [08:41:00] and they now need to figure out how to perform the cvs to git migration [08:41:33] and modify their own internal workflow to match that of git instead of cvs [08:41:51] It'll take a while. [08:41:53] or svn or whatever they were using internally [08:42:03] Without knowing what that internal workflow might be we can't comment on how tricky that process may or may not be for them. [08:42:07] --- edgester has become available [08:42:28] If they've kept their patches separate, then actually moving from cvs to git should be relatively straightforward. [08:42:45] If their local patches are intertwined with the OpenAFS tree, then it's going to be much, much harder. [08:42:50] Should I hit the IRC chat for a fileserver issue, or stay here? [08:42:50] Yeah, don't do a repository conversion at all, just branch the Git repository and then replay changes. [08:43:17] --- stevenjenkins has left [08:43:25] --- abo has left [08:43:28] Yeh. 
If you do a repository conversion, even if you use my scripts, it's pretty much useless, as what you get is a different repo (as far as git is concerned) [08:44:01] --- abo has become available [08:45:20] Linus on CVS/SVN : 'The slogan of Subversion for a while was "CVS done right", or something like that, and if you start with that kind of slogan, there's nowhere you can go. There is no way to do CVS right.' [08:46:42] --- Derrick Brashear has become available [08:46:53] --- stevenjenkins has become available [08:52:24] what's your fileserver problem, jason? [08:52:43] 07/13/2009 11:13:34 The volume header file V0536935827.vol is not associated with any actual data (deleted) [08:52:49] I'm chatting in IRC with Simon [08:52:56] ick. sorry [08:53:05] namei or inode? [08:53:08] But I'm stumped. So feel free to jump in! [08:53:08] inode [08:53:17] solaris, then? [08:53:31] what fileserver version? [08:53:31] I did a vos zap on the R/O and vos rel, but no luck [08:53:51] solaris 9, inode upgrade from 1.4.6 to 1.4.10 with STABLE14-background-fsync-consistency-issues-20090522.txt [08:54:18] what exists in /vicepa? [08:54:36] ls /vicepa|grep -v vol [08:55:59] lost+found [08:56:38] ls /vicepa|grep -v vol gives lots of files, though [08:56:55] ugh, ls /vicepb/*vol* gives lots of files, though [08:57:09] like what? [08:57:10] the volume is in /vicepb [08:57:33] ls /vicepb/V0536935827.vol /vicepb/V0536935827.vol [08:57:53] oh. and ls /vicepb|grep -v vol ? [08:58:30] lost+found [08:58:52] it's an inode server [08:59:08] that's fine. i just wanted to be really, really sure of that [08:59:27] ls /vicepb/V053693582?.vol /vicepb/V0536935826.vol /vicepb/V0536935827.vol /vicepb/V0536935828.vol [09:00:20] the RWs are fine? [09:00:47] yes, I can ls in the RW path, just fine [09:00:50] er, the RW is fine? [09:00:53] and the BK? [09:01:34] backup looks good, I had to fs mkm [09:01:41] to check [09:02:06] is that the only volume with issues? 
[09:02:12] hold on [09:03:17] so far, yes [09:03:32] I'm trying to create a RO of another volume to test [09:04:51] no errors when vos rel on the new volume [09:05:18] and it's online? [09:07:22] hmm, maybe not. how can I tell besides fs lq on the RO path? [09:08:23] look for the files you expect to be in it? [09:08:42] fs lq is not showing the readonly volume [09:08:50] vos ex on the readonly volume shows online [09:11:41] --- Simon Wilkinson has left [09:11:41] --- Simon Wilkinson has become available [09:12:00] is it a volume i should be able to see? [09:12:58] the problem volume is at /afs/uncc.edu/usr, which anyone can see [09:13:11] the one you just replicated. [09:13:14] the new volume is not visible to other sites [09:13:20] ok, so that's no help [09:13:33] can you move the problem volume off the server which is having the issue? [09:13:39] i mean, move the rw [09:13:53] basically, arrange for nothing having to do with the volume to be on that partition [09:15:48] --- Simon Wilkinson has left [09:15:48] --- Simon Wilkinson has become available [09:16:24] so, move the original problem volume off of the server? [09:16:43] yes [09:16:47] the RW [09:16:55] and remove the RO site? [09:17:01] yes [09:17:13] after you're done, and you're sure the RW is copied and working elsewhere, tell me [09:17:19] will I cause user errors by removing the RO site? [09:17:30] the one that already doesn't work? [09:17:42] do you have another RO site? [09:17:51] yes [09:17:54] & yes [09:17:55] good. [09:18:14] you may well get rid of errors by doing this. 
[09:18:32] --- deason has left [09:21:43] --- deason has become available [09:24:51] anyway, when you are done and *sure* it's working, vos zap the RW (nothing should be left) and salvage the RW by ID (also nothing should happen) [09:25:00] *do not* salvage the whole server or partition, of course [09:25:38] Simon: my main annoyance with just keeping one repo around is having to rebuild the /entire/ tree when I switch branches [09:25:43] you almost certainly know all this, but i am amplifying the points simply because i assume this is a production server, and i know i've made mistakes when i knew better [09:25:50] like pulling power from the wrong machine! [09:26:01] unless building outside of srcdir works, there's no way around that that I know of [09:26:15] andrew: rebuilding the entire tree is something i do dozens of times a day. it keeps your processor in shape. [09:26:18] Building outside of srcdir does work. [09:26:56] Well, it _should_ work. I think it was broken recently, and jhutz hasn't screamed sufficiently for it to be fixed :) [09:27:02] hm, I thought someone reported issues [09:27:04] oh, okay [09:27:35] When you swap branches, git only modifies the files that actually differ between those branches. [09:27:36] Derrick: also gives more time for "my code's compiling" activities [09:27:58] which assumes our dependencies don't suck [09:27:58] So, with a build system that has a proper view of the code's dependencies, you don't need to rebuild the whole tree each time. [09:27:59] yeah, but I usually want to also build to make sure I didn't break something [09:28:15] However, top tip - our tree does not have such a build system. [09:28:25] Derrick: the volume is working, but I haven't done vos zap. Should vos zap be done to the old server and partition? [09:28:27] i'd like a tool which goes through and computes broken dependencies [09:28:36] jason, yes [09:28:54] I believe such tools exist. 
[09:29:04] basically, the goal is to obliterate any detritus left where it was, and then confirm it's gone (the salvage step) [09:29:06] * Russ would really like to switch our build system over to Automake. [09:29:08] even if it worked properly it still takes noticeable extra time if I'm switching between branches like 1.4.old and 1.5 dafs etc [09:29:20] Which among other things automatically calculates dependencies. [09:29:21] i'd really like to blind myself before i touch automake [09:29:40] * Russ is now using it for nearly all of my other packages and is finding it saving me a lot of time. [09:30:03] Among other things, it automatically tests out-of-tree builds for you. [09:30:11] Derrick: Volume not attached, does not exist, or not on line [09:30:29] jason, good. [09:30:31] --- agoode has become available [09:30:55] Derrick Brashear: ls /vicepb/*volid* shows no entries [09:31:07] which means nothing [09:31:14] k [09:31:27] should I move the volume back to the old server now? [09:31:41] so if vos zap and the salvage by volume id are done, you can try moving it back, and then replicating [09:31:53] hold on, I didn't salvage by id [09:33:01] just for paranoia: does the salvage find and try to bring back anything [09:34:09] --- haba has left [09:34:14] 07/13/2009 12:33:31 No applicable vice inodes on c1t2d0s6; not salvaged 07/13/2009 12:33:31 0 nVolumesInInodeFile 0 Temporary file /vicepb/salvage.inodes.c1t2d0s6.4483 is missing... [09:34:23] good. so you can move it back [09:34:58] --- cclausen has become available [09:40:06] --- Mickey has left [09:40:26] Derrick, it's moved back. I added a RO site. It's released [09:41:08] so you're good? [09:41:24] I think so. No errors in filelog [09:41:36] good [09:41:51] so you had schmutz on vicepb [09:42:53] whew [09:43:43] Thanks! 
[09:43:56] --- agoode has left [09:44:07] --- agoode has become available [09:50:57] time for lunch, ttyl [09:51:09] --- edgester has left [10:02:46] --- edgester has become available [10:02:59] Hi Derrick [10:03:10] what blew up [10:03:13] Nothing [10:03:17] ok [10:03:40] I'm looking for a better explanation of what caused my problem [10:04:08] at some point previously a release presumably failed and left stuff behind [10:04:09] one of my coworkers didn't like my email with the cause of "dirty bits" [10:04:29] 1.4.6 was fine with it, but 1.4.10 choked on it [10:04:41] is 1.4.10 more picky? [10:04:55] than 1.4.6? only if you had more than one copy of the volume on the server [10:05:08] just RW and RO [10:05:17] on another partition [10:06:26] the strange thing is that all volumes attached fine on the sunday morning bos restart, but that one volume didn't attach after the upgrade [10:06:45] again, it's only strange if there wasn't a second copy of the volume on the server [10:07:14] there was only one copy on the server [10:07:28] based on what information? [10:07:35] vos ex [10:07:42] that's not a good way to base it [10:07:47] that uses the vldb [10:08:04] vos listvol would have been better [10:08:33] --- abo has left [10:08:50] --- abo has become available [10:08:52] too bad I don't have that info [10:09:03] i agree [10:09:14] does your unhappy coworker have a time machine? [10:09:51] so, you're saying that at some point in the past, a release failed and left two copies of the RO volume on the server. 1.4.6 was fine with that, but 1.4.10 would not accept it. Is that correct? 
[10:10:25] no time machine [10:10:58] no, i'm saying at some point someone had a replica on a partition which is not where the RW was, and never cleaned it up [10:11:14] and 1.4.6 was ok, and 1.4.10 doesn't like that [10:11:19] hmm [10:11:40] ah, ok, so 1.4.10 is pickier [10:12:12] deliberately [10:12:32] because otherwise it's possible for you to release a volume, the volserver to write it in one place, and the fileserver to serve from another [10:12:39] this results in predictable problems [10:12:40] how can we clean out said cruft? [10:13:01] well, in general, compare vos listvol output with what's actually in the vldb [10:13:26] would vos syncserv or vos syncvldb fix that? [10:14:11] > it seems to be up for just me pretty much [10:14:13] i'd want to experiment [10:14:52] --- agoode has left [10:15:04] --- agoode has become available [10:15:34] so, then the solution is to compare vos listvldb and vos listvol and zap the differences? [10:16:57] if you are sure that the vldb is right (e.g. there's a known, up to date copy where it says one is) [10:17:13] ok, thanks! [10:20:55] --- dev-zero@jabber.org has left [10:21:42] --- phalenor has become available [10:30:26] --- edgester has left [10:36:27] --- dev-zero@jabber.org has become available [10:49:39] Hmmm. I guess the case where USEIFADDR isn't #defined at the start of afs_server.c has never been tested, then. [10:52:10] probably bitrotted [10:52:44] There's a label that's in completely the wrong #if block. If it's referenced, it's not defined. If it's defined, it's never referenced :) [10:53:12] rock on [11:02:13] Okay. This is so special, I just have to share ... 
[11:02:41] We have oldmvid, and tvc->mvid, both of which are pointers to struct VenusFid [11:03:36] But someone saw fit to do oldmvid = (char *)tvc->mvid [11:03:51] awesome [11:03:54] i wonder who it was [11:03:59] but not enough to see if it was me [11:04:26] "IBM" [11:20:21] --- Derrick Brashear has left [11:35:41] --- dev-zero@jabber.org has left [11:36:53] suddenly the export/import of 'rx_enable_stats' in the Windows build is broken [11:37:31] --- mmeffie has become available [11:38:07] What directories does that depend on? [11:38:20] and of course looking at the error the question that comes to mind is "why did this ever successfully link?" [11:38:49] the problem is the definition of EXT2 in rx_globals.h [11:39:09] which is *your* definition, since before we just had EXT [11:39:43] --- deason has left [11:39:51] fileserver.exe is pthreaded on Windows and yet it is linked to the lwp version of afsprot.lib [11:40:49] afsprot.lib contains the compiled version of ptint.cs.c which includes multiple references to rx_enable_stats [11:40:50] --- abo has left [11:40:54] --- stevenjenkins has left [11:41:07] --- abo has become available [11:41:25] on Unix, is ptint.cs.c compiled twice? once for pthread and once for lwp? [11:41:47] presumably once in afsauthent and once in libprot [11:43:44] --- deason has become available [11:44:23] the real problem is that the wrong libs are being linked and several functions are not exported from afsauthent.dll which should be [11:46:47] Great. All of the libuafs build rules don't pass their error code back to the top level make. [11:47:03] at some point that was deliberate, i think [11:47:20] --- stevenjenkins has become available [11:47:26] It looks to me like someone has just used ';' when they meant '&&', but I may be mistaken. [11:47:44] (For example, we run ranlib, on targets that haven't built properly) [11:47:50] heh. 
ok [12:47:52] --- dev-zero@jabber.org has become available [13:38:04] --- haba has become available [13:47:13] edgester (and how to clean out said cruft): Have a look at the script called afs-sanity in /afs/stacken.kth.se/home/haba/bin/scripts/. It uses some more scripts in that directory and makes a diff between what's on your vldb and what's on your fileservers. [13:48:16] (hm, edgester gone?) Well, he might see it later. [13:50:02] the list of people in the room isn't always reliable [13:50:17] my own client doesn't even show myself [13:50:35] Excellent, now I just have to deal with installation of the message catalog for fstrace and figure out how to handle the difference of agreement over bos configdir permissions and there will be no Debian diff apart from removing APSL-covered code. [13:50:46] For some clients, not seeing yourself is normal [13:51:15] Of course, one wonders how we have any APSL-covered code [13:51:30] (well, one doesn't wonder, but one is bitter about it) [13:51:38] +1 to bitter. [13:52:21] Portions of it I could probably do a clean-room rewrite of, but I can't do that with afssettings.m or whatever that file is called, so doing it with the other bits doesn't feel very rewarding. [13:53:12] jhutz: well, normally I do see myself; it also doesn't show you, heh [13:54:23] threaded volserver now builds on Windows [13:54:51] --- stevenjenkins has left [13:56:42] Portions that were changes to AFS should not have been accepted with a license other than AFS's license, period. What we _should_ do is rip them out and insist that Apple recontribute under our license if they want it in there. The Mac-specific standalone app probably shouldn't be distributed as part of OpenAFS, under the "no, really, we are not the kitchen sink" theory, but I guess it doesn't do any harm to do so. 
[14:01:02] --- stevenjenkins has become available [14:12:25] jhutz: Saying that app shouldn't be distributed with OpenAFS is like saying 'fs' shouldn't be distributed with OpenAFS [14:12:49] We should just make the real modes behaviour a pioctl, and control it through fs, and get rid of afssettings.m [14:45:40] Now at 938 warnings :) [14:48:16] I suspect 64-bit Windows builds are still higher than that [14:49:15] Oh, 64bit will be a mess. [14:49:53] Yeah, fixing the 64-bit warnings will require API changes. [14:50:06] Or at least I strongly suspect that they will. [14:50:06] Sorry, API? We have an API? [14:51:10] --- mdionne has become available [14:51:14] Well, yes. There's a reason why I still don't ship the shared libraries on Debian. :) [14:51:28] But we do to an extent, and people do have applications out of tree written against it. [14:51:49] At some point, we need to brainstorm how we're going to build a supported API. [14:52:24] Whether that be blessing existing functions slowly as we're sure they're not going to change, or something. But I'd like to get to the point where we have shared libraries with stable and supported interfaces and SONAME versioning that are actually used by most of the programs we ship. [14:52:32] This static linking everywhere thing results in hugely bloated client packages. 
[14:52:33] --- stevenjenkins has left [14:52:37] --- abo has left [14:53:32] --- abo has become available [14:56:19] --- haba has left [14:57:19] --- haba has become available [14:59:16] --- stevenjenkins has become available [15:12:28] --- dev-zero@jabber.org has left [15:34:56] --- deason has left [15:35:06] --- deason has become available [15:44:37] --- summatusmentis has left: Lost connection [15:44:37] --- kula has left: Lost connection [15:44:37] --- Rrrrred has left: Lost connection [15:44:37] --- dwbotsch has left: Lost connection [15:44:37] --- phalenor has left: Lost connection [15:44:37] --- mdionne has left: Lost connection [15:44:37] --- shadow@gmail.com/owl1A15EC91 has left: Lost connection [15:44:38] --- kula has become available [15:44:43] --- mdionne has become available [15:45:04] --- dwbotsch has become available [15:45:11] --- phalenor has become available [15:46:01] --- shadow@gmail.com/owl1A15EC91 has become available [15:46:41] --- dwbotsch has left: Lost connection [15:46:41] --- mdionne has left: Lost connection [15:46:41] --- shadow@gmail.com/owl1A15EC91 has left: Lost connection [15:46:41] --- kula has left: Lost connection [15:46:41] --- phalenor has left: Lost connection [15:47:19] --- summatusmentis has become available [15:47:23] --- Rrrrred has become available [15:49:02] --- mdionne has become available [15:49:38] --- kula has become available [15:50:04] --- dwbotsch has become available [15:50:11] --- phalenor has become available [15:51:02] --- shadow@gmail.com/owl1A15EC91 has become available [16:10:22] Huh, I thought %p was portable. [16:13:38] Hardly [16:15:29] In all of the places I've seen use something like AFS_FMT_PTR, it's usually just to deal with Windows vs Unix issues. [16:16:31] But we're currently using %x to print (potentially) 64 bit pointers. Which is a sure fire path to doom on some platforms. [16:18:00] --- agoode has left [16:18:13] --- agoode has become available [16:22:23] * Russ checks. %p is required by C99. 
[16:22:50] Yes. And we just gained an implementation of it for our native printf routines, too. [16:26:11] (Basically, Jeff pulled in the Heimdal version in util-snprintf-replacement-20090624 ) [16:26:17] And yeah, %x is definitely bad. I was just wondering if we could replace %x with %p. But jhutz seems to say no. [16:26:22] Which also resolves our %x anomaly. [16:26:34] If we're printing a pointer, we should absolutely be using %p. [16:26:38] * Russ wonders how the Heimdal version compares to the one I've been using. [16:26:57] I thought that's what we were doing? [16:27:07] Using %x is wrong, because if an int is shorter than a void *, it will consume the wrong amount from the var_args stack. [16:27:15] Right. %x very bad. [16:27:17] We currently print pointers using 0x%x [16:27:39] Yeah, sorry. I mean, what we're currently doing is bad -- I was wondering why your patch didn't just use %p instead of string concatenation and our #define. [16:27:55] Because on 64bit Windows, you don't want %p. [16:28:08] Oh, huh, why? [16:29:09] Because Microsoft have decreed that you need to use %Ip [16:29:16] Oh. Right. [16:29:18] I remember this now. [16:29:47] --- agoode has left [16:29:55] --- agoode has become available [16:29:59] So, anything that needs to work in both places, should use AFS_FMT_PTR. Anything that's just Unix can use %p and be done with it. [16:30:15] If we're using our printf et al, then we can certainly define %p to mean "pointer to data" and use it that way. But it's not portable to expect the library's printf to interpret it that way. [16:30:38] C99 does actually require that works. [16:30:40] Traditionally I've cast to unsigned long and used %lx, but that's really not as good as having an actual pointer type. [16:30:47] I'm not sure how Microsoft is getting away with not allowing it. [16:31:10] %lx definitely does not work on platforms where sizeof(unsigned long) < sizeof(void *), which do exist. 
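The pointer-printing options weighed above (`%x`, `%lx` after a cast, and `%p`) can be put side by side in a short sketch. This is only an illustration under stated assumptions, not what the OpenAFS tree does: `format_ptr_p` and `format_ptr_hex` are hypothetical names, `AFS_FMT_PTR` is deliberately not reproduced because its definition is not shown in the log, and `uintptr_t` is an optional C99 type that is assumed to exist here.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a pointer with %p. C99 requires the
 * argument to be a void *, and the text produced is
 * implementation-defined, so it suits debug messages only. */
int format_ptr_p(char *buf, size_t len, const void *ptr)
{
    return snprintf(buf, len, "%p", (void *)ptr);
}

/* Hypothetical helper: print a pointer as fixed-width hex through
 * uintptr_t (assumed available) and PRIxPTR. Unlike "%x" or "%lx",
 * the integer type here is wide enough for a pointer by definition,
 * so the right number of bytes is consumed from the argument list. */
int format_ptr_hex(char *buf, size_t len, const void *ptr)
{
    return snprintf(buf, len, "0x%0*" PRIxPTR,
                    (int)(2 * sizeof(uintptr_t)), (uintptr_t)ptr);
}
```

The second form gives the same layout on every platform that has `uintptr_t`, at the cost of depending on `<inttypes.h>`, which is exactly the header whose availability on Windows is disputed later in the log.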
[16:32:48] I think pretty much all of the time, we're just using %p as a means of printing the value of a pointer in a debugging message, so as long as what we get is consistent with that usage, I don't think we care beyond that. [16:33:13] C99 requires %p works? perhaps, but that doesn't make it portable in the real world [16:36:32] Uptake of C99 outside of the math stuff is fairly good at this point. [16:37:01] C99: p: The argument shall be a pointer to void. The value of the pointer is converted to a sequence of printing characters, in an implementation-defined manner. [16:37:11] Really not sure how Windows gets away with saying %lp should be used. [16:37:42] l is not listed as a modifier for %p. [16:38:16] --- agoode has left [16:38:23] That's (I)ndigo, rather than (l)ong [16:38:23] --- agoode has become available [16:38:31] Oh! [16:38:43] Sorry, I have crappy font problems in Pidgin. [16:39:18] Although there are bits of the OpenAFS code that assume that (l)ong is an acceptable modifier to 'p'. They're just broken. [16:40:09] Yeah, uppercase modifiers are reserved by the standard for extensions, so they can do pretty much whatever they want with that. [16:40:25] Although I don't think the standard lets them require a modifier when one is printing a void *. [16:41:46] I appears to indicate that it is an object which changes size depending on the architecture. 
[17:12:01] --- Russ has left: Disconnected [17:28:04] --- Russ has become available [17:42:20] Simon: where did you read that Microsoft require %Ip [17:42:44] Here are the docs http://msdn.microsoft.com/en-us/library/56e442dc.aspx [17:42:58] I is only valid with the integer types [17:43:04] Microsoft does support %p [17:43:43] What we require is %I64x %I64d to support 64-bit integers [17:45:36] the *printf() routines that I recently added to the OpenAFS tree support %p, %I32[dioxX], %I64[dioxX] and the %I type that OpenAFS uses for output of IPv4 addresses in either dotted notation or hostname via dns lookup. [17:48:23] what I would like to do is use %p everywhere and also explicitly use afs_*printf() functions [17:49:38] especially in the file server because using %I will permit us to simplify all of the existing log messages that output ipv4 addresses. [17:50:32] * Russ looks at that file. Oh, it's one of the ones that uses sprintf to implement %g/%f. [17:50:43] I was wondering how it solves that problem -- that's the hardest part of replacement snprintfs. [17:50:57] I think using %I is fundamentally the wrong direction. [17:51:06] It's IPv4-specific. [17:51:10] I added the %g/%f support [17:51:15] And doing IPv6 properly requires a completely different approach. [17:51:41] I think we'd be better off abstracting conversion of IP addresses to strings for printing than embedding it in printf. [17:52:01] I think the code is cleaner if we embed it in printf [17:52:07] Otherwise, we're going to have to maintain a forked printf with special format modifiers forever, and that causes us problems for warning testing, since gcc is not going to know how to diagnose problems with the arguments to printf. [17:52:17] Which I think is a fairly serious issue. [17:52:44] we are going to have to maintain a forked *printf implementation forever. There is no way around that. 
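The alternative argued for above, converting the address to a string outside printf and passing it through plain `%s`, might look roughly like the following. It is a sketch assuming a POSIX system with `inet_ntop`; `addr_to_string` is a hypothetical name, and unlike the `%I` extension described in the log it does numeric formatting only, with no DNS lookup.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: render an IPv4 (AF_INET) or IPv6 (AF_INET6)
 * address into a caller-supplied buffer and return that buffer, so
 * the result can be handed to an ordinary "%s" conversion. */
const char *addr_to_string(int family, const void *addr,
                           char *buf, size_t len)
{
    if (inet_ntop(family, addr, buf, (socklen_t)len) == NULL)
        snprintf(buf, len, "<invalid address>");
    return buf;
}
```

A log call would then read something like `printf("host %s\n", addr_to_string(AF_INET, &a, buf, sizeof buf))`. Because the format string contains only standard conversions, gcc's format checking still applies, and the same helper covers IPv6 without a new format letter.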
[17:53:01] If you want gcc to be able to deal, then we have to punt on %I32 %I64 or whatever they are, and actually handle those things the standard way, ugly though it is. [17:53:03] The new implementation also supports the a*printf variants. [17:53:11] I don't agree that there's no way around that. [17:53:30] there is no other 64-bit integer printf format on Windows [17:53:41] BTW, I think supporting I as both a flag and a format specifier is sheer madness; it means the parser has to guess what you meant. [17:54:02] You have to use string concatenation when printing 64-bit integers, which is ugly and annoying, but it is an option. [17:54:16] --- stevenjenkins has left [17:54:18] --- deason has left [17:54:22] --- agoode has left [17:54:31] --- agoode has become available [17:54:38] > there is no other 64-bit integer printf format on Windows The correct format is provided by the preprocessor macro PRIu64 [17:55:30] one of the primary motivations for making the changes to the s*printf implementation was so that we would get the a*printf functions everywhere. So we can avoid putting large string buffers on the stack and not have to worry about buffer overwrites [17:55:44] One says something like printf("foo: 0x%016"PRIx64"\n", foo); [17:55:48] Implementing asprintf is easy. [17:55:51] I can give you an implementation of that. [17:55:57] I already implemented it [17:56:00] ... in terms of vsnprintf [17:56:11] Yes, but my point is that we don't need to commit to maintaining our own snprintf forever. [17:56:19] and Jeff, where is PRIu64 on Windows? [17:56:20] You can selectively compile it where you need it. [17:56:36] that is what we do today [17:57:08] In <inttypes.h>, if __STDC_FORMAT_MACROS is defined, according to C99 [17:57:35] what makes you think Microsoft's compiler is C99? [17:57:51] --- deason has become available [17:58:14] That's fine, we can define it ourself. [17:58:17] We know what the value should be. 
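The "define it ourselves" approach can be sketched as a small compatibility header. Everything named here is an assumption for illustration: `NO_INTTYPES_H` stands in for whatever a configure-time feature test would set, `example_u64` and `format_u64_hex` are hypothetical names, and the fallback branch assumes a Microsoft runtime that accepts the `I64` size prefix mentioned in the log.

```c
#include <stdio.h>

#ifndef NO_INTTYPES_H            /* hypothetical configure-time result */
# include <inttypes.h>
typedef uint64_t example_u64;
#else
/* Assumed fallback for a compiler without <inttypes.h>: supply the
 * C99 macro names in terms of the runtime's own size prefix. A real
 * port would also have to deal with such runtimes lacking snprintf;
 * that detail is ignored in this sketch. */
typedef unsigned __int64 example_u64;
# define PRIx64 "I64x"
# define PRIu64 "I64u"
#endif

/* Callers always use the standard macro through string concatenation
 * and never spell a platform-specific format themselves. */
int format_u64_hex(char *buf, size_t len, example_u64 value)
{
    return snprintf(buf, len, "0x%016" PRIx64, value);
}
```

The point of the design is that the platform difference lives in one header, while call sites keep a format string that a standard printf, and gcc's format checker, can understand.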
[17:58:23] there is no inttype.h on Windows [17:58:35] ourselves rather. [17:58:48] Jeff: You cannot use %I if you selectively compile snprintf where you need it. [17:59:02] I don't think supplying our own snprintf implementation when the platform has one is a good idea. [17:59:09] If Microsoft's compiler is not C99, then we either find a compiler that is and require that people use that, or else work around it. This is the fundamental philosophy behind sane feature-test-based configuration: you start by assuming something sane, and then apply corrections where needed, like defining {PRI,SCN}[uidxo]{32,64} on platforms that don't have them. [17:59:10] When the platform has a working one, I should say. [17:59:23] we only use %I when using our own afs_*printf() functions [17:59:24] sorry; [17:59:37] We seem to be talking in circles. [17:59:45] no inttypes.h on Windows [18:00:19] --- edgester has become available [18:00:30] Jeff, we are not requiring a compiler other than Microsoft's on Windows [18:00:33] --- stevenjenkins has become available [18:00:33] --- abo has left [18:00:40] Yes, we do. Russ is arguing that we _should not_ be in a position where we have to use our own implementations of printf et al because we are using OpenAFS specific features. His argument is that we should eliminate OpenAFS-specific features and their use, and instead use only standard features. [18:00:43] This is a red herring. We can define the macro on Windows trivially. [18:01:13] --- abo has become available [18:01:33] The decision isn't going to rest on the availability of the PR macro. I think there are other things you're worried about. We should talk about those bits and not worry about how we're going to get the PR* macro defined. [18:01:38] That's an easily-solved problem if we have it. [18:02:13] Then, when we do encounter a platform that fails to have the standard features we are using, we either provide our own implementation or a suitable workaround, as appropriate. 
For example, if we discover that Windows has a printf implementation that supports printing 64-bit integers with %I64x, but does not define PRIx64, then we _define PRIx64_. This is much saner than using %I64x everywhere and having our own printf to support it. [18:02:22] I'd say that we have the following arguing against use of a standard printf: [18:02:34] %I is cleaner and produces simpler code than formatting the address separately. [18:02:39] similarly, if there is no asprintf, it's easy to build one from vsnprintf [18:02:51] And using the same format specifier everywhere is nicer than using string concatenation to build the format string. [18:03:01] My argument in favor of using the system printf is: [18:03:16] plus we may not be able to assume %p [18:03:21] %I isn't going to work for IPv6 anyway and adding another format specifier for IPv6 seems like the wrong direction for me. [18:03:22] not all snprintf's provide sane return values [18:03:26] plus snprintf may be non-working [18:03:41] but those are things we can test for at configure time, like sane people [18:03:45] System snprintf is likely to be more efficient if we ever care. [18:03:48] --- haba has left [18:03:56] (We probably don't usually.) [18:04:12] > System snprintf is likely to be more efficient only if ours sucks, and if we care we probably shouldn't be using sprintf [18:04:12] System snprintf means we don't have the additional code to maintain. [18:04:22] --- haba has become available [18:04:31] I don't actually agree with that entirely. But it's a minor point. [18:04:48] I'm pretty sure %p will work, except in kernel code. [18:04:57] > don't have the additional code to maintain. only true once we reach the point where it's good enough everywhere. do you actually think we're close to that? particularly wrt snprintf? [18:05:01] Probing for sane return values in snprintf is something I've been doing in all of my packages for years. [18:05:07] yes.
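For reference, the "easy to build one from vsnprintf" remark at 18:02:39 might look roughly like the sketch below. It assumes a vsnprintf with the C99 return value (the length that would have been written), which is exactly the "sane return values" property the discussion says should be probed at configure time. The names example_vasprintf and example_asprintf are hypothetical; this is not the implementation in the OpenAFS tree.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate a buffer large enough for the formatted result and store it
 * in *out.  Returns the string length, or -1 on error with *out NULL.
 * Assumes vsnprintf(NULL, 0, ...) returns the required length (C99).
 */
static int
example_vasprintf(char **out, const char *fmt, va_list args)
{
    va_list copy;
    int needed, written;

    *out = NULL;

    /* First pass: measure only.  Use a copy so args stays usable. */
    va_copy(copy, args);
    needed = vsnprintf(NULL, 0, fmt, copy);
    va_end(copy);
    if (needed < 0)
        return -1;

    *out = malloc((size_t) needed + 1);
    if (*out == NULL)
        return -1;

    /* Second pass: format into the exactly sized heap buffer. */
    written = vsnprintf(*out, (size_t) needed + 1, fmt, args);
    if (written < 0 || written > needed) {
        free(*out);
        *out = NULL;
        return -1;
    }
    return written;
}

/* Variadic front end; the caller frees *out. */
static int
example_asprintf(char **out, const char *fmt, ...)
{
    va_list args;
    int status;

    va_start(args, fmt);
    status = example_vasprintf(out, fmt, args);
    va_end(args);
    return status;
}
```

On a platform whose vsnprintf returns -1 on truncation instead of the needed length, this sketch would fail cleanly rather than overflow, but it would not be usable; that case is where a replacement snprintf is still needed.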
[18:05:08] --- deason has left [18:05:19] I don't build my snprintf on any platform except Solaris 9 and earlier these days. [18:05:30] The native one is fine on all Linux, all *BSD, Solaris 10, etc. [18:05:53] solaris 9 is still a significant platform. also, what other platforms do you actually support? what about AIX, HPUX, Irix, etc [18:06:09] > all *BSD for a suitably narrow definition of "all" [18:07:03] Russ, you just started to convert to using strlcpy and strlcat. we do not have those functions on all platforms either. [18:07:06] * Russ has not personally checked those recently -- AIX needed it previously because they didn't prototype their snprintf. [18:07:14] Jeff: Which is why we build our own if we have to. [18:07:16] --- abo has left [18:07:24] That's fine. [18:07:28] However, those functions are trivial. [18:07:30] but if we have to build on any platform we have to maintain it [18:07:36] snprintf actually requires work to maintain. [18:07:51] strlcat and strlcpy actually don't. [18:08:10] --- abo has become available [18:08:41] --- deason has become available [18:08:49] Jeff, if we start maintaining our own versions of the entire C standard library, our binaries are going to be way bigger. Demand-paged shared libraries are another argument for using the system implementations. [18:08:54] I agree that we're going to have to maintain snprintf for a while yet, but I think the end is pretty clearly in sight. I stopped maintaining my own strerror, for instance. I think snprintf is similar. [18:09:18] It may be another five years, but I can see it happening. [18:09:42] And Russ is right about maintainability. A printf formatter is actually a very complicated and not terribly easy-to-understand piece of code, if you want to get it right. I'll bet the one we have gets something wrong. [18:09:51] And long before we stop needing it entirely, we stop caring about its corner cases, since bugs on ancient Solaris aren't as exciting.
[18:10:05] Solaris 9 problems are still exciting now. [18:10:10] In three years? Not so much. [18:10:57] Just about every snprintf replacement gets something wrong. [18:11:10] I have a new version someone sent me for mine that fixes a ton of floating point problems that I've not yet looked at, for instance. [18:12:09] Well, many replacement formatters punt when it comes to floating point and just call the system one to handle that. Which is fine, as long as the system one handles precision correctly. [18:12:48] And never overflow the size of your static buffer. [18:13:06] you might want to run the test application I wrote for our snprintf against your implementation to see what you got wrong. [18:13:14] Anyway, even apart from this debate, even if we keep afs_snprintf, I don't think using %I or other custom formatting attributes is a good idea. [18:13:23] it's in the util/test directory [18:13:28] gcc format checking is a really good idea. [18:13:41] And we should be enabling it every place we use any *printf variant. [18:14:12] --- abo has left [18:14:41] --- agoode has left [18:14:51] --- agoode has become available [18:14:54] Jeff: Yeah, I have a similar test suite for my replacement. But I'm actually not looking at the new version mostly because I'm hoping it can largely go away. [18:14:59] --- abo has become available [18:15:14] (The new version someone sent me for mine, I mean, not the AFS one.) [18:16:06] I don't really care about %I. I implemented in the new routines because we had it in the old one and I didn't want to break anything. [18:16:20] The old routine failed more than half the tests [18:16:21] That doesn't look like a very complete test. It doesn't test zero-padding, or explicit-plus, or precision as applied to integers, or anything about strings. [18:16:23] Anyway, I just want to sing the praises of gcc format checking. There are some warnings that mostly find places where you need to avoid warnings, and there are warnings that find bugs.
gcc format checking is definitely the latter category. It's found a ton of really major bugs in my code. [18:16:50] Jeff, feel free to add additional tests. [18:17:44] The float/double part looks not too bad, though it could test more flags. Oh, and it doesn't test * or $ behavior, either, though I don't know if we use those in OpenAFS ($ is really valuable for translations) [18:17:45] Without %I, the code is more annoying, sadly. [18:18:17] In practice, you need a static buffer of INET6_ADDRLEN+1 in local scope and then call a function to format the IP address into it before calling the *printf function, passing the results as a string. [18:18:47] Such a function is available in rra-c-util, FWIW. [18:19:01] Which has a test suite and works for IPv4 and IPv6. [18:19:57] I think there are enough cases where it would work to justify having a formatting function that can use a static buffer, but I'm not sure it's a good idea, because if we have it, people will misuse it. [18:20:24] The function takes a buffer and a length and does length checking, of course. [18:20:38] OTOH, a function that took a char * and returned it might produce reasonably readable code. [18:21:01] --- agoode has left [18:21:13] --- agoode has become available [18:21:23] But isn't thread-safe. [18:21:30] Oh, wait. [18:21:33] Yes, I see what you mean. [18:21:39] Return the buffer. [18:21:58] You still need to declare the static buffer, but it moves the call site slightly. [18:22:14] The version I have also supports IPv6 mapped addresses, BTW. [18:22:31] It's built on inet_ntop, for which I have a replacement if the system doesn't provide one. [18:22:35] Also with a test suite. [18:24:34] I have replacement versions of all the key IPv6 API functions, since how I write network code now is to use the IPv6 functions everywhere and provide IPv4-only replacements on those platforms that don't have them. 
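A minimal sketch of the alternative to %I described at 18:18:17 through 18:22:31: the caller declares a local buffer, and a helper built on inet_ntop formats the address into it and returns the buffer so it can be passed straight to a standard %s. This is not the rra-c-util function mentioned above; example_fmt_addr is a hypothetical name, and the sketch assumes a POSIX system that provides inet_ntop. Note that the POSIX constant is INET6_ADDRSTRLEN, which already includes room for the terminating NUL.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Format the address in sa (IPv4 or IPv6) into the caller's buffer.
 * Returns buf on success so the call can sit inside a printf argument
 * list, or NULL if the family is unsupported or the buffer is too small.
 * Hypothetical helper for illustration only.
 */
static const char *
example_fmt_addr(const struct sockaddr *sa, char *buf, size_t len)
{
    const void *addr;

    if (sa->sa_family == AF_INET)
        addr = &((const struct sockaddr_in *) sa)->sin_addr;
    else if (sa->sa_family == AF_INET6)
        addr = &((const struct sockaddr_in6 *) sa)->sin6_addr;
    else
        return NULL;

    /* inet_ntop does the length checking and returns buf on success. */
    return inet_ntop(sa->sa_family, addr, buf, (socklen_t) len);
}
```

A call site would declare char buf[INET6_ADDRSTRLEN]; and pass the result to an ordinary %s after checking it for NULL. Because the buffer belongs to the caller, the helper stays thread-safe, which addresses the concern raised at 18:21:23.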
[18:25:31] BTW, we should find the functions that take printf-style arguments in the tree and add the gcc __attribute__ markers to enable format checking where we haven't already, while I'm thinking about format checking. I suspect we have a few. [18:25:45] Yeah. [18:33:16] Hm I must not be setting up this test suite properly. [18:33:22] --- mmeffie has left [18:33:27] which? [18:33:29] When I run my test suite against the AFS snprintf, it fails all the floating point tests saying that the value is 0. [18:34:00] can you point me at it? [18:35:28] http://git.eyrie.org/?p=devel/rra-c-util.git;a=blob;f=tests/portable/snprintf-t.c;h=010842ed16f8d2fff51c0a614b3e770fdbc28275;hb=a0d56c9b7c2fcbea51c344b72cebb09ed4ae25db hacked to call afs_snprintf and afs_vsnprintf. But there has to be something wrong here -- this is too simple of a failure. [18:35:36] Everything but the %f tests works fine. [18:35:45] This is pretty clearly a test suite failure of some kind. [18:36:13] The problem with this is that you need libtap to build and run that test case, so it's hard for you to duplicate what I'm doing. [18:36:49] Your test suite is more thorough than mine at present. I just wanted to check to see if I was testing anything that you weren't, and running my test seemed like an easy way to do that. [18:37:25] ok [18:41:20] could someone please verify http://gerrit.openafs.org/77 on Unix [18:41:47] the only relevant change is the one to src/rx/rx_globals.h to remove the definition of EXT2 [18:42:24] Yup, doing so now. [18:42:29] thanks [18:45:54] It's lovely to be able to do a make -j6 [18:46:20] --- dwbotsch has left [18:48:39] Hm. [18:48:43] --- edgester has left [18:48:48] Running make and then immediately running make again rebuilds some stuff. [18:48:59] I don't think that's related to this change -- in fact, I'm sure it's not. [18:49:02] But it's strage. [18:49:04] strange. [18:49:56] The main code I use at work does that, too. But it is broken in many other ways ... 
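For readers unfamiliar with the gcc __attribute__ markers suggested at 18:25:31, here is a small sketch of how a printf-style wrapper can be annotated so that gcc checks its arguments against the format string. EXAMPLE_PRINTF_LIKE and example_log are hypothetical names chosen for illustration, not identifiers from the OpenAFS tree.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Expand to gcc's format attribute where available and to nothing
 * elsewhere.  fmt_idx is the 1-based position of the format string,
 * first_arg the position of the first variadic argument.
 */
#if defined(__GNUC__)
# define EXAMPLE_PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((__format__(__printf__, fmt_idx, first_arg)))
#else
# define EXAMPLE_PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* The attribute goes on the declaration: format is arg 3, varargs start at 4. */
static int example_log(char *buf, size_t len, const char *fmt, ...)
    EXAMPLE_PRINTF_LIKE(3, 4);

/* Thin wrapper over vsnprintf; returns whatever vsnprintf returns. */
static int
example_log(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return n;
}
```

With the marker in place, a call such as example_log(buf, sizeof(buf), "%s", 42) draws a -Wformat warning at compile time. The same check would reject a custom conversion like %I, which is the tension discussed earlier in the log.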
[18:50:17] Jeff, do you need a code review as well, or just the verification? [18:50:52] the only source code change of note is that one file [18:51:08] it's a handful of lines. please review as it will take 10 seconds [18:51:43] the windows build changes I have verified by building [18:52:28] I have asked Asanka to look it over [18:54:39] Okay, looks good to me. [18:59:29] thanks [19:18:29] > If Microsoft's compiler is not C99, then we either find a compiler that is and require that people use that [19:18:49] that statement comes with an explicit "and i volunteer to do the work" or it's basically something you never said [19:22:10] Yeah, we'd take the workaround approach. [20:09:03] --- mdionne has left [20:40:46] --- cclausen has left [20:43:35] --- dev-zero@jabber.org has become available [21:30:44] --- Russ has left: Disconnected [21:33:12] --- deason has left [21:58:06] > Yeah, we'd take the workaround approach. Yes, probably. OTOH, while I know Microsoft has been making compilers since before 1999 and thus must have released some compilers that are not C99-compliant, I'd be surprised to discover they have no C99-compliant compiler today. [22:48:55] --- Russ has become available [22:56:31] --- dev-zero@jabber.org has left: Replaced by new connection [22:56:32] --- dev-zero@jabber.org has become available [23:13:22] --- reuteras has become available [23:16:39] --- dev-zero@jabber.org has left [23:44:54] --- dev-zero@jabber.org has become available