[00:22:35] --- dragos.tatulea has become available
[00:39:53] --- dev-zero@jabber.org has left
[01:07:34] --- dev-zero@jabber.org has become available
[01:07:38] --- dev-zero@jabber.org has left: offline
[03:21:51] --- dragos.tatulea has left
[03:22:06] --- dragos.tatulea has become available
[04:38:25] Where do we stand on the use of $< in make rules. There's a README at the top level which says not to use it, but I note that src/config/Makefile.config uses it for its .c.o: rule?
[04:52:20] --- kula has left
[05:01:52] --- Russ has become available
[05:15:16] > Where do we stand on the use of $< in make rules. There's a README
does it say why?
[05:15:46] oh, the devel readme. hang on
[05:24:56] i guess we can use $<
[05:25:01] looks like we do everywhere
[05:25:24] i can't find the rationale anyway
[05:25:30] might have been for sunos make
[05:31:58] --- Russ has left: Disconnected
[05:33:57] --- Jeffrey Altman has left: Replaced by new connection
[05:45:30] --- kula has become available
[05:53:30] --- Russ has become available
[05:54:00] > In particular, with a manual 'make', I am hitting
> aklog_main.c:181:2: error: #error "Must have either
> krb5_princ_size or krb5_principal_get_comp_string"
tell configure.
[05:54:11] You can only use $< portably in pattern rules.
[05:54:15] Otherwise, you lose with non-GNU make.
[05:54:37] Boo. I guess I should fix my -Werror patch, then.
[05:55:21] I don't think you lose with all non-GNU make, but as I recall you lose with makes you care about, like Solaris.
[05:55:46] Well, personally, I don't care about Solaris :)
[05:57:52] ah, only in *pattern* rules. wanna update the README.DEVEL?
[05:57:52] --- dwbotsch has left
[05:58:07] I'll patch README.DEVEL
[05:58:16] --- dwbotsch has become available
[05:58:37] excellent
[06:13:57] is there any reason that osconf.m4 hardcodes CC to 'cc', rather than using what configure gave it.
I can see why we might want to do so for kernel builds, but it's a real pain when trying to use a non-standard CC (either gcc-4.2, or a static analysis tool).
[06:14:52] on several platforms we hardcode to what we know works. no one ever hooked in the solaris and aix explicit "known working" tests, alas
[06:15:13] I wonder if we should shift the hardcoding of CC into platform specific sections, then.
[06:15:33] well, we should hook in those tests, update them, and otherwise not hardcode it
[06:16:05] Are those tests around anywhere?
[06:16:14] src/cf/*-cc.m4
[06:16:41] I will take a look ...
[06:17:29] actually, i know what to do, and i will need to look at aix 6 anyway
[06:17:42] Okay. I will leave it for you!
[06:21:42] That would be great -- that's another patch that we've had to carry around for Debian.
[06:44:34] * Russ finds it disturbing that when I run make, and then immediately run make again, a bunch of stuff builds again.
[06:45:05] Our make rules are exciting.
[06:48:51] One of the things that always rebuilds is the Linux kernel module.
[06:49:12] Also, is this stuff just ignorable?
[06:49:15] WARNING: "openafs_procfs" [/home/eagle/dvl/openafs/src/libafs/MODLOAD-2.6.26-2-686-bigmem-SP/afspag.ko] undefined!
WARNING: "afs_global_lock" [/home/eagle/dvl/openafs/src/libafs/MODLOAD-2.6.26-2-686-bigmem-SP/afspag.ko] undefined!
WARNING: "afs_global_owner" [/home/eagle/dvl/openafs/src/libafs/MODLOAD-2.6.26-2-686-bigmem-SP/afspag.ko] undefined!
[06:50:06] i've only had two cups of tea so far this morning, could someone provide a hint for how I actually build with RXDEBUG_PACKET?
[06:50:15] Russ: Yes.
[06:50:29] That's the afspag.ko module, which is something to do with the NFS translator.
[06:59:06] Oh, okay.
[06:59:35] --- deason has become available
[07:06:55] the make rules are sometimes dumb.
especially the weird stuff needed to build more than one kernel module as a result of one make target
[07:34:57] --- Russ has left: Replaced by new connection
[07:34:57] --- Russ has become available
[07:34:58] --- deason has left
[07:35:46] --- deason has become available
[07:36:36] --- deason has left
[07:36:38] --- deason has become available
[07:49:13] --- reuteras has left
[08:19:03] --- dragos.tatulea has left
[08:19:39] --- dragos.tatulea has become available
[08:20:29] --- Russ has left: Disconnected
[08:20:56] --- Russ has become available
[08:21:47] --- deason has left
[08:22:39] --- deason has become available
[09:10:29] --- mmeffie has become available
[09:10:37] hmmm: "Reviewed-on: http://gerrit.openafs.org/http://gerrit.openafs.org/152"
[09:10:42] that doesn't seem right
[09:11:11] Nope. It doesn't.
[09:11:20] looks like all of them since 25eb69a32aac30f50a33432664c287984f24162c are like that
[09:11:45] Which will be after the 2.0.16 upgrade, I suspect.
[09:12:28] between Wed Jul 1 13:20:14 2009 +0200 and Fri Jul 17 21:29:10 2009 -0400; upgrade was friday morning, right?
[09:12:37] Yes.
[09:13:04] I suspect the change that pulled out the leading "/Gerrit/" has had some side-effects.
[09:28:14] --- mmeffie has left
[09:32:18] --- mmeffie has become available
[09:49:43] --- mmeffie has left
[09:52:56] --- Russ has left: Disconnected
[09:54:39] --- cclausen has become available
[09:56:07] Well, I now have an OpenAFS tree that will build with -Werror enabled at the top level ...
[09:56:14] nice.
[09:56:23] i have a tree that hangs on unmount. wish i knew why
[09:56:48] I can't get it to behave badly on Linux.
[09:56:48] --- Russ has become available
[09:56:53] this is osx
[09:57:04] hangs at RxEvent...
[09:57:04] Let me prod it some more.
[09:58:43] i might know. hm.
hang on
[09:59:57] if so, i have no idea why this changed
[10:00:09] in any case, i will know in about 45 seconds :)
[10:00:46] --- dragos.tatulea has left
[10:02:39] --- dragos.tatulea has become available
[10:08:45] Was that it?
[10:10:24] nope. but i think it's a real bug
[10:11:36] As in, something that's existed for a while? Or is it something we've introduced since git?
[10:11:46] for a while
[10:12:35] crap
[10:12:42] 162, except i generated a merge commit
[10:12:43] ?
[10:14:06] Yeh. 162 seems reasonable. I'm amazed that hasn't bitten us before.
[10:14:17] Yeah, I was going to ask you about 161, since I thought you'd already committed that.
[10:14:53] gerrit isn't clever about using patch IDs with cherry picked changes.
[10:15:12] 161 has a different SHA1 from the version in the tree (because it has a different history) and so got submitted again.
[10:15:33] abandon 161
[10:16:08] * Russ dumped it.
[10:16:27] * Russ doesn't know enough about the code to be able to review 162.
[10:17:09] --- deason has left
[10:17:15] Wrt CC... The problem is, we want to use whatever CC the user said to use, but we don't want autoconf's assumption that gcc is always better than anything else.
[10:17:29] --- dragos.tatulea has left
[10:17:40] --- deason has become available
[10:17:49] Well, at the moment, we don't let the user pick, and we don't let autoconf pick. Seems like the worst of both worlds.
[10:18:34] we have a set of hardcoded choices, same as on day 1
[10:19:04] I don't think 162 is correct.
[10:19:14] > One of the things that always rebuilds is the Linux kernel module.
It has to be that way, because we have to build the kernel module by invoking the kernel build system, which means it's a separate invocation of make. But under normal circumstances, it shouldn't actually compile everything again.
[10:19:19] explain?
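A rough sketch of the point made in the 10:19:14 message about the kernel module: the outer Makefile can only hand control to the kernel build system, so that target runs on every make, and it is the inner make that decides whether anything actually needs compiling. KERNEL_BUILD and MODDIR are placeholder names invented for this sketch, not variables from the OpenAFS tree, and the syntax assumes GNU make (which the Linux kernel build requires anyway).

```make
# Hypothetical names: KERNEL_BUILD and MODDIR are placeholders for
# illustration only; they are not taken from the OpenAFS Makefiles.
KERNEL_BUILD ?= /lib/modules/$(shell uname -r)/build
MODDIR       ?= $(CURDIR)/MODLOAD

# The outer make cannot see the kernel build system's dependency graph,
# so the target is declared phony and its recipe runs every time.
.PHONY: kmod
kmod:
	# Standard external-module invocation of the kernel build system.
	# The inner make does its own timestamp checks, so an up-to-date
	# module should not be recompiled from scratch.
	$(MAKE) -C $(KERNEL_BUILD) M=$(MODDIR) modules
```

If the inner make does recompile everything on a second run, the problem would be in what gets regenerated before this step (for example, files rewritten with new timestamps), not in the delegation itself.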
[10:19:31] --- deason has left
[10:19:34] --- deason has become available
[10:19:40] --- haba has left
[10:19:56] --- stevenjenkins has left
[10:20:00] --- abo has left
[10:20:14] note that it's analogous to what afs_AFSDBHandler does, in the case where we aren't waking it up
[10:20:24] Actually, no. Speaking nonsense.
[10:20:34] --- abo has become available
[10:20:40] I was confused by what the caller was doing in afs_call.c
[10:21:01] now, normally, we have an afsdb child, so we never trip on this
[10:21:07] --- stevenjenkins has become available
[10:21:26] Yes, but when there's no AFSDB child, we ask RXEVENT to stop, but never bother telling it that we've done so.
[10:21:35] So, it's sleeping, we're sleeping, and not much gets done.
[10:21:42] right. hence the bug
[10:21:50] but it's not the bug i see
[10:21:51] Yup. I'm happy that change is correct.
[10:21:56] Bah!
[10:22:30] --- haba has become available
[10:23:25] > afspag.ko module
... is for (optional) use on clients of the NFS translator, to manage PAG's locally and transparently transform AFS system calls into rmtsys RPC's. As a gatekeeper, you ought to be concerned if something was recently committed that makes it not build.
[10:24:32] That's not not building, it's failing to find some symbols. And it's been like that for as long as I can remember.
[10:24:49] i could have a meta argument about linux kernel changes and the nfs translator. it's not worth it. send patches. nothing was committed to break it.
[10:25:21] (actually, does it load alongside the AFS module, and use its symbols to satisfy the things it's missing?)
[10:27:41] That's what it looks like.
[10:29:07] * Russ has verified that 158 (vlprocs safety checks) builds and the code looks right to me. Do we want to do more testing and verification than that, or should I push it?
[10:29:54] Oh, I agree the current compiler situation is suboptimal. We pick by patching osconf.m4 :-)
[10:30:00] is it the one which potentially changes allocation?
[10:30:09] i wish jeff would read backwards.
[10:30:32] No, it does not load alongside libafs.
[10:31:39] I wish that jabber had instances
[10:31:43] me too
[10:32:00] It has threads, it's just that nobody has done client support.
[10:32:31] i was unable to figure it out. there's convo in the zlog about it. i came to the "this doesn't work" conclusion
[10:33:17] shadow: Yes, it's the one that scans for allocations until it can find a free block.
[10:34:35] yeah, i should try it on a real server (or someone should)
[10:34:46] it doesn't update maxvolid on manually-specified volids, though
[10:36:55] > [trouble making aklog]
Oh, crap, this is going to be MIT vs. heimdal, isn't it. (heimdal in /usr/lib and MIT in /usr/local/lib; configure was pointed at the latter)
[10:37:44] --- Jeffrey Altman has become available
[11:08:53] --- gendalia@gmail.com/owl32E20CFC has become available
[11:12:46] * Russ hehs -- I should have said something here about the vlserver warning fix when I started on it. :)
[11:13:03] The second change is to fix the comment, since looking further the rxkad_* function will always initialize I think.
[11:16:22] deason: https://review.source.android.com/#change,10742 will fix your Reviewed-On issue.
[11:16:33] I guess I could build a locally patched gerrit with it in ...
[11:24:35] --- dev-zero@jabber.org has become available
[11:29:13] Okay, so a 1.4 afsd, and a 1.5 kernel module don't play nicely with each other ...
[11:31:05] duh
[11:31:55] That's good to know -- I'll bump the versioning on the Debian packages when I package 1.5.
[11:32:25] I was just being lazy on my test machine. I think I knew that really.
[11:32:50] (If nothing else, Marc's new cache stuff changes the kernel/userland interface)
[11:39:02] --- mdionne has become available
[11:44:58] yes, with 1.5.60+ the protocol changes between afsd and the kernel module (we can't pass just inode numbers), so they need to be updated together
[11:47:30] afspag.ko: the problem is that it's tied to NONFSTRANS in the source and the NFS translator is disabled on recent kernels because of GPL-only symbols. so it's missing symbols from code that's ifdef'ed out
[11:49:52] > One of the things that always rebuilds is the Linux kernel module.
[11:50:20] mdionne: Oh, okay, that makes sense.
[11:50:38] Some of this is from the parallel make patches. I'm trying to find a way to fix it without breaking the parallel make.
[11:52:22] for afspag.ko one solution would be to not try to build it when we're not building the NFS translator
[11:52:38] that would be imo the right solution
[11:52:40] That would probably both make sense, and speed up the build
[11:52:44] since it's otherwise useless anyway
[11:55:59] Ok I'll see if there's a way to do this without too much pain.
[13:16:24] --- gendalia@gmail.com/owl32E20CFC has left
[13:21:44] --- gendalia@gmail.com/owl63CD805A has become available
[13:34:12] i presume either i did something wrong trying to build with RXDEBUG_PACKET or the perl module is doing something weird or some combination of both, since all I got with my most recent crash with the libraries built from git is "Fatal Rx error: rx packet not free"
[13:34:36] I had that error.
[13:34:46] yes. it's been haunting me recently.
[13:34:52] It happens when you call new repeatedly.
[13:35:15] i guess i should write a function like jeff's packet dumper for you
[13:35:19] You only get one instance of the perl class per invocation of your program.
[13:35:32] Otherwise it inits rx twice, and the world ends.
[13:35:43] rx_Init should be able to be called again
[13:36:24] that was my suspicion.
although i also thought that if i undef'ed the handle it got rid of that problem, if that was indeed the problem.
[13:36:30] Don't you need to clean up after yourself before you can do so?
[13:37:00] i have structured the program so that after i create any handle i do not create any other handle of any other type until i undef, which i believe is supposed to clean up.
[13:37:04] Certainly $boss had a problem with exactly these symptoms, which went away when he treated the AFS module as a singleton.
[13:37:51] well, the uniqname perl module, which comes from ... somewhere, which certainly doesn't call rx_Finalize, i treat by forking off a process to do its thing.
[13:39:04] well, rx_InitHost calls rx_Init, which has:
    LOCK_RX_INIT;
    if (rxinit_status == 0) {
        tmp_status = rxinit_status;
        UNLOCK_RX_INIT;
        return tmp_status; /* Already started; return previous error code. */
    }
[13:39:12] so one presumes it's safe
[13:40:08] --- stevenjenkins has left
[13:40:10] --- deason has left
[13:41:00] --- deason has become available
[13:44:07] --- stevenjenkins has become available
[13:46:10] I don't think init...finalize...init is guaranteed to work.
[13:46:27] It definitely did not in my experience.
[13:46:29] But multiple calls to rx_Init() definitely should work.
[13:47:26] yeah, finalize is something iirc safe to call only once
[13:47:34] we had this problem with andrew ftpd long ago
[13:48:50] Well, not only that, but you can't then rx_Init. We ought to fix that, but I don't know how we can, at least in an LWP program.
[13:49:24] I think the specific problem we saw was that init/finalize/init leaked file descriptors.
[13:49:27] And threads.
[13:49:34] esp. threads.
[13:50:01] --- deason has left
[13:50:25] In pthreads, it may leak threads. In LWP, it calls LWP_InitializeProcessSupport twice. Hm, but _that_ should be safe.
[13:50:28] --- deason has become available
[13:50:33] Well, anyway, don't do it.
:-)
[13:52:42] since this is something i call via remctl, leaking much of anything i don't care too much about, since it's going to do a few things and then go away. but it's obviously causing other problems.
[14:24:43] --- Russ has left: Disconnected
[14:33:04] On a machine running 1.4.10 on Linux 2.6.28.4, shortly after boot. Trying to read /afs/.cs.cmu.edu/service/etc/autoacct returns connection timeout. There was AFS traffic, and other successful AFS traffic before that, but never any aborts. fstrace shows the following: time 943.620191, pid 2367: Access vp 0x17cc0000 mode 0x40 len (0x0, 0x4800) time 943.620201, pid 2367: GetdCache vp 0x17cc0000 dcache 0x13daabb0 dcache low-version 0xcc, vcache low-version 0xcc time 943.620202, pid 2367: GetdCache tlen 0x4800 flags 0x1 abyte (0x0, 0x0) Position (0x0, 0x0) time 943.620222, pid 2367: Lookup adp 0x17cc0000 name .cs.cmu.edu fid (169:1.16777890.1), code=0 time 943.620234, pid 2367: RPC GetVolumeByName for 1 ( at 0x153619c2) time 943.620235, pid 2367: Analyze RPC op -1 conn 0x0 code 0xffffffff user 0x0 time 943.620245, pid 2367: ProcessFS vp 0x10bf9500 old len (0x0, 0x0) new len (0x0, 0x15)
[14:33:15] Argh. Let me try to format that again
[14:33:31] time 943.620191, pid 2367: Access vp 0x17cc0000 mode 0x40 len (0x0, 0x4800)
time 943.620201, pid 2367: GetdCache vp 0x17cc0000 dcache 0x13daabb0 dcache low-version 0xcc, vcache low-version 0xcc
time 943.620202, pid 2367: GetdCache tlen 0x4800 flags 0x1 abyte (0x0, 0x0) Position (0x0, 0x0)
time 943.620222, pid 2367: Lookup adp 0x17cc0000 name .cs.cmu.edu fid (169:1.16777890.1), code=0
time 943.620234, pid 2367: RPC GetVolumeByName for 1 ( at 0x153619c2)
time 943.620235, pid 2367: Analyze RPC op -1 conn 0x0 code 0xffffffff user 0x0
time 943.620245, pid 2367: ProcessFS vp 0x10bf9500 old len (0x0, 0x0) new len (0x0, 0x15)
[14:33:41] That's better.
Cell 169 is the dynroot
[14:34:06] So, um, we shouldn't be trying to call GetVolumeByName on the dynroot volume
[14:34:29] I have to go to dinner. Unless someone says this is a known problem, I guess I'll have to track it down after dinner.
[14:36:26] --- Russ has become available
[14:49:17] --- Russ has left: Replaced by new connection
[14:49:17] --- Russ has become available
[15:02:15] not known to me
[15:28:56] --- dev-zero@jabber.org has left
[15:30:14] openafs-stable-1_4_x now matches the last of CVS. We should pull up the RCSID and gitignore fixes (at least) before throwing it open for business.
[15:30:44] In fact, I'll stick versions of those into gerrit now.
[15:34:02] should we submit new gerrit changes specifically for 1.4 if we want something in there?
[15:34:07] (er, when it's 'open')
[15:44:37] --- deason has left
[15:50:07] --- deason has become available
[15:50:18] --- phalenor has left
[15:50:58] --- phalenor has become available
[15:54:19] --- Russ has left: Disconnected
[16:05:48] Hmmm. The old head 'rxgk' directory from 2004 seems to have snuck into 1.4.x. Shall I kill it?
[17:31:24] --- professorc has become available
[17:31:54] --- professorc has left
[17:36:41] the old rxgk directory has been there since 1.4 was cut. no point removing it now.
[17:37:28] andrew: you can certainly queue up change proposals for 1.4 although I'm hoping there will not need to be another one.
[17:53:33] --- matt has become available
[18:07:09] --- dlc has left
[18:16:29] --- matt has left
[19:31:22] --- mdionne has left
[19:43:38] --- cclausen has left
[19:57:15] --- Russ has become available
[20:40:48] --- gendalia@gmail.com/owl63CD805A has left
[20:41:09] --- shadow@gmail.com/owlBFB92CD4 has left
[20:47:35] --- summatusmentis has left
[20:50:33] --- summatusmentis has become available
[20:51:33] --- gendalia@gmail.com/owlD3D5926A has become available
[21:03:15] --- dev-zero@jabber.org has become available
[21:12:32] --- Russ has left: Disconnected
[21:17:08] --- dwbotsch has left
[21:17:52] --- dwbotsch has become available
[21:46:45] --- deason has left
[23:00:11] --- reuteras has become available
[23:10:46] --- dev-zero@jabber.org has left
[23:53:13] --- dev-zero@jabber.org has become available