[00:39:12] --- jaltman/FrogsLeap has left: Replaced by new connection
[00:39:13] --- jaltman/FrogsLeap has become available
[01:16:05] --- jaltman/FrogsLeap has left: Replaced by new connection
[01:16:06] --- jaltman/FrogsLeap has become available
[01:24:07] --- jaltman/FrogsLeap has left: Replaced by new connection
[01:24:08] --- jaltman/FrogsLeap has become available
[01:43:37] --- jaltman/FrogsLeap has left: Replaced by new connection
[01:43:38] --- jaltman/FrogsLeap has become available
[01:49:56] --- Russ has left: Disconnected
[01:52:55] --- jaltman/FrogsLeap has left: Replaced by new connection
[01:52:56] --- jaltman/FrogsLeap has become available
[01:58:56] --- jaltman/FrogsLeap has left: Replaced by new connection
[01:58:57] --- jaltman/FrogsLeap has become available
[02:14:57] --- jaltman/FrogsLeap has left: Replaced by new connection
[02:14:58] --- jaltman/FrogsLeap has become available
[02:43:02] --- jaltman/FrogsLeap has left: Replaced by new connection
[02:43:03] --- jaltman/FrogsLeap has become available
[11:24:57] NULL pointer dereference on startup at rx_event.c:288 that I'm pretty sure isn't the result of the patch I'm working on.
[11:25:33] --- Russ has become available
[11:25:51] eventSchedule is valid but its func member is zero.
[11:28:14] on master?
[11:28:22] or 1.6?
[11:28:30] Master.
[11:29:39] save what you can. I'm sure Simon will want to take a look
[11:31:47] I have a dump. But I don't think this has tested my afsi_SetServerIPRank tweaks yet :-/
[11:38:01] --- jaltman/FrogsLeap has left: Disconnected
[11:46:01] --- jaltman/FrogsLeap has become available
[11:50:34] Yeah, I would like to see.
[11:50:59] That suggests that FreeBSD isn't registering a function to call on event reschedule.
[11:53:31] Er, what in particular do you want to see? Just a backtrace?
[11:56:27] rxevent_post rx_event.c:288 rxi_ReapConnections rx.d:7102 rx_StartServer rx.c:952 afs_ResourceInit afs_init.c:557 afs_syscall_call afs_call.c:128
[11:59:13] I'd be very interested to know if rxevent_Init has been called
[12:00:33] Ah, so I should throw a printf or something in there and try again?
[12:01:04] I _think_ what we've probably got is a race around initialisation, but I'm not entirely sure. So yeah, a printf there would be handy.
[12:01:31] FWIW, my patch on top of a9682775f seems happy.
[12:02:58] Anyone know what Perl modules now need to be installed to build master? :(
[12:03:33] Just turn off FUSE and move on with life.
[12:03:44] Ah, ok.
[12:08:56] kaduk: Your problem is that scheduler.func == NULL ?
[12:09:38] eventSchedule.func , even
[12:09:59] --disable_fuse_client actually didn't avoid the swig errors.
[12:10:05] Simon: Looks like it.
[12:10:42] kaduk: In which case, I am very baffled, because rxevent_Init is called from rx_InitHost, which is called from afs_ResourceInit _before_ it calls rx_StartServer
[12:11:59] matt: Sorry, it looks like there isn't actually a way to disable swig if the configuration process finds swig in your path.
[12:12:18] That seems kind of poor ...
[12:12:39] Patches, as always, welcome :)
[12:12:55] Fair enough. Thanks for the pointer.
[12:19:24] Yup, sure looks like rxevent_Init got called.
[12:29:41] Do you want to ponder a bit more before I go visit bisect-land?
[12:31:51] I don't think there's any point in visiting bisect land. It's going to be the rxevent rewrite for rbtrees.
[12:32:42] Good point. And bisecting within that single commit is not really feasible.
[12:33:29] Can you check (with a printf or an assert) that rxevent_init isn't being called with a NULL second argument?
[12:39:00] Yeah. (It'll be a bit; I cleaned my tree in the interim.)
[12:39:40] My suspicion is that afs_ResourceInit is getting called from multiple threads, and we're in a race.
[12:47:43] Called with null argument.
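The check requested at 12:33:29 could look roughly like the sketch below. It is a stand-alone illustration, not OpenAFS code: `sketch_event_init`, `sketch_sched`, `sketch_sched_fn`, and the single-argument signature are hypothetical stand-ins for rxevent_Init and eventSchedule, whose real definitions are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the scheduler callback type. */
typedef void (*sketch_sched_fn)(void);

/* Hypothetical stand-in for the eventSchedule structure from the log. */
static struct {
    sketch_sched_fn func;
    int initialised;
} sketch_sched;

/*
 * Record the callback and report a NULL one.
 * Returns 0 if the callback is usable, -1 if it is NULL.
 * An assert(scheduler != NULL) here would stop at the same point
 * instead of just printing.
 */
static int
sketch_event_init(sketch_sched_fn scheduler)
{
    sketch_sched.func = scheduler;
    sketch_sched.initialised = 1;
    if (scheduler == NULL) {
        fprintf(stderr, "sketch_event_init: called with NULL scheduler\n");
        return -1;
    }
    return 0;
}

/* A do-nothing callback, only so the non-NULL path can be exercised. */
static void
sketch_noop(void)
{
}
```

Calling `sketch_event_init(NULL)` prints the diagnostic and returns -1, which matches the situation reported at 12:47:43: the init routine did run, but the callback it stored was NULL.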
[12:48:11] matt: You around?
[12:48:47] hi
[12:49:01] Hi. Just looking at 6204.
[12:49:19] I did a second upload for some of the howlers.
[12:49:33] The C unit tests look awesome, but just wondering why you went down that path, rather than the TAP based tests that are already in tests/opr/
[12:50:44] I have a lot of YFS stuff with CUnit, including ones for avl that exactly matched, so I wanted to reuse.
[12:51:16] Also, valgrind and timings tests make for directly comparable perf and displacement tests.
[12:51:36] Eg, rbtree is appx 15% larger than Solaris avl.
[12:52:19] (I could fix that, using marked pointer coloring.)
[12:52:21] It's not written for space efficiency. There are lots of hacks you can use to make rbtrees more space efficient, but they significantly (imo) hamper readability.
[12:53:18] I was just pointing out that it was easy to observe side by side. I think that change would be attractive, but I don't have any issue with what's there either.
[12:53:40] Yeah, I don't see any problem with using the bottom bit of the pointer to record the colour.
[12:53:43] Except maybe, the "gramps" pointer;)
[12:54:02] I don't want to remove the parent pointer (which is another optimisation), because it penalises removal operations.
[12:54:14] Sure.
[12:54:17] But yeah, there are definitely options.
[12:55:05] On the unit test front. I think we need to ask Russ and the rest of the gatekeepers about test suites. I know Russ has put a fair amount of effort into doing TAP in tree, and I don't know whether we want another unit test suite.
[12:55:21] It might be easier to split off the tests from the rest of the patch - I can +1 the rest pretty easily.
[12:55:37] I don't really care, but I think that would be silly.
[12:56:49] In fact, Russ already indicated he didn't have a problem unifying with CUnit tests somehow, it was discussed wrt locking tests.
[12:57:11] Ah, okay.
[12:57:23] (Forgot that myself)
[12:58:58] I don't have a problem with CUnit tests.
I would prefer that a module stick to one test suite. If we are importing external code (for example from Heimdal) I would like us to be able to import the associated tests and run them in our framework.
[13:00:35] Final question - what was the reasoning behind changing from OPENAFS_OPR_RBTREE_H to _OPR_RBTREE_H ?
[13:05:27] kaduk: Found your problem.
[13:05:47] If you aren't an RXK_TIMEDSLEEPENV rxi_ReScheduleEvents is #defined to 0.
[13:17:16] Bah. I really wish git wouldn't let you do a git checkout in a middle of a rebase...
[13:17:34] heh
[13:19:51] does the build system have any method of executing CUnit tests at the moment?
[13:20:06] No, there's no framework for running CUnit stuff.
[13:20:29] openafs-dirformat and locking branches do, yes.
[13:21:04] Is that in a patch that would be easy to pull into the OpenAFS tree?
[13:21:17] It would be easy to split off, sure.
[13:21:52] It would be good to get that in to the tree. I don't like having stuff in tree that isn't being built, because it has a tendency to rot.
[13:22:06] please pull up the framework so these tests can be added into it.
[13:22:15] I can do that.
[13:22:20] thank you
[13:22:41] kaduk: Could you try 6206. I think it should fix your problem
[13:23:08] correction: build yes, run automatically no
[13:23:18] I can look into it, though.
[13:24:37] We don't run the TAP tests automatically at the moment
[13:24:56] Ok, I'll start with splitting out what is done, then push for review.
[13:25:02] Great, thanks.
[13:34:01] cherry-picking 6206 ...
[13:35:59] It's worth having a look at whether you can turn on TIMEDSLEEP_ENV on your OS as well. RX will perform much better with it enabled.
[13:37:20] Yeah, I was just thinking "why don't I have that?"
[13:37:32] Anyway, the patch seems to make things happy enough.
[13:38:36] You don't have it because Derrick wrote it for Mac OS X, and only tested it there. He didn't want to turn it on for platforms it hadn't been tested on.
[17:54:45] --- matt has left
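The failure mode diagnosed at 13:05:47 can be reproduced in a few lines. This is a minimal sketch under stated assumptions: every `demo_` name and `DEMO_TIMEDSLEEP_ENV` are hypothetical stand-ins, and the guard in `demo_post` is only one possible way to avoid the crash, not necessarily what change 6206 does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type for the reschedule hook. */
typedef void (*demo_sched_fn)(void);

static int demo_reschedule_calls;

#ifdef DEMO_TIMEDSLEEP_ENV
/* With the feature enabled there is a real function to register. */
static void
demo_reschedule(void)
{
    demo_reschedule_calls++;
}
#else
/* Without it the name expands to 0, i.e. a null pointer constant. */
#define demo_reschedule 0
#endif

static demo_sched_fn demo_func;

/* Store whatever the caller passes, including a NULL from the macro. */
static void
demo_init(demo_sched_fn scheduler)
{
    demo_func = scheduler;
}

/*
 * Invoke the hook only if one was registered.
 * Returns 1 if the hook ran, 0 if it was skipped.
 * Calling demo_func() unconditionally here would dereference NULL
 * whenever DEMO_TIMEDSLEEP_ENV is not defined.
 */
static int
demo_post(void)
{
    if (demo_func == NULL)
        return 0;
    demo_func();
    return 1;
}
```

Built without `DEMO_TIMEDSLEEP_ENV`, `demo_init(demo_reschedule)` compiles cleanly yet stores NULL, which is why the init call looked correct in the source while the stored func member was zero at runtime.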