[00:32:51] --- Russ has become available
[00:51:44] --- reuteras has become available
[02:42:34] --- Russ has left: Disconnected
[03:28:30] --- ksumner has left
[03:30:20] --- ksumner has become available
[04:16:28] --- Simon Wilkinson has become available
[04:33:51] --- Simon Wilkinson has left
[04:40:30] --- reuteras has left
[04:49:18] --- meffie has become available
[05:53:15] --- Simon Wilkinson has become available
[06:01:49] --- Simon Wilkinson has left
[06:54:14] --- haba has become available
[07:24:57] --- deason has become available
[07:34:02] --- haba has left
[07:34:38] --- haba has become available
[07:47:37] --- Simon Wilkinson has become available
[08:29:15] --- haba has left
[09:34:19] --- Simon Wilkinson has left
[09:49:17] --- rra has become available
[09:53:02] --- meffie has left
[09:58:41] --- andersk has left
[10:12:21] --- andersk has become available
[10:42:05] --- meffie has become available
[11:23:03] --- rra has left: Disconnected
[11:24:50] --- Simon Wilkinson has become available
[11:31:06] --- rra has become available
[12:11:10] --- Jeffrey Altman has become available
[12:11:29] --- Jeffrey Altman has left
[12:35:10] --- meffie has left
[12:46:59] --- Simon Wilkinson has left
[12:47:00] --- Simon Wilkinson has become available
[12:58:29] --- edgester has become available
[13:01:33] FYI, I had to kick buildbot this morning because it stopped accepting changes. Some new changes have been missed because my mechanism for injecting change is kludgy and blocks new changes while the injection is happening. sorry
[13:02:13] I'm inkecting missed changes by hand when buildbot goes idle
[13:02:21] er, injecting
[13:02:21] --- meffie has become available
[13:02:34] --- edgester has left
[13:02:49] What is the mechanism for injecting changes?
[14:05:20] --- ksumner has left
[14:06:37] --- ksumner has become available
[14:07:09] --- Simon Wilkinson has left
[14:19:53] --- edgester has become available
[14:23:22] The mechanism is to ssh to the buildbot master, run "~/bin/fake-unverified-stream > ~/fake-gerrit-events/1.change", then kill one of the "ssh gerrit-prod" processes, rerun "~/bin/fake-unverified-stream > ~/fake-gerrit-events/1.change" and kill the other "ssh gerrit-prod" process that was running (not the new one)
[14:24:53] I set up the buildmaster to provide a fake gerrit ssh interface, which cats the files from the fake-gerrit-events folder, then does the real ssh to gerrit
[14:27:42] the problems are:
* the setup typically only picks one change out of the file
* there are two ssh connections to gerrit, each one cleans out the fake events folder
* the fake-unverified-stream script was busted with the gerrit upgrade. I made it work on my ubuntu box, so I just copy the output to the buildmaster
* there is no mechanism to check that events haven't already been seen, so I tend to wait until all of the builds are done
[14:28:44] FYI, this must all be done as the buildmaster user
[14:46:15] Good news: adding vnode interlocks around VREFCOUNT prevents this race that causes corruption. Bad news: I used vgonel() which is no longer an exported interface. Also, kib thinks the routine is wrong in other ways, and questions the need for a filesystem to manage this in the first place.
[14:48:36] well, you could do it like macos does it
[15:05:24] Darwin manages its own VLRU queue?
[15:17:05] the darwin port does a better job of dealing with the OS's ideas of how vnode referencing should be done
[15:37:43] --- Simon Wilkinson has become available
[15:55:24] --- deason has left
[16:28:55] --- deason has become available
[16:50:24] --- Simon Wilkinson has left
[17:36:04] --- meffie has left
[17:36:09] --- rra has left: Disconnected
[17:54:03] --- Russ has become available
[20:25:59] Whoa.
lock order reversal:
 1st 0xffffff8000e2cb80 rx_freePktQ_lock (rx_freePktQ_lock) @ /usr/ports/net/openafs-devel/work/openafs/src/rx/rx_packet.c:1346
 2nd 0xffffff8000e2cd00 rx_refcnt_mutex (rx_refcnt_mutex) @ /usr/ports/net/openafs-devel/work/openafs/src/rx/rx_packet.c:1376
That's a new one, I think.
[20:29:16] quite possibly
[20:33:13] --- edgester has left
[20:37:32] maybe I'm blind but I don't see it
[20:38:47] is there a place where rx_refcnt_mutex is held before the rx_freePktQ_lock is obtained?
[20:39:17] how recent of a tree, ben?
[20:41:15] --- ezyang@mit.edu/barnowl has left
[20:42:01] --- ezyang@mit.edu/barnowl has become available
[20:44:26] git master fetched, oh, an hour ago, plus #define VREFCOUNT(v) vrefcnt(ATOV(v))
[20:47:25] found it.
[20:47:55] Unfortunately, I don't have a trace because that routine gets confused on the call lock/glock LOR of RT 127440 and spews a kernel page's worth of _end()s which is very slow on the serial console.
[20:49:04] well, it's going to be rxi_FreeCall's call to rxi_ResetCall sending a delayed ack event
[20:58:27] er, no. but yes, in ResetCall.
[22:08:42] --- deason has left
[22:33:15] --- jaltman has left: Replaced by new connection
[22:33:16] --- jaltman has become available
[23:54:14] --- reuteras has become available
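
For readers of this log, the "lock order reversal" report at 20:25:59 and the 20:38:47 question (is rx_refcnt_mutex ever held before rx_freePktQ_lock?) can be illustrated with a toy model. The sketch below is not FreeBSD's lock checker and not OpenAFS code: the `LockOrderChecker` class and `demo` function are hypothetical, and "Path A" is an assumed code path, since the log does not show where the opposite order is taken. It only shows why a checker complains once it has seen the same two locks taken in both orders.

```python
class LockOrderChecker:
    """Toy lock-order checker (hypothetical, for illustration only)."""

    def __init__(self):
        self.held = []       # locks currently held, in acquisition order
        self.order = set()   # (earlier, later) pairs observed so far
        self.reversals = []  # (1st, 2nd) pairs that contradict a recorded order

    def acquire(self, name):
        for h in self.held:
            if (name, h) in self.order:
                # We previously saw `name` taken before `h`; now it is the
                # other way round, which is a potential deadlock.
                self.reversals.append((h, name))
            else:
                self.order.add((h, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def demo():
    c = LockOrderChecker()
    # Path A (assumed; not confirmed by the log): refcount mutex first.
    c.acquire("rx_refcnt_mutex")
    c.acquire("rx_freePktQ_lock")
    c.release("rx_freePktQ_lock")
    c.release("rx_refcnt_mutex")
    # Path B (the order shown in the report): free packet queue lock first.
    c.acquire("rx_freePktQ_lock")
    c.acquire("rx_refcnt_mutex")
    c.release("rx_refcnt_mutex")
    c.release("rx_freePktQ_lock")
    return c.reversals


if __name__ == "__main__":
    for first, second in demo():
        print("lock order reversal: 1st", first, "2nd", second)
```

In this model the report fires on the second path only because the first path established the opposite order earlier, which is why finding that earlier path (as asked at 20:38:47) is the key step in diagnosing such a report.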