[02:24:19] --- kula has left
[07:23:15] --- deason has become available
[09:33:49] --- steven.jenkins has become available
[09:51:02] --- Jeffrey Altman has become available
[10:07:18] --- Russ has become available
[10:36:12] --- Russ has left: Disconnected
[10:50:29] --- jaltman has left: Disconnected
[10:58:01] --- rra has become available
[11:31:10] shadow: Did your reaction when I asked about a 1.5.76 release announcement, beyond the "we forgot about that part," mean that you or someone was going to work on that, that you were looking for people to work on that, that we were skipping it, or just that you'd not thought about it yet?
[13:09:40] --- jaltman has become available
[13:23:05] I'm fairly sure it means that it was forgotten
[13:24:28] 1.5.77 is in the works due to a bug introduced with the PMTU discovery code that bit me hard. See http://gerrit.openafs.org/#change,2664
[13:28:24] is pthreaded ubik supposedly in master?
[13:33:38] pthreaded ubik has been in master for a long time; it has been supposedly fixed more recently
[13:34:13] Well, I'll have to pull, but what I have now is clearly not thread-safe
[13:34:27] For example, there is no locking on ubik_epoch
[13:40:52] does it ever change?
[13:41:18] I was a bit worried about the locking of writeTidCounter, actually, when looking at this stuff recently...
[13:41:40] Yes, it changes. writeTidCounter (and tidCounter) are also problems.
[13:41:43] I don't doubt there are other issues; I've just been kept busy by the issues people have actually hit so far
[13:42:59] --- Jeffrey Altman has left
[13:45:23] Really, someone needs to go over it with a fine-toothed comb and carefully consider not just locking to protect access to individual variables, but also with respect to external state and atomicity of complex operations. As with much of the CM code, ubik is fairly dependent on the fact that things can't be interrupted unless they yield.
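To illustrate the concern about ubik_epoch and writeTidCounter, here is a minimal sketch of a transaction-id allocator whose epoch read and counter increment happen as one atomic step under a pthread mutex. Every name in it (tid_state, next_write_tid, run_workers) is hypothetical and stands in for the real variables; this is not the OpenAFS code, only an assumption about what protecting such state could look like.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for shared state such as ubik_epoch and
 * writeTidCounter; these are NOT the actual OpenAFS declarations. */
struct tid_state {
    pthread_mutex_t lock;
    int32_t epoch;
    int32_t write_counter;
};

struct tid {
    int32_t epoch;
    int32_t counter;
};

/* Allocate the next write transaction id. The epoch read and the
 * counter increment share one critical section, so no other thread
 * can see or produce a half-updated pair. */
static struct tid next_write_tid(struct tid_state *s)
{
    struct tid t;
    pthread_mutex_lock(&s->lock);
    t.epoch = s->epoch;
    t.counter = ++s->write_counter;
    pthread_mutex_unlock(&s->lock);
    return t;
}

struct worker_arg {
    struct tid_state *state;
    int iters;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < a->iters; i++)
        (void)next_write_tid(a->state);
    return NULL;
}

/* Run nthreads workers (at most 16), each allocating iters ids, and
 * return the final counter value. */
static int32_t run_workers(int nthreads, int iters)
{
    struct tid_state s;
    struct worker_arg a;
    pthread_t th[16];

    assert(nthreads > 0 && nthreads <= 16);
    pthread_mutex_init(&s.lock, NULL);
    s.epoch = 1;
    s.write_counter = 0;
    a.state = &s;
    a.iters = iters;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&th[i], NULL, worker, &a);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(th[i], NULL);

    pthread_mutex_destroy(&s.lock);
    return s.write_counter;
}
```

Under LWP the bare increment would be safe because nothing else runs until the thread yields. Under pthreads, dropping the lock/unlock pair from next_write_tid would let increments be lost, and run_workers could return less than nthreads * iters.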
[13:45:29] Andrew, do you know if there was an analysis of the ubik implementation written up as part of the pthread modifications?
[13:46:04] (with specific attention to the details Jeff mentions)
[13:46:12] I don't think so, but I can look
[13:46:35] I ask because if there hasn't been, someone needs to do it
[13:46:41] jhutz: yes, but we do at least have the glock-ish DBHOLD/DBRELE
[13:51:18] That's not GLOCK-ish; like many of the locks in the CM, it's used to provide exclusion in places where it's necessary to block. Ubik in general assumes control won't transfer to another thread unless it explicitly blocks or yields.
[13:52:07] This is a hard task, much harder than, say, the volserver, which is a large part of why it wasn't done before now (that, and because it was never necessary for performance)
[13:53:57] it seems to be held any time we deal with the dbase... not that I've checked everywhere, which makes it not useful
[13:58:03] as an example of the type of error that might exist. If the ref count on an object drops to 0, the lwp code has the guarantee that the object cannot be garbage collected or recycled until the thread blocks or yields. That guarantee will not necessarily be true in a pthread world.
[13:59:48] you basically have to go through everything carefully, and ask "what is this expecting not to change because it doesn't block".
[14:01:32] You also have to recognize that things which seem not to require locking may still require correct use of memory barriers, which in pthreads is most easily achieved by correct application of locks.
[14:02:01] yeah, I know; I've done that before and it's difficult... I don't think anyone's done it with pthreaded ubik (or at least, not "carefully" enough for me)
[14:02:18] The good news is that fine-grained locking is probably not necessary, so as long as the DBHOLD/DBRELE lock is a pthread_mutex, you can just use it in more places.
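The ref-count hazard and the "use the one coarse lock in more places" suggestion can be sketched as follows. This is only an illustration under stated assumptions: db_hold and db_rele are hypothetical wrappers around a single pthread_mutex, loosely analogous to DBHOLD/DBRELE (whose real definitions may differ), and struct obj, obj_alloc and obj_put are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical ref-counted object; refcount == 0 means the slot is
 * free and may be recycled by obj_alloc. */
struct obj {
    int refcount;
    int value;
};

struct db {
    pthread_mutex_t lock;          /* single coarse lock for the database */
    struct obj slots[NSLOTS];
};

static void db_init(struct db *d)
{
    pthread_mutex_init(&d->lock, NULL);
    for (size_t i = 0; i < NSLOTS; i++) {
        d->slots[i].refcount = 0;
        d->slots[i].value = 0;
    }
}

/* Hypothetical analogues of DBHOLD/DBRELE. */
static void db_hold(struct db *d) { pthread_mutex_lock(&d->lock); }
static void db_rele(struct db *d) { pthread_mutex_unlock(&d->lock); }

/* Claim the first free slot, or return NULL if all are in use. */
static struct obj *obj_alloc(struct db *d, int value)
{
    struct obj *o = NULL;
    db_hold(d);
    for (size_t i = 0; i < NSLOTS; i++) {
        if (d->slots[i].refcount == 0) {
            o = &d->slots[i];
            o->refcount = 1;
            o->value = value;
            break;
        }
    }
    db_rele(d);
    return o;
}

/* Drop a reference and return the object's value. The final read is
 * done in the same critical section as the decrement, so a concurrent
 * obj_alloc cannot recycle the slot between the two steps. */
static int obj_put(struct db *d, struct obj *o)
{
    int v;
    db_hold(d);
    assert(o->refcount > 0);
    o->refcount--;
    v = o->value;
    db_rele(d);
    return v;
}
```

With LWP, code could decrement the count and keep reading the object until its next yield. Here the same assumption is made explicit by keeping both steps inside one hold/rele pair, and the mutex also supplies the memory barriers mentioned above.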
[14:02:45] Really, we'd be better off not doing pthreaded ubik at all, except we really want to get rid of lwp
[14:04:36] speaking of that... is there a timeline for getting rid of the lwp fileserver/volserver?
[14:05:16] lwp fileserver can't do rxgk
[14:06:50] 2.0 or sometime later is a strong likelihood. However, there has been no target set
[14:09:33] Can anyone think of a platform on which you don't have to go out of your way to get the lwp fileserver?
[14:10:29] ... that we still care about?
[14:12:01] I think "old openbsd", because old openbsd pthreads don't work well... but the only reason I know that is because we only found that out recently
[14:12:06] or "did something about it" recently
[14:13:00] We need to be better about consistency. Platforms on which we build tviced, platforms on which we build tvolser, and platforms on which we don't install viced are all different. So on some platforms, we build and install tviced, while on others, we build and install viced, then build and install tviced, which makes the dependency analysis wrong on those platforms.
[15:16:45] --- deason has left
[17:25:53] --- mdionne has become available
[17:26:37] --- rra has left: Disconnected
[17:42:26] --- Russ has become available
[18:30:03] --- deason has become available
[18:56:14] --- steven.jenkins has left
[19:30:06] --- mdionne has left
[21:27:05] --- Russ has left: Replaced by new connection
[21:27:06] --- Russ has become available
[21:54:26] --- Russ has left: Disconnected
[23:10:13] --- deason has left