[00:37:10] --- kaj has left
[00:39:35] --- kaj has become available
[01:32:52] --- Russ has left: Disconnected
[01:41:20] --- Simon Wilkinson has become available
[01:43:15] <Simon Wilkinson> But &tbuf[CVBS] isn't the same as passing tbuf, is it? Surely it's the same as tbuf + CVBS, but without all of the perils of pointer arithmetic.
[01:56:34] --- Simon Wilkinson has left
[02:04:35] --- Simon Wilkinson has become available
[02:05:42] --- Simon Wilkinson has left
[02:24:45] --- Simon Wilkinson has become available
[02:47:29] --- Simon Wilkinson has left
[02:47:34] --- Simon Wilkinson has become available
[02:53:35] --- Simon Wilkinson has left
[02:57:49] --- Simon Wilkinson has become available
[03:07:59] --- Simon Wilkinson has left
[03:07:59] --- Simon Wilkinson has become available
[03:36:52] --- Simon Wilkinson has left
[03:36:52] --- Simon Wilkinson has become available
[03:41:42] --- Simon Wilkinson has left
[03:47:47] --- Simon Wilkinson has become available
[03:48:13] --- Simon Wilkinson has left
[03:58:37] --- Simon Wilkinson has become available
[04:26:25] --- Simon Wilkinson has left
[05:14:57] --- jaltman has left: Disconnected
[05:49:42] --- jaltman has become available
[06:02:28] --- reuteras has left
[06:17:04] <shadow@gmail.com/owl1EA1D463> > But &tbuf[CVBS] isn't the same as passing tbuf, is it? Surely it's
the same as tbuf + CVBS, but without all of the perils of pointer arithmetic.

given what the called function does (parsing right to left) it also is
correct
[06:26:31] --- jaltman has left: Disconnected
[06:29:35] --- rdw has become available
[06:53:03] --- tkeiser has become available
[07:01:00] --- tkeiser has left
[07:07:05] --- deason has become available
[07:38:32] --- tkeiser has become available
[07:44:33] --- tkeiser has left
[07:48:42] --- tkeiser has become available
[07:49:53] --- jaltman has become available
[07:52:11] --- Simon Wilkinson has become available
[07:54:44] --- tkeiser has left
[08:01:35] --- abo has become available
[08:06:32] --- matt has become available
[08:09:24] <kaduk@mit.edu/barnowl> > perils of pointer arithmetic

Er, what are those, again?  I thought the behaviour was pretty well-specified.
[08:10:08] <Simon Wilkinson> The dangers arise if you have casts in the same line.
[08:10:21] <Simon Wilkinson> For example (char*)tbuf + CVBS
[08:10:44] <Simon Wilkinson> But, generally, I find something like &a[b] much more readable than a + b
[08:13:20] <kaduk@mit.edu/barnowl> Also, Simon, do we have any idea how confident we are that xdrproc_t is correct
for FBSD?  I think I have seen stack corruption near calls to xdr_vector, and I
think that might be an explanation.
[08:13:54] <Simon Wilkinson> You should just use the AFS XDR throughout.
[08:14:02] <kaduk@mit.edu/barnowl> (I will try to test, but probably not until tonight.)
[08:14:55] <Simon Wilkinson> Try adding AFS_FBSD80_ENV to set of platforms for which we #define xdr_blah afs_xdr_blah
[08:15:22] <kaduk@mit.edu/barnowl> Fair enough.
[08:15:31] <Simon Wilkinson> I suspect we should actually do that globally.
[08:16:35] <kaduk@mit.edu/barnowl> Do you know offhand where that happens? 
[08:16:59] <Simon Wilkinson> src/rx/xdr.h
[08:17:25] <Simon Wilkinson> You also have to make sure that rx/xdr.h is what gets included, rather than a system XDR.
[08:17:58] <kaduk@mit.edu/barnowl> I ... don't think there's a system xdr?
(will check)
[08:19:14] <Simon Wilkinson> If there's no system XDR, then I can't see where your xdrproc_t problem would come from.
[08:19:25] <Simon Wilkinson> Oh, unless your kernel is built with a wierd regparam
[08:20:41] <kaduk@mit.edu/barnowl> There is a system xdr, it seems.  Though I don't think we could be including its
header.
[08:21:22] --- mmeffie has become available
[08:22:14] <Simon Wilkinson> It's worth checking that you aren't using a register calling convention when you build the kernel, otherwise the xdrproc_t hack won't work properly.
[08:22:42] <shadow@gmail.com/owl1EA1D463> that's the linux i386 issue i was thinking about yesterday
[08:45:09] --- kaj has left
[08:52:29] --- rdw has left
[08:53:39] <kaduk@mit.edu/barnowl> Yeah, that's most of what I was worried about.
[08:53:55] <kaduk@mit.edu/barnowl> (There were some leading comments in the code.)
[08:55:45] <Simon Wilkinson> You can tell if you look at the gcc command line used to build the kernel.
[09:01:12] <matt> ben:  are you using memcache?
[09:01:26] <shadow@gmail.com/owl1EA1D463> pretty sure he is
[09:02:34] <matt> I should note for the record that when I had cache settings that required more memory than fbsd could provide to set up the memcache, I got highly variable results that looked like stack corruption.
[09:03:59] --- mmeffie has left
[09:04:27] --- mmeffie has become available
[09:04:43] <kaduk@mit.edu/barnowl> Yeah, memcache.
[09:05:13] <Simon Wilkinson> I thought that was a standard feature with memcache
[09:05:24] <matt> Smashing stack?
[09:05:53] <Simon Wilkinson> Punishing you in interesting ways if you try and allocate too much memory to the cache.
[09:05:55] <kaduk@mit.edu/barnowl> /afs:/usr/vice/cache:184259
The box has 4G (though for some testing I restricted it to 1G via tuneable
hw.physmem.
[09:07:18] <matt> I didn't follow it up, but if some general cleanup to handle the condition is possible, it would be very nice to write it.
[09:07:51] <matt> ben:  try reducing that massively?
[09:07:57] <matt> I mean, cacheinfo
[09:08:33] <matt> I'm sure you want to track down and fix the reclaim bug anyway ... :)
[09:08:44] <kaduk@mit.edu/barnowl> Will give it a shot.
But this machine exists solely to be my afs playground at the moment, so I'm
skeptical.

Heh.
[09:10:15] <matt> I was certain for a weak that I needed to find ways to reduce stack, and wrote a patch to slim down GetDownD, etc, and later decided that while that might be nice, it wasn't actually the problem.
[09:10:19] <matt> er, week
[09:11:38] <Simon Wilkinson> I thought Linux gave pretty explicit errors if you smashed the stack.
[09:14:05] <matt> I've never done it on Linux--just speaking of FreeBSD 8 at this point.
[09:14:17] <Simon Wilkinson> Ah. Okay, sorry.
[09:14:28] <Simon Wilkinson> What's the stack size on FreeBSD?
[09:14:34] --- tkeiser has become available
[09:14:53] <matt> I don't recall 100% what FreeBSD did do to clue me in once I looked at the right thing--but NetBSD and OpenBSD are terrible--you just overrun the user area and go wild thereafter
[09:18:15] <kaduk@mit.edu/barnowl> The kernel is being compiled with -fstack-protector, which I think is somewhat
recent.
[09:18:30] <kaduk@mit.edu/barnowl> But I don't remember if that notices overruns as well.
[09:18:52] <Simon Wilkinson> With 2215, I think we should be taking the page reference counts, rather than taking page locks.
[09:19:08] <Simon Wilkinson> When you've got page locks, you can deadlock if the kernel tries to reclaim pages before you've released them.
[09:20:43] <matt> I don't think we can omit the lock and still have expected behavior?   The change I sent should protect against any change in behavior that would have caused page->_count to drop to 0 before, and given that...
[09:21:23] <Simon Wilkinson> Like I said, I need to dig further. But I know that returning from readpage with locks still held can lead to deadlocks later.
[09:22:21] <matt> If we didn't return with a lock held, wouldn't the client read stale data?
[09:24:24] <Simon Wilkinson> Not if PageUptodate isn't set, I think.
[09:24:33] <Simon Wilkinson> But, I may be confusing the read path with the write path here.
[09:25:15] <matt> Ah.  But refcount should be protecting against page release, per code.
[09:26:46] <matt> And it sounds like not returning a page locked is just one way to ensure page is guaranteed to be unlocked.
[09:27:00] <Simon Wilkinson> Actually, digging through this a bit further. 
[09:27:32] <Simon Wilkinson> You can return a page locked.
[09:27:40] <matt> Ok.  Good :)
[09:27:49] <Simon Wilkinson> The complication is if you hit a point where you deadlock with pdflush /flush-afs/ whatever.
[09:28:13] <Simon Wilkinson> The issue there is if your thread which unlocks the page needs any resources which might be held by the pdflush thread.
[09:28:36] <matt> I think that's hard to have happen--never seen it in -very- deep load testing.
[09:28:36] <Simon Wilkinson> And that's complicated, because the flusher is, on some versions of Linux, reentrant.
[09:29:17] <Simon Wilkinson> I don't think we can do anything about it.
[09:29:18] <matt> The bg thread does only what rx does, pretty much.
[09:29:57] <Simon Wilkinson> I'll read through my notes when I get back to London and see if there was anything else.
[09:30:07] <matt> Ok, thanks for doing that.
[09:31:03] <Simon Wilkinson> But if there isn't, I'd like to consider turning cache bypass on for 1.6
[09:31:29] <Simon Wilkinson> (Well, on in the build, but not activated for any files unless requested by the user)
[09:31:34] <Simon Wilkinson> Do you think its ready for that?
[09:32:26] --- Kevin Sumner has left
[09:32:28] --- mmeffie has left
[09:32:34] --- tkeiser has left
[09:32:35] --- abo has left
[09:32:37] --- Kevin Sumner has become available
[09:32:49] --- abo has become available
[09:33:01] --- tkeiser has become available
[09:33:25] --- mmeffie has become available
[09:33:31] <matt> I'm heard intimations of that.  I think it is reasonable, given it does nothing unless a threshold is set.  I'm interested in somehow getting connection pooling, though, since best benchmarks were with connection clones.  Patch in gerrit needs an update, in progress.
[09:35:54] <Simon Wilkinson> Does it work without connection clones?
[09:36:29] <matt> Yes, but you can it rx maxcalls fairly quickly.
[09:37:08] <matt> It improved performance without clones, but we doubled performance with them in.
[09:37:26] <matt> I mean, doubled the improvement we were getting.
[09:38:49] <matt> I think the general improvement was linear with additional processors up to 4, based on Hartmut's report--I don't know if anyone has benched with more.  Phalenor was interested in doing this, I believe, but I didn't have time to do a 1.4 backport last year...
[09:46:23] --- Simon Wilkinson has left
[09:50:36] --- tkeiser has left
[09:52:37] <phalenor> matt: I'd be happy to test out whatever you want, doesn't have to be 1.4
[09:53:55] --- mmeffie has left
[09:54:11] <phalenor> on some of our bigger machines, I'd say we would almost care more about performance than stability
[09:55:49] --- mmeffie has become available
[09:58:08] <matt> Well, then 1.5.x is your branch.  I'd be interested in getting comparative results first with the unmodified branch, on memcache with appropriate chunksize (e.g., 18?), with and without cache bypass enabled.  You can use one cm binary, built with --enable-cache-bypass, setting a low fs bypasthreshold to enable bypassing.  Read vs. mixed read-write workloads are interesting--the latter should have a significant penalty, should ideally, workload should be read-heavy.  Then, repeating with cache-bypass refcounting patch in gerrit.
[09:58:30] <matt> When we have a connection pooling patch worth running, repeating with thiat.
[09:58:32] <matt> that
[09:59:31] <matt> The refcounting patch should have no measurable effect on performance, starting from 1.5.x + refcounting would probably be reasonable.
[10:00:34] <phalenor> do you care about disk cache performance? 
[10:02:14] <phalenor> and when you say with and without cache bypass, do you mean with and without --enable-cache-bypass at build time?
[10:02:58] <matt> with and without:  no, with fs bypassthresh default (0?, -1?  don't recall...) or vs. with some small value (which enables bypassing)
[10:05:02] <phalenor> okay. so the only stumbling block I see then is I don't have any machines handy with new enough autoconf, etc to run regen.sh, though that could be rectified I suppose
[10:05:52] <matt> But as regards disk.  You get massive "improvement" but relative to memcache, the results are unrealistically scaled.  But it should work regardless, and reduces disk workload, obviously, though it's now background work, since Simon's changes of summer.
[10:06:01] <matt> Yeah, you just need to put an autoconf somewhere...
[10:07:04] <matt> And parallel fetching is still happening, of course.  I'm just admitting that I barely ran cache-bypassing with disk cache.
[10:07:49] <phalenor> for the most part, our 'big' machines run with around a half gig of memcache because they have the memory to spare (some 16G, most 32, one 64), workstations and machines on 100Mb are still disk cache, as even with 1.4 disc cache becomes network bound
[10:08:02] <matt> Yes, that's the ticket.
[10:08:28] <matt> Oh, and you want to increase -daemons, esp. with more calls patch to come.
[10:09:00] <phalenor> right now we're running with 12, so more than that?
[10:09:26] <shadow@gmail.com/owl1EA1D463> -daemons 12 is probably fine for now. 
[10:09:28] <matt> Actually, that's probably fine.  Worth looking into, perhaps.
[10:10:09] <phalenor> I haven't tested while varying that number, but I suppose I could fiddle with it a bit
[10:10:15] <matt> With clones, I never used more than 4xRX_MAXCALLS = 12 anyway, btw.
[10:10:32] <matt> So there would be no improvement, unless we were starving something else.
[10:10:54] <matt> sorry, 3x
[10:11:05] --- mmeffie has left
[10:13:02] <phalenor> ok, well i've dumped this conversation into our wiki for reference later. no idea when i'll get around to actually doing anything, but it should be 'soon'
[10:14:53] --- mmeffie has become available
[10:15:32] <matt> anything you do will be helpful, thanks
[10:35:57] --- rra has become available
[10:39:10] --- jaltman has left: Disconnected
[10:39:30] --- mattjsm has become available
[10:45:26] <mattjsm> matt - afs_syscall.c - NBSD40 makes use of afs_syscall_create instead of the standard afs_syscall_icreate as well as uses a different number of arguments. Do you know, was that defined elsewhere that i can't find or typo?
[10:47:53] <matt> It probably just didn't track a change.
[10:48:28] <matt> Should match up with what other ports do, except for formatting the arguments using the netbsd way.
[11:35:25] --- tkeiser has become available
[11:35:59] <matt> rra?
[11:36:25] <rra> Yup?
[11:39:34] <matt> So, first, I will move rpctestlib, but I also have another directory brlock with cunit-based tests.  It drives two suites of tests with upwards of 700 tests each.  Do those also go in top-level tests?
[11:41:22] <rra> Yes.
[11:41:29] <rra> Eventually, they should be converted to be TAP tests.
[11:41:34] <rra> So that we can drive them with our test framework.
[11:41:50] <rra> In the meantime, though, we can run them alongside the TAP tests through the same makefile target.
[11:42:57] <rra> I suspect it wouldn't be hard to write a shim on top of cunit to translate the output format so that people could write cunit tests if they wanted to, although I don't know from a procedural standpoint if we want to maintain two different test methods in perpetuity.
[11:43:19] <matt> That would be cool.  We can probably do it, in fact, eventually.
[11:43:39] * rra wants to pull all of the testing code throughout the tree into the tests directory and plug it into a reasonable test framework and harness, but I know that will take a while.
[11:44:01] <matt> I don't have any religion except completeness, correctness, and meeting deadlines ;)
[11:44:06] --- Kevin Sumner has left
[11:44:31] <matt> We have a need to maintain thread state, though, so driving something from within TAP sounds like a good fit.
[11:44:39] --- Kevin Sumner has become available
[11:45:18] <rra> My plan is that over time we'll develop an increasingly rich set of test libraries in the tests directory that will make it easy for tests to do things like spawn file servers, set up test cells, set up test services, and so forth and tear them down when done.
[11:45:19] <matt> These RPC tests simulate sevaral clients, and need to mux their callbacks, as well.
[11:46:33] <matt> There should be a third suite as well which I think will try to use Andrew's fuse stuff.
[11:46:46] <rra> That owuld be neat!
[11:48:11] <matt> I'm glad it fits in--I'm repushing with files moved...
[12:04:10] <deason> libuafs directly would be more convenient than fuse; unless that's what you're talking about
[12:04:33] <deason> or unless you want to test fuse integration
[12:06:27] <matt> sorry, libuafs
[12:17:50] --- kaj has become available
[12:21:08] --- mattjsm has left
[12:59:00] --- allbery_b has become available
[13:29:29] --- phalenor has left
[13:33:12] --- tkeiser has left
[13:35:15] <kaduk@mit.edu/barnowl> BJK AAA call 0xffffff002d43b800
BJK BBB call 0xffffff002d43b800
BJK CCC call 0x28

fsint/Kvice.cs.c:
    139 int RXAFS_FetchStatus(register struct rx_connection
*z_conn,AFSFid * Fid
,AFSFetchStatus * OutStatus,AFSCallBack * CallBack,AFSVolSync * Sync)
    140 {
    141         struct rx_call *z_call = rx_NewCall(z_conn);
    142 printf("BJK AAA call %p\n",z_call);
    143         static int z_op = 132;
    144         int z_result;
    145         XDR z_xdrs;
    146         struct clock __QUEUE, __EXEC;
    147         xdrrx_create(&z_xdrs, z_call, XDR_ENCODE);
    148 printf("BJK BBB call %p\n",z_call);
    149
    150         /* Marshal the arguments */
    151         if ((!xdr_int(&z_xdrs, &z_op))
    152              || (!xdr_AFSFid(&z_xdrs, Fid))) {
    153                 z_result = RXGEN_CC_MARSHAL;
    154                 goto fail;
    155         }
    156
    157         /* Un-marshal the reply arguments */
    158         z_xdrs.x_op = XDR_DECODE;
    159         if ((!xdr_AFSFetchStatus(&z_xdrs, OutStatus))
    160              || (!xdr_AFSCallBack(&z_xdrs, CallBack))
    161              || (!xdr_AFSVolSync(&z_xdrs, Sync))) {
    162                 z_result = RXGEN_CC_UNMARSHAL;
    163                 goto fail;
    164         }
    165
    166         z_result = RXGEN_SUCCESS;
    167 fail:
    168 printf("BJK CCC call %p\n",z_call);
    169         z_result = rx_EndCall(z_call, z_result);

I guess it's time to unwind some of those conditionals.
[13:38:19] --- dwbotsch has left
[13:38:24] --- dwbotsch has become available
[13:39:30] --- phalenor has become available
[14:03:25] --- mmeffie has left
[14:06:08] --- shadow has become available
[14:09:38] --- mmeffie has become available
[14:34:33] --- jaltman has become available
[14:39:13] --- shadow has left
[14:39:30] --- allbery_b has left
[14:47:18] --- tkeiser has become available
[14:48:17] <shadow@gmail.com/owl1EA1D463> "ow"
[14:59:21] --- mdionne has become available
[15:42:05] --- Kevin Sumner has left
[15:42:12] --- Kevin Sumner has become available
[15:45:07] --- tkeiser has left
[15:45:32] --- tkeiser has become available
[15:58:36] --- deason has left
[16:20:30] --- tkeiser has left
[16:26:31] <mdionne> osi_AssertFailK is declared as "noreturn", but in the LINUX26 case it does return - the osi_Assert macro takes care of going BUG() after
[16:27:10] <mdionne> so I'm wodnering if the annotation should be removed, or if the BUG() should be folded into osi_AssertFailK
[16:27:37] --- kaduk@mit.edu/barnowl has left
[16:48:17] --- matt has left
[17:05:30] --- kaj has left
[17:15:18] --- kaj has become available
[17:42:13] --- kaduk@mit.edu/barnowl has become available
[18:44:05] --- rra has left: Disconnected
[19:05:07] --- Russ has become available
[19:16:49] --- kaduk@mit.edu/barnowl has left
[19:27:44] --- deason has become available
[19:28:48] --- mdionne has left
[20:02:04] --- tkeiser has become available
[20:17:03] --- tkeiser has left
[20:22:18] --- Born Fool has become available
[20:30:50] --- kaduk@mit.edu/barnowl has become available
[20:34:03] --- jaltman has left: Replaced by new connection
[20:34:04] --- jaltman has become available
[21:02:17] --- tkeiser has become available
[21:19:59] --- kaduk@mit.edu/barnowl has left
[21:22:54] --- Born Fool has left
[21:37:24] --- kaduk@mit.edu/barnowl has become available
[21:40:47] <kaduk@mit.edu/barnowl>     at /usr/ports/net/openafs-devel/work/openafs/src/rx/xdr_rx.c:185
185         if (rx_Read32(call, &l) == sizeof(l)) {
(kgdb) p &l
$14 = (afs_int32 *) 0xffffff80781ac22c
(kgdb) down
#10 0xffffff8000ab7b3c in rx_ReadProc32 (call=0xffffff0004e97000, value=0x28)
    at /usr/ports/net/openafs-devel/work/openafs/src/rx/rx_rdwr.c:382
382             memcpy((char *)value, tcurpos, sizeof(afs_int32));
(kgdb) p value
$15 = (afs_int32 *) 0x28

It looks like rxi_FreePackets is the only function call intervening ...
[21:41:10] <kaduk@mit.edu/barnowl> Sigh.  Was going to prefix with "this one is more fun".
[21:42:56] --- jaltman has left: Disconnected
[21:48:32] --- jaltman has become available
[22:18:40] --- kaj has left
[22:27:37] --- deason has left
[22:33:06] --- tkeiser has left
[23:14:00] --- kaj has become available
[23:35:12] --- rod has become available
[23:46:09] --- reuteras has become available