[00:34:37] --- Simon Wilkinson has left
[00:40:13] --- Simon Wilkinson has become available
[01:03:12] --- Russ has left: Disconnected
[01:10:08] --- haba has left
[02:15:36] --- dev-zero@jabber.org has become available
[04:03:59] --- abo has left
[04:05:46] --- pod has become available
[04:47:34] --- haba has become available
[04:57:15] --- Jeffrey Altman has become available
[05:12:02] --- meffie has become available
[05:44:33] wow -- what was the occasion for all the commits this am?
[05:44:53] 1.4 pullups
[05:45:25] ah, so those were already in git, just pulled over to 1.4?
[05:45:26] bored. we do releases because we're bored.
[05:45:31] yes.
[05:45:48] I'm really bored. Can we do 1.6?
[07:19:27] --- deason has become available
[08:11:23] --- dev-zero@jabber.org has left
[08:11:42] --- dev-zero@jabber.org has become available
[08:12:33] --- abo has become available
[08:16:21] --- clc31 has become available
[08:17:07] --- clc31 has left
[08:39:47] --- meffie has left
[08:52:31] --- dev-zero@jabber.org has left: Lost connection
[09:25:47] --- haba has left
[09:54:39] --- edgester has become available
[10:02:31] --- edgester has left
[10:13:14] --- meffie has become available
[10:20:35] --- haba has become available
[10:32:46] --- haba has left
[10:34:38] --- haba has become available
[10:37:43] --- deason has left
[10:37:50] --- deason has become available
[10:37:54] --- deason has left
[10:41:41] --- deason has become available
[10:55:50] --- dev-zero@jabber.org has become available
[11:08:41] --- deason has left
[11:14:47] --- haba has left
[11:15:06] --- haba has become available
[11:22:01] --- deason has become available
[11:22:48] --- Russ has become available
[11:30:16] --- deason has left
[11:30:40] --- deason has become available
[11:32:08] --- deason has left
[11:54:35] --- deason has become available
[12:28:53] --- dev-zero@jabber.org has left
[12:48:58] --- clc31 has become available
[13:13:54] --- dev-zero@jabber.org has become available
[13:18:09] --- dev-zero@jabber.org has left
[13:18:22] --- dev-zero@jabber.org has become available
[13:54:31] --- Rasmus Kaj has left
[13:56:28] --- Rasmus Kaj has become available
[13:59:49] deason: Are you sure the retry is being driven by afs_Analyze, and not by something earlier?
[14:00:14] I'm actually sure it's not afs_Analyze, which I discovered a few seconds before you said that
[14:03:10] My suspicion is that this is coming in via a codepath which can call GetDCache multiple times, if it gets a NULL return the first time.
[14:03:29] Providing the number of attempts isn't infinite, I suspect we don't need to care about it.
[14:06:23] yeah, it's like 4 times, I think
[14:06:59] 4 times sounds a lot like GetDCache
[14:07:01] If you can get it, fstrace output would be interesting.
[14:18:41] BPrefetch perhaps? http://pastebin.ca/1718782
[14:27:16] Looks like it from that trace.
[14:28:11] I don't think that the repeated fetches are worth worrying about for this bug fix. In the longer term, we perhaps need a way of GetDCache failing hard, and indicating that failure immediately to the VFS. But that code is such a can of worms, I don't think this fix is the place to do so.
[14:29:43] alright then
[14:34:01] --- mdionne has become available
[14:39:22] Application Area AD is reviewing the SRV record draft to see about sponsoring it for Standards Track.
[14:47:36] Okay, so this is fun. Over on IRC, someone is benchmarking a Debian system against a RedHat one. Same numerical kernel version, same numerical OpenAFS version.
[14:47:58] Debian does 15MB/s
[14:48:03] RedHat does 75MB/s
[14:48:17] fcrypt is off on both machines
[14:50:15] they compiled a vanilla kernel, or it's the same kernel version number but the stock images from debian/RH?
[14:50:27] The latter, and the same with OpenAFS.
[14:50:55] and just curious, redhat means fedora, rhel, centos, what?
[14:51:04] The Debian OpenAFS packages pull up lots of stable branch patches.
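Note on the 14:03-14:06 exchange: the pattern being described is a caller that retries a lookup when it returns NULL, but only a bounded number of times. A minimal sketch of that pattern in C follows. Everything in it is an assumption for illustration: lookup_fn, lookup_with_retry and the two stub lookups are hypothetical names, and none of it reflects the real GetDCache signature or the actual OpenAFS call path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lookup that can fail transiently.
 * This is NOT the real GetDCache interface. */
typedef void *(*lookup_fn)(void *ctx);

/* Call lookup() up to max_attempts times. Returns the first non-NULL
 * result, or NULL once the attempts are exhausted. If attempts_out is
 * non-NULL it receives the number of calls actually made. */
void *lookup_with_retry(lookup_fn lookup, void *ctx,
                        int max_attempts, int *attempts_out)
{
    void *result = NULL;
    int attempts = 0;

    assert(lookup != NULL && max_attempts > 0);

    while (attempts < max_attempts) {
        attempts++;
        result = lookup(ctx);
        if (result != NULL)
            break;              /* success: stop retrying */
    }

    if (attempts_out != NULL)
        *attempts_out = attempts;
    return result;              /* NULL means every attempt failed */
}

/* Stub lookup that never succeeds. */
void *always_fail(void *ctx)
{
    (void)ctx;
    return NULL;
}

/* Stub lookup that fails until the int counter at ctx reaches zero,
 * then returns ctx itself as a non-NULL "result". */
void *fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return NULL;
    }
    return remaining;
}
```

With a cap of 4, a lookup that always fails is called exactly four times before the caller gives up, which is the "not infinite, so probably harmless" case from the discussion. A hard-fail path of the kind mentioned at 14:28:11 would need a way to skip the loop entirely, which this sketch does not attempt.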
[14:52:38] RedHat means RHEL5
[14:53:08] What Debian package version are they using?
[14:53:12] 1.4.11
[14:53:31] Do you know what the subversion is?
[14:54:01] --- haba has left
[14:54:33] 1.4.11+dfsg-5~bpo50+1
[14:54:40] --- haba has become available
[14:54:47] although it could be a patch, I'd also wonder if it's something silly like if the mount parameters are different for the cache fs or something
[14:55:01] It's memcache.
[14:55:08] ah
[14:55:30] Which helps by ruling out lots of kernel things that it could be.
[14:55:42] Okay, so the fix pulled up to use the right credentials for SELinux wouldn't be it, then, since I think that's only the disk cache.
[14:56:22] Indeed.
[14:56:36] I'm not seeing any obvious patches unless releasing GLOCK for filldir would slow things down a lot for some reason.
[14:57:02] Are both the Debian and the RedHat machines using SELinux?
[14:57:04] [bdb4f98a] Protect rx_call iovq from simultaneous attempts to empty it involves adding more locking, maybe?
[14:59:21] Does Debian even have SELinux by default?
[15:00:20] Nope.
[15:06:35] --- clc31 has left
[15:07:25] Okay, so vanilla 1.4.11 gives 90MB/s. Debian 1.4.11 gives 15MB/s
[15:07:38] same server? jumbograms?
[15:07:44] Same client now.
[15:08:07] Okay, so that argues for patches.
[15:08:31] Patches applied over top of 1.4.11 other than pure file server patches are:
[15:08:36] - openafs-stable-1_4_x/kernel-init-vrequest-structure-20090914: properly initialize vrequest structure in the kernel.
[15:09:06] - [c9974c7a] Avoid prematurely destroying callback_rxcon
- [9b37972e] Linux: 2.6.32 - Adapt to writeback changes
- [abdf72bc] Linux: Avoid deadlock in readdir - release GLOCK for filldir
- [bdb4f98a] Protect rx_call iovq from simultaneous attempts to empty it
- [a410b7fd] Linux - Fix disk cache access for selinux/AppArmor constrained processes (LP: #415766)
- [525b594a] Make ktc_curpag generally available (LP: #446521)
[15:09:26] Actually, the first of that second batch may also be a file server patch.
[15:09:55] There are more patches in -6, but it didn't sound like they were running that.
[15:10:58] --- deason has left
[15:11:32] I'd be inclined to point my finger at bdb4f98a
[15:14:26] They're going to try building with bdb4f98a applied later tonight, and see what that does to performance.
[15:59:41] --- dev-zero@jabber.org has left: Replaced by new connection
[15:59:43] --- dev-zero@jabber.org has become available
[16:18:41] With disk cache, 1.4.x tip is no slower than 1.4.11
[16:23:36] bdb4f98a could be recoded to not grab the lock until after there is a likelihood that the queue is not already empty
[16:25:01] We'll have to wait and see what phalenor says, but my test box (which is disk, rather than memcache), doesn't appear any slower with bdb4f98a applied.
[16:25:30] Russ: Is there anything in Debian which isn't in the 1.4 branch?
[16:25:31] that is a per call lock. it really should not be under significant contention.
[16:25:41] Yes.
[16:25:50] But nothing that should affect the kernel module.
[16:26:10] fstrace message catalog, fixes to compiler specification during Autoconf, that sort of thing.
[16:26:13] I've pushed it all into master.
[16:26:33] what if his workload is different enough from yours that he's recycling calls faster?
[16:27:06] Well, we're running the same benchmark. I think his machines are faster than mine, and obviously he doesn't have a filesystem slowing stuff down.
[16:27:34] ok. i lack details, the irc server silently dropped me. yes, i still hate irc
[16:27:44] i could read logs but not right this minute
[16:28:55] But what I was worried about (that there was going to be something in 1.4.12 which had a major, universal, performance impact) appears to not be the case.
[16:29:15] I guess there could still be something which impacts memcache only, but that code hasn't been touched in years, as far as I can see.
[16:29:26] largely no
[16:29:50] well. what if you do an ext2 in a loop-mounted file from memory (tmpfs)?
[16:29:57] http://patch-tracker.debian.org/package/openafs/1.4.11+dfsg-6 is... not entirely unhelpful if you want to be sure you know exactly how the Debian package varies from 1.4.11.
[16:30:09] since you have this test ready to fly
[16:32:05] What would that show, though? Whether a faster disk cache can cause problems?
[16:32:28] presumably
[16:32:42] well, as fast as you'll be able to test, i assume
[16:38:51] Hmmm. may have spoken too soon.
[16:44:26] i wondered that...
[16:45:00] Helps if you git fetch in the sandbox you're building from.
[16:46:05] evicted again
[16:46:17] evicted from?
[16:49:47] Hmmm. No uintptr_t in the kernel on Linux.
[16:55:55] pittmfug meeting space this time
[17:00:55] News from IRC is that bdb4f98a is off the hook, anyway.
[17:03:29] Performance of actual 1.4.x head (now I've got it to compile) is still pretty comparable to 1.4.11, though. So, dunno.
[17:06:30] you think it's an artifact of phalenor's environment?
[17:07:10] Or of memcache, or of something in the way that the Debian packages are built. Still too many variables at the moment.
[17:08:28] And the numbers I'm seeing here do suggest that 1.4.x may be slower than 1.4.11. I'm seeing about 20% less throughput on reads on two comparable runs, for example.
[17:08:46] But that's all single run figures from iozone, against a loaded fileserver. So other factors may be in play.
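Note on the 16:23:36 suggestion: "not grab the lock until after there is a likelihood that the queue is not already empty" is the usual check-before-lock pattern. The sketch below shows one way to write it in user-space C with pthreads and C11 atomics. It is an illustration under stated assumptions only: struct call_queue, queue_push and queue_drain are hypothetical names, and this is not the rx_call iovq code nor the contents of bdb4f98a. Whether a lock-free emptiness hint is safe for the real queue depends on invariants that are not visible in this log.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical singly linked queue entry. */
struct node {
    struct node *next;
};

/* Hypothetical per-call queue; NOT the real rx_call layout. */
struct call_queue {
    pthread_mutex_t lock;
    struct node *head;   /* protected by lock */
    atomic_int count;    /* updated under lock, readable without it */
};

void queue_init(struct call_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    atomic_init(&q->count, 0);
}

void queue_push(struct call_queue *q, struct node *n)
{
    assert(q != NULL && n != NULL);
    pthread_mutex_lock(&q->lock);
    n->next = q->head;
    q->head = n;
    atomic_fetch_add(&q->count, 1);
    pthread_mutex_unlock(&q->lock);
}

/* Detach and return the whole list. If the queue looks empty, return
 * NULL without touching the mutex at all. */
struct node *queue_drain(struct call_queue *q)
{
    struct node *list;

    /* Cheap hint only: a push racing with this read is simply picked
     * up by a later drain. */
    if (atomic_load(&q->count) == 0)
        return NULL;

    pthread_mutex_lock(&q->lock);
    list = q->head;      /* re-read under the lock; the hint may be stale */
    q->head = NULL;
    atomic_store(&q->count, 0);
    pthread_mutex_unlock(&q->lock);
    return list;
}
```

The fast path only helps if draining an already empty queue is common and the lock is actually contended. The 16:25:31 comment (a per-call lock should see little contention) is a reason to measure before bothering with this.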
[17:09:08] But I'm not seeing nearly the performance loss that phalenor is seeing.
[17:21:53] Sadly, the way that the pullups have been done means that it's impossible to bisect on the 1.4 branch.
[17:23:19] Eh? How so?
[17:23:36] Because broken patches have been pulled up.
[17:23:47] Oh, so you can't build the tree at every point.
[17:23:48] Then the breakage was fixed after all of the patches had been applied.
[17:24:08] Yeh. Essentially, you can build the tree before yesterday, and that's it.
[17:26:11] Well, if the problem is affecting the Debian packages, the tree before yesterday already had the problem.
[17:26:25] I haven't pulled in any of the many patches that Derrick pulled up.
[17:26:59] Hmmm. Let me try changing my range, then...
[17:32:03] --- abo has left
[17:32:04] --- haba has left
[17:32:07] --- haba has become available
[17:32:07] --- abo has become available
[17:38:29] --- meffie has left
[17:39:54] So, the speed reduction I'm seeing is from something since c9f7fe37d366e1529c50b4d86caa3b99cd1d005b (ie, in the range that it's hard to bisect in)
[17:44:40] --- Russ has left: Disconnected
[17:54:33] well, if i could just manage the motivation to get out of the car and go open the laptop i bet i could narrow it down quickly
[17:55:33] Yeh. I can probably pick through what's there. I need a better test environment to do so, though.
[18:08:59] --- Russ has become available
[18:33:45] --- deason has become available
[18:38:23] --- mdionne has left
[19:37:12] --- clc31 has become available
[20:00:05] does the following sequence of bytes have any meaning to anyone b7 87 42 57 09 6b 03 18
[20:08:17] looks like a key fingerprint
[20:11:28] Not one known to the keyservers at least.
[20:13:00] I don't think it is a fingerprint. it's random garbage that is showing up repeated 8 or 9 times in a row in a buffer that should be filled with NULs
[20:17:38] file doesn't recognize it as a magic number at least.
[22:18:39] --- reuteras has become available
[22:49:15] --- deason has left