[00:34:04] --- pod has left
[04:02:04] --- pod has become available
[05:20:33] --- jaltman/FrogsLeap has left: Disconnected
[05:22:29] --- Simon Wilkinson has become available
[05:49:23] --- Simon Wilkinson has left
[06:01:25] --- ksumner has become available
[06:07:04] --- Simon Wilkinson has become available
[06:08:15] --- jaltman/FrogsLeap has become available
[06:09:04] --- ksumner has left
[06:11:15] --- ksumner has become available
[06:52:44] --- lars.malinowsky has become available
[07:21:46] --- lars.malinowsky has left
[07:22:03] --- lama has become available
[07:25:40] --- lama has left
[07:38:23] --- deason has become available
[07:56:15] who do I talk to about cache bypass panics?
[07:56:54] 1.6 candidate?
[07:57:01] pre4
[07:57:25] was stable for about 6 hours, which is better than the few seconds it took last time
[07:57:56] it was a vmware esx vm, so best I could do was grab a screenshot of the panic from the console. wasn't sure what else I could do.
[07:58:07] so you have a backtrace?
[07:58:12] Mail me, marc and derrick the screenshot
[07:58:21] that's better than nothing. yeah, that works
[07:58:31] or put it in afs and give us a path :p
[07:59:08] /afs/bx.psu.edu/user/phalenor/public/1.6.0pre4_cbp_panic.jp
[08:01:19] --- lama has become available
[08:02:40] phalenor: Can you run a gdb command for me?
[08:03:01] possibly
[08:03:09] I'd like you to gdb your AFS kernel module,
[08:03:23] and then list *afs_BackgroundDaemon+0x2fd
[08:03:23] sure
[08:03:31] Sorry, that's wrong
[08:03:42] list *afs_PrefetchNoCache+0x720
[08:04:02] alright, pretend I've never debugged a kernel module before
[08:05:00] just run gdb against the .ko?
[08:05:10] Okay, find your kernel module - if you've got an RPM it's probably /lib/modules//extra/openafs/openafs.ko
[08:05:18] Yeah, then gdb against that .ko
[08:06:05] no debugging symbols, is that going to be a problem?
[08:06:14] Yeah.
[08:06:19] Boring.
[08:06:32] Do you have a -debuginfo for the kernel build?
[08:07:25] it would appear not
[08:07:31] --- Russ has become available
[08:08:22] Do you still have the tree in which you built the kernel module?
[08:09:06] yes
[08:12:04] Okay, the .ko in that tree will hopefully not be stripped
[08:12:08] could also just read the assembly...
[08:12:28] Doesn't (easily) give me the line of our source that it came from
[08:13:05] aha, got it
[08:13:20] Cool, so now (in gdb) type
[08:13:21] list *afs_PrefetchNoCache+0x720
[08:13:24] and paste the results here
[08:13:31] (gdb) list *afs_PrefetchNoCache+0x720
[08:13:31] 0x17908 is in afs_PrefetchNoCache (include/asm/string.h:206).
[08:13:31] 201 }
[08:13:31] 202
[08:13:31] 203 static __always_inline void * __memcpy(void * to, const void * from, size_t n)
[08:13:34] 204 {
[08:13:36] 205 int d0, d1, d2;
[08:13:39] 206 __asm__ __volatile__(
[08:13:41] 207 "rep ; movsl\n\t"
[08:13:44] 208 "movl %4,%%ecx\n\t"
[08:13:46] 209 "andl $3,%%ecx\n\t"
[08:13:49] 210 #if 1 /* want to pay 2 byte penalty for a chance to skip microcoded rep? */
[08:14:00] Okay, so we're blowing up on a memcpy
[08:16:20] hah, neither does the source listing, apparently ;)
[08:16:34] Yeah. Very Boring.
[08:18:29] The actual problem is in afs_bypasscache.c:392
[08:20:31] not the other memcpy?
[08:20:43] well, I guess they're the same anyway
[08:55:17] --- Russ has left: Disconnected
[09:47:46] so I'm pretty sure it's reproducible for me. last time I tried it, it panic'd within seconds a few months ago. this time it stayed up for a few hours. I'm sure if I turned it on again, it'd panic again. I could bring up a VM to help in testing if so desired.
[09:54:45] which linux distro is this?
[09:57:42] centos 5.5, with the -238 kernel
[09:57:44] 32-bit
[10:01:37] you should be able to get a crashdump for post analysis fairly easily then, but maybe that won't be necessary...
[10:04:02] I'm not too familiar with this code, but....
what I'd wanna see is the output of "disassemble afs_PrefetchNoCache" if you want to put that somewhere
[10:04:17] and I don't suppose you have more of the panic output captured in syslog or anything?
[10:19:45] If it's a panic, it won't go anywhere other than console, sadly.
[11:25:32] apparently none of hpux, aix, irix have osi_readRandom implemented yet
[11:26:50] I thought they were implemented as a function that osi_Panic()s if it's called
[11:27:35] seemingly not. chaz says he can't load a module on irix, missing symbol
[11:29:49] would you like a patch?
[11:30:09] well, that'd be fine.
[11:30:16] --- deason has left
[11:31:09] --- deason has become available
[11:33:43] --- deason has left
[11:34:21] --- deason has become available
[11:49:55] --- mfelliott has left
[12:10:43] well if anyone wants me to try anything specific, or if you know how to get more of the console output from ESX, I'd be happy to bring up another VM to play around with.
[12:11:46] I just wanted to see how much faster/slower cbp was for webserver duty (seems to be faster than reading from a cold cache, but still a wee bit slower than reading from a hot cache)
[12:39:42] --- shadow@gmail.com/barnowl1CD2CB82 has left
[12:40:34] --- shadow@gmail.com/barnowl1CD2CB82 has become available
[13:03:19] irix doesn't have an rng? doesn't it have a /dev/random, and for that matter ssl and kerberos implementations?
[13:08:49] --- mdionne has become available
[13:11:30] You can't get to them from the kernel
[13:11:40] There's no kernel API into the PRNG
[13:13:10] well, not "directly"/"easily"
[13:13:21] there's no kernel api for dns in most places, either ;)
[13:14:02] Yeah. We could call out to userspace. But at the moment no codepaths reach osi_ReadRandom(), so I'm not inclined to implement that right now.
[13:15:11] yeah, I'm not suggesting otherwise; just making sure I understand
[13:17:06] One of the platforms in this situation doesn't actually have any random support beyond using EGD.
[13:43:44] --- deason has left
[13:55:32] --- deason has become available
[14:11:41] --- jaltman/FrogsLeap has left: Disconnected
[14:16:49] --- Simon Wilkinson has left
[14:17:02] --- mdionne has left
[15:15:23] --- Russ has become available
[15:42:58] --- deason has left
[15:48:12] --- kaduk@mit.edu/barnowl has left
[15:49:14] --- kaduk@mit.edu/barnowl has become available
[16:26:50] --- andersk has left
[16:29:00] --- andersk has become available
[18:15:51] --- Simon Wilkinson has become available
[19:01:30] --- Simon Wilkinson has left
[19:51:44] --- Simon Wilkinson has become available
[19:52:49] --- Simon Wilkinson has left
[19:56:02] --- Simon Wilkinson has become available
[19:57:13] --- Simon Wilkinson has left
[20:02:03] --- Simon Wilkinson has become available
[20:07:24] --- Simon Wilkinson has left
[20:40:02] --- mdionne has become available
[20:42:31] --- jaltman/FrogsLeap has become available
[21:05:33] --- mdionne has left
[23:15:50] --- lama has left
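
Note on the gdb step at 08:13 in this log: "list *afs_PrefetchNoCache+0x720" asks gdb for the source line at an address 0x720 bytes past the start of that function, and gdb reported the address as 0x17908. The C sketch below only redoes that arithmetic, to show where the function would start in the .ko as gdb sees it. The helper names symbol_base and symbol_plus_offset are hypothetical and are not part of gdb or OpenAFS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a gdb or OpenAFS function): given the address
 * gdb resolved for "symbol+offset" and that offset, return the address
 * at which the symbol itself starts. 32-bit values, matching the 32-bit
 * kernel mentioned in the log. */
static uint32_t symbol_base(uint32_t resolved_addr, uint32_t offset)
{
    /* The resolved address cannot be below the offset it was built from. */
    assert(offset <= resolved_addr);
    return resolved_addr - offset;
}

/* Inverse direction: the address gdb would look up for "symbol+offset". */
static uint32_t symbol_plus_offset(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

With the values from the log, symbol_base(0x17908, 0x720) gives 0x171e8. That is an address inside the unloaded module file only; once the module is loaded into a running kernel it is relocated, so the runtime address would differ.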