[01:11:55] --- Simon Wilkinson has left
[02:32:16] --- Simon Wilkinson has become available
[02:46:44] --- Simon Wilkinson has left
[02:46:56] --- Simon Wilkinson has become available
[03:36:08] --- Simon Wilkinson has left
[05:47:27] --- Roman Mitz has become available
[06:29:45] --- deason has become available
[07:45:37] --- shadow@gmail.com/barnowlFF509CC1 has left
[07:47:46] --- shadow@gmail.com/barnowlFF509CC1 has become available
[08:38:31] --- rra has become available
[10:00:27] --- Chris Garrison has become available
[10:00:36] hello
[10:04:48] I have an odd issue. Our supercomputers recently stopped being able to see our OpenAFS server directories. Directory stats and such hang. They have MTU set to 9000. If the remote node is set to 1500 MTU, everything works, and stays working for about a half hour of idle time after that. It seems to me that the krb5 ticket is over 1500 bytes, so with jumbo frames turned on, it's getting fragmented along the way. with them off (matching our AFS servers), the tokens don't get fragmented.
[10:05:10] I am guessing that a network change between these is the cause. Is there a workaround in OpenAFS?
[10:07:46] --- phalenor has left
[10:09:08] I'd only expect the ticket to be large if you're using AD, and you haven't turned off the PAC for the afs princ
[10:09:29] --- phalenor has become available
[10:10:18] _if_ that is the issue, there is not much we can do openafs-side; you'd either need to get the ticket size down, or fix whatever in the network is dropping the fragments
[10:11:13] but I would hazard a guess that krb isn't related, and rx is just trying to send 9000-byte packets since that's what it detected for the local mtu
[10:11:21] --- Chris Garrison has left
[10:11:28] --- natefoo has left: Lost connection
[10:11:49] but that shouldn't be a problem for small amounts of data; if you can't 'ls' a small dir or something, it seems like something else is wrong
[10:11:52] --- natefoo has become available
[10:11:59] (rxdebug -version is a simple test)... and he's gone
[10:13:28] --- Chris Garrison has become available
[10:14:39] Thanks, Andrew.
[10:14:51] there's a maxmtu setting on the client, but it doesn't seem to work.
[10:17:02] did you see the two messages sent while you left?
[10:17:31] iirc, maxmtu doesn't do what you probably expect it to; it will set the mtu in the rx logic, but rx will intentionally send things higher than the configured mtu
[10:18:36] but none of this should matter if the payload is less than 1500; I'd look at a simple rxdebug -version ping first
[10:20:57] okay. no, evidently I missed those earlier. not sure why I dropped out.
[10:21:24] I just used your reply as ammo to hold the network guys to task.
[10:21:42] if the fileserver doesn't advertise rx jumbograms, though, I don't think fragments should be sent; depending on how old they are, you may need to pass -nojumbo to the fileserver to make that happen
[10:21:49] http://jabber.openafs.org/openafs@conference.openafs.org/2012-01-10.txt for messages
[10:21:51] though I recall something about ADS being able to make the tickets smaller somehow. dropping unnecessary stuff.
[10:22:17] fileservers have -nojumbo on them
[10:22:43] yes, you can turn off the PAC generation for the afs service princ; but that would only be a problem for authenticated connections; tickets aren't involved for unauth transfer
[10:23:14] well, we're getting kinit/aklog fine. the token generated works under 1500 MTU but not 9000
[10:23:50] well it works under 9000 for a little bit after dropping to 1500 and then back to 9000
[10:23:54] minutes.
[10:24:05] yeah, but does it work if you don't kinit/aklog at all?
[10:24:24] (the AD flag you'd want to set is this, I think: http://support.microsoft.com/kb/832572 )
[10:24:45] I'm not sure what you mean. Unauthenticated access to the fileserver? Haven't tried it.
[10:25:11] just 'unlog ; ls /afs/whatever' or something
[10:25:14] I don't know whether we could talk the AD guys into setting that flag or not.
[10:26:12] a network dump would more likely point directly at the problem, though
[10:32:50] yeah
[10:33:06] we've done some tcpdumps and no one is really sure what's going on.
[10:33:16] a lot of this is finger pointing between groups.
[10:40:00] well, if you see which packets are getting dropped, that should tell you which specific aspect is breaking
[10:40:30] if it's an rx RESPONSE packet (I think that's what it's called?), it's something to do with auth
[10:40:55] if it's regular rx data packets, it's the regular traffic, and rx can probably be forced to send less per packet
[11:08:33] they're response packets that are being dropped.
[11:12:08] Chris: is the afs service ticket issued by AD or by a Heimdal/MIT KDC accessed via cross-realm?
[11:12:51] AF
[11:12:52] AD
[11:13:04] we've run that way for years now
[11:13:54] If the answer is AD, the administrators need to set the NO_AUTH_DATA_REQUIRED flag on the afs account. The flag only affects that one account and nothing else. This prevents the PAC from being added to that service ticket.
[11:15:42] I'll have to ask the AD admins about that.
[11:16:28] If the answer was cross-realm, either an afs/cell@AD SPN would need to be added to AD or a patch would need to be applied to Heimdal/MIT to strip the PAC from the service ticket.
[11:22:05] the answer is AD
[11:27:21] I understand. I wrote the other half because this conference room is logged and people do search the logs for answers. I try hard not to repeat myself when I don't have to.
[11:31:41] oh I see, that's a good idea.
[12:31:49] Looks like it's not set: "afs/iu.edu@ADS.IU.EDU is not set to 33620480 (NO_AUTH_DATA_REQUIRED). It is currently set to 0x210200 (NORMAL_ACCOUNT DONT_EXPIRE_PASSWD USE_DES_KEY_ONLY)"
[12:32:38] So if I asked them to set NO_AUTH_DATA_REQUIRED, would it affect the servers or clients negatively? Would any kind of renegotiation be required (server restarts, etc) that would make this better done during maintenance?
[12:47:36] the clients will "see" the change when they reauthenticate
[12:48:12] "see" it in what way?
[12:48:22] I've never heard of a negative impact as a result of setting that, but experiences with ad seem to vary greatly
[12:48:47] as in, it'll do something; a user may need to reauthenticate in order for it to take effect
[12:49:34] well that's okay
[12:50:03] I just don't want to ask for a change and have a couple thousand users disrupted.
[12:52:24] No users will be disrupted. Those that already have afs service tickets won't see the change until they request a new one.
[12:54:09] excellent, thank you.
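[editor's note] The two userAccountControl values quoted at 12:31:49 are easier to compare once they are broken into individual bits. The sketch below is only an illustration, not something posted in the room: the helper name decode_uac and the small flag table are hypothetical, and the bit values are the ones Microsoft documents for userAccountControl. It decodes the current value, decodes the 33620480 that the checking tool expected, and shows what the current value would become if only the NO_AUTH_DATA_REQUIRED bit were added. Whether the other flags should also change is a decision for the AD administrators and is not implied here.

```python
# Subset of userAccountControl bits relevant to the values quoted in the log.
# decode_uac and this table are illustrative names, not part of any AD tool.
UAC_FLAGS = {
    0x0000200: "NORMAL_ACCOUNT",
    0x0010000: "DONT_EXPIRE_PASSWD",
    0x0200000: "USE_DES_KEY_ONLY",
    0x2000000: "NO_AUTH_DATA_REQUIRED",
}


def decode_uac(value):
    """Return (names of known flags that are set, leftover bits not in the table)."""
    names = []
    known_mask = 0
    for bit, name in sorted(UAC_FLAGS.items()):
        known_mask |= bit
        if value & bit:
            names.append(name)
    return names, value & ~known_mask


current = 0x210200    # value reported for afs/iu.edu@ADS.IU.EDU in the log
expected = 33620480   # value the checking tool compared against
NO_AUTH_DATA_REQUIRED = 0x2000000

print(hex(current), decode_uac(current))
print(hex(expected), decode_uac(expected))

# Adding only the one bit to the current value, leaving everything else as is.
proposed = current | NO_AUTH_DATA_REQUIRED
print(proposed, hex(proposed), decode_uac(proposed))
```

Decoded this way, 33620480 is 0x2010200, which is NO_AUTH_DATA_REQUIRED combined with NORMAL_ACCOUNT and DONT_EXPIRE_PASSWD rather than the single flag on its own; it differs from the current value in the USE_DES_KEY_ONLY bit as well.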
[13:35:43] --- Chris Garrison has left
[13:37:07] --- Chris Garrison has become available
[13:57:25] --- Chris Garrison has left
[13:58:18] --- Chris Garrison has become available
[14:00:17] --- Chris Garrison has left
[14:21:27] --- mdionne has become available
[14:33:21] --- deason has left
[14:35:16] --- mdionne has left
[14:35:16] --- mdionne has become available
[14:37:39] --- mdionne has left
[17:32:19] --- rra has left: Disconnected
[18:11:53] --- jaltman/FrogsLeap has left: Disconnected
[19:34:09] --- jaltman/FrogsLeap has become available
[21:25:36] --- Russ has become available
[21:28:06] --- Russ has left: Disconnected
[21:29:11] --- Russ has become available
[21:51:02] --- Russ has left: Disconnected
[21:51:31] --- Russ has become available
[22:00:14] --- Russ has left: Disconnected