[00:33:40] --- Russ has left: Disconnected
[00:49:12] --- dev-zero@jabber.org has become available
[00:49:19] --- dev-zero@jabber.org has left: offline
[01:31:32] --- dev-zero@jabber.org has become available
[01:32:08] --- dev-zero@jabber.org has left: offline
[04:18:53] --- Derrick Brashear has left
[04:43:01] i seem to recall someone mentioning kopenafs needing some love but i forgot what it was specifically...
[04:46:50] isn't https://fedorahosted.org/pam_krb5/browser/src/minikafs.c (BSD licensed) basically the same thing, but more complete?
[04:55:37] and yeah there's similar code in pam-afs-session too
[04:55:49] of course
[05:30:07] --- Jeffrey Altman has left: Replaced by new connection
[06:51:11] --- cclausen has become available
[06:52:22] --- Jeffrey Altman has become available
[06:59:34] --- deason has become available
[07:05:02] Yesterday there was a discussion on the IRC channel of how CellServDB files work. It turns out that IBM in their original contribution gave us some very inconsistent CellServDB behaviors. auth/cellconfig.c which is used on Unix strictly uses the IP addresses provided in the file and ignores the hostnames except for informational purposes whereas WINNT/afsd/cm_config.c (as received from IBM) completely ignored the IP address that was specified and instead called gethostbyname() on the specified host name.
[07:05:56] Currently WINNT/afsd/cm_config.c performs a gethostbyname() and falls back to using the CellServDB IP address if gethostbyname() fails.
[07:07:55] --- reuteras has left
[07:08:24] I have submitted a patch to auth/cellconfig.c that implements the current WINNT/afsd/cm_config.c behavior within auth/cellconfig.c. This is ticket 124946. Comments would be appreciated. If there is opposition to making this change on Unix (because it would be a change in behavior) I can make the change conditional for AFS_NT40_ENV so that all of the tools on Windows behave consistently with what the cache manager does.
[07:11:39] huh.
i had in my head that the Unix clients did what the Windows clients did, that they ignored the ip unless looking up the hostname failed.
[07:26:21] --- stevenjenkins has become available
[07:35:28] --- cclausen has left
[07:44:14] No, the unix code has pretty much never used the hostname. Changing would be fine, but it is a behavior change and I'd be careful about doing it in 1.4, since it may cause CellServDB files that are currently working to break.
[07:47:04] Also, you need to be careful about ubik servers; it is important that the CellServDB files used by them use exactly the set of addresses given by the administrator. In particular, a multi-homed host may have more than one address in the DNS or hosts file, but ubik must not be initialized with more than one address for the same server. Otherwise you'll screw up quorum calculations or worse.
[08:47:50] do you believe that it is important that, for servers, the IP addresses listed in the CellServDB file be used explicitly? It is possible to test whether it is a client or server CellServDB path and process the file differently depending on the source.
[08:48:00] I wouldn't make this change to 1.4
[08:55:19] --- cclausen has become available
[09:01:58] the way the code is currently structured, if CellServDB looks like
>cell #a cell
x.y.z.1 #a-server.domain
[x.y.z.2] #a-server.domain
[x.y.z.3] #a-server.domain
ParseHostLine() will call gethostbyname("a-server.domain") three times and mark the response for the second two as a clone even though it is likely to return the same address.
[09:02:24] ParseHostLine() only returns a single address
[09:03:40] if multiple addresses are returned by gethostbyname() then all those other than the first are ignored.
[09:05:07] the way src/WINNT/afsd/cm_config.c is structured, a callback function is used for each address so it is possible to support more than one address being returned by gethostbyname(). Not that it does so at the moment.
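The lookup order described above (try the hostname first, fall back to the IP address written in the file) can be sketched roughly as follows. This is an illustration only, not the code in cm_config.c or cellconfig.c: the function names `parse_host_line` and `resolve_host_line` and the injectable `resolver` parameter are hypothetical, and only the first address returned by the resolver is used, as in the discussion.

```python
import socket


def parse_host_line(line):
    """Split a CellServDB host line into (address, hostname, is_clone).

    A bracketed address such as "[x.y.z.2]" is treated as a clone entry.
    """
    addr_part, _, comment = line.partition("#")
    addr = addr_part.strip()
    is_clone = addr.startswith("[") and addr.endswith("]")
    if is_clone:
        addr = addr[1:-1]
    return addr, comment.strip(), is_clone


def resolve_host_line(line, resolver=socket.gethostbyname):
    """Return (address, is_clone), preferring DNS over the listed IP.

    `resolver` is an assumption made for testability; it must raise
    OSError on failure, as socket.gethostbyname does.
    """
    addr, hostname, is_clone = parse_host_line(line)
    try:
        resolved = resolver(hostname) if hostname else addr
    except OSError:
        # Lookup failed: fall back to the address given in the file.
        resolved = addr
    return resolved, is_clone
```

Passing a resolver that always raises OSError reproduces the traditional Unix behavior of using only the addresses in the file.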
[09:21:59] --- Derrick Brashear has become available
[09:22:44] I believe it's important for _ubik_ servers, because the CellServDB is used to configure the ubik election process. For fileservers, which are really ubik clients but use the server-side CellServDB anyway, it's probably better in the long run to use DNS.
[09:24:38] It would be nice to someday not ignore multiple addresses. But that doesn't affect the need for ubik to not do this, because you can't know that what you get back from DNS will be the right address, and I think it matters that everyone have the same address.
[09:25:25] It would become _less_ important with proper multiple-address support, where we know that multiple addresses came from the same line, or if we were to identify ubik servers some other way. Enhanced ubik configuration would help with a lot of this.
[09:32:02] I'm adding the multiple address support to Windows now because it is easiest. I will figure out a new interface for cellconfig.c to implement that can be used by all, can support multiple addresses, can process clones, and permit a "ubik" flag that specifies that only the IP addresses from the CellServDB file should be used.
[09:32:43] does windows pare stuff beyond the first string of [a-z.] after the # on host name lines?
[09:33:02] if not, extensions can be done easily (and i have previously proposed a few)
[09:33:46] "com"pare. sorry, xterm lost focus
[09:33:48] it skips space and tab
[09:34:07] and then uses the rest as a host name
[09:34:15] sure, but does 10.0.0.1 #foo.com hfgkdhfgkhdfghdfh do the right thing?
[09:35:01] it will result in a hostname of "foo.com hgakdkaksdkdk"
[09:35:23] no spaces in hostnames, so what will it really result in?
[09:35:47] that is what will be passed to gethostbyname()
[09:36:03] I suspect gethostbyname() will barf and the IP address that is specified will be used
[09:37:51] but that is a good point and I will add a check for that
[09:39:38] well, if "the right thing" happens now, we could use that to our advantage
[09:39:55] failing that, a new format cellservdb with a new name could be used, falling back to an old one.
[09:40:06] what happens if someone does 1.2.3.4 #1.2.3.4 b/c they don't have DNS ?
[09:40:28] gethostbyname on an IP should work fine
[09:50:56] > Also, you need to be careful about ubik servers; it is important that the CellServDB files used by them use exactly the set of addresses given by the administrator.
[09:51:08] this gets ugly when you have a ubik server behind a nat wrt the other hosts
[10:10:13] --- dev-zero@jabber.org has become available
[10:10:17] --- dev-zero@jabber.org has left: offline
[10:15:08] we need elections to be based on a uuid of the server and not the ip address
[10:15:44] well, yes. but that's orthogonal to what's in cellservdb
[10:15:50] I know
[10:16:18] but if uuids were used, then the content of cellservdb becomes much less critical
[10:16:41] at least from the perspective of it being identical everywhere
[10:20:35] uuids++
[10:24:49] --- Derrick Brashear has left
[10:34:04] --- bpoliakoff has become available
[10:35:21] --- Jeffrey Altman has left
[10:46:12] --- Jeffrey Altman has become available
[10:59:34] one real problem with the way that cellconfig.c currently works is that it constructs a table of all known cells in memory with their addresses and then performs the lookups. If we query gethostbyname() while parsing the file, we query dns for every host name entry.
[10:59:43] that approach is definitely going to be the wrong one
[11:35:28] > windows pare stuff beyond the first string of [a-z.] after the #
I'm not sure that matters. I think we've seen things in the past that interpreted the hostname part.
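The check mentioned at 09:37:51 (do not hand "foo.com hfgkdhfgkhdfghdfh" to gethostbyname() as one name) could look something like the sketch below. It is an assumption about one possible approach, not the patch that was written: the name `split_host_comment` and the accepted character set are hypothetical. It takes the first whitespace-delimited token after the `#` as the hostname and keeps the remainder separately, which is also where format extensions could live.

```python
import re

# Assumed character set for a hostname token: letters, digits, dots, hyphens.
_HOSTNAME_RE = re.compile(r"^[A-Za-z0-9.-]+$")


def split_host_comment(comment):
    """Return (hostname, extra) from the text following '#' on a host line.

    hostname is None when the first token does not look like a hostname,
    in which case a caller would use the IP address from the file instead.
    """
    fields = comment.strip(" \t").split(None, 1)
    if not fields:
        return None, ""
    hostname = fields[0]
    extra = fields[1] if len(fields) > 1 else ""
    if not _HOSTNAME_RE.match(hostname):
        return None, extra
    return hostname, extra
```

With this split, `10.0.0.1 #foo.com hfgkdhfgkhdfghdfh` yields the hostname `foo.com`, and `1.2.3.4 #1.2.3.4` still passes the dotted address through as the name to look up.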
[11:36:46] --- Russ has become available
[11:36:56] --- dev-zero@jabber.org has become available
[11:37:47] > content of cellservdb becomes much less critical
Only sort of. The problem is, you can't expect admins to tell you the UUIDs, and you can't discover them if the named server isn't up. So, you need to be able to infer the number of voters from what the admin gives you, even if no other servers are up.
[11:38:50] --- dev-zero@jabber.org has left: offline
[12:49:11] if the votes contained the uuid of the voter, duplicates can be filtered out
[12:52:35] if you have 2 UUIDs up and 3 addresses down, you don't know if the 3 addrs down are all the same voter or not
[12:57:11] did you ever hear from them before?
[13:06:39] Not if the admin just added them to the CellServDB, and is restarting this server before bringing up the new one in order to avoid ever transitioning through an invalid state.
[13:07:52] (you can add a server that's not up, because not all the servers being up is a valid state as long as all of the servers that are up agree on what the servers are. But it is invalid to bring up a server that is not in the configuration of all existing servers, and can cause problems. So BCP is to always add the new server to the existing servers' configuration before bringing it up for the first time)
[13:08:04] shutdown, add, restart?
[13:09:07] yes, shutdown, add, restart, then bring up the new server. because, see, you also don't want service outages
[13:10:58] So, traditionally, "frontend" is the thing you interact with that is built on an abstraction, and "backend" is the part that maps the abstraction to something concrete.
[13:13:18] sorry, i thought you meant change config, shutdown, restart
[13:13:30] whereas i was suggesting inverting order for first 2
[13:15:34] Oh, either would work. Changing the config takes effect on the restart; whether you do it before or after shutdown affects only how long the server is down.
In either case, upon restart the config contains a server you've never heard of. Of course, this also comes up when bootstrapping a new cell.
[13:18:07] what we need is some indication in the cellservdb that states which addresses belong to the same server regardless of whether or not they are reachable
[13:18:35] and whether or not they are addresses that are known to the server they belong to
[13:18:54] jhutz: eh, "frontend" had main; I wasn't really trying to make sure I was using them right
[13:20:09] that could be the "dns" hostname in the cellservdb if they are all the same
[13:32:59] --- bpoliakoff has left
[14:01:32] > what we need...
Is configuration for Ubik other than CellServDB, so we can use CellServDB for advertising dbservers to fileservers and clients, period.
[14:03:03] what would be different about this ubik config file?
[14:03:58] uuids, relationships of servers.
[14:07:27] is this also involving a configurable half-vote, instead of the lowest ip?
[14:07:45] you'd have to checksum the config to ensure everyone agreed where that vote was
[14:10:51] Not just configurable half-vote, but configurable numbers of votes for each server, and configurable priority for deciding who to vote for.
[14:11:32] You'd have to do something to verify people agreed on the config. And in fact, it couldn't be a simple checksum, because to make some kinds of transitions feasible, there are some kinds of differences you'd just have to allow.
[14:12:47] I'd figure you wouldn't want uuids in a place where an admin could try and set them manually
[14:12:51] Oh, and we might actually want to introduce a third mode. Right now we have full servers and non-voting servers. I want to introduce a server type that can vote but cannot run.
[14:13:33] I'm not sure you'd use UUIDs, rather than some other kind of identifier that admins can actually use. If you do use UUIDs, I think you'd want to store them not in the config file.
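The idea floated at 13:18:07 and 13:20:09, using the hostname on each CellServDB line as the indication that several addresses belong to one server, might be sketched as below. Everything here is hypothetical (the `group_by_server` name, the `(address, hostname)` input shape, case-insensitive matching); it only shows how the number of distinct voters could be inferred from the file without any server being reachable.

```python
from collections import defaultdict


def group_by_server(entries):
    """Group (address, hostname) pairs by hostname.

    Entries with no hostname are treated as separate servers keyed by
    their own address, since nothing ties them to another line.
    """
    servers = defaultdict(list)
    for addr, hostname in entries:
        key = hostname.lower() if hostname else addr
        servers[key].append(addr)
    return dict(servers)


def count_voters(entries):
    """Number of distinct servers implied by the entries."""
    return len(group_by_server(entries))
```

For example, two lines naming `db1.example` with different addresses plus one line for `db2.example` count as two voters rather than three.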
[14:50:58] I think the third mode would be quite useful.
[14:58:49] the interesting thing is (1) A server cannot ever become coordinator if the candidates with better priority collectively hold a majority of the votes. Thus, any server with this property (e.g. in the current system, the third full server in a cell with three, or the fourth and fifth in a cell with 5) can be treated as if it had the new non-candidate-voter role.
[14:59:09] (2) It is not necessary for non-candidates to agree on the number or distribution of votes.
[15:00:16] are votes signed?
[15:00:19] The reason for (2) is that only candidates ever make determinations about who won the election, and a voter must always cast all of its votes for the same candidate. So as long as all the candidates agree on who has how many votes, you will get a valid result.
[15:00:26] or authenticated or whatever it's called?
[15:02:16] --- Jeffrey Altman has left: Disconnected
[15:02:18] Unfortunately, no. A "yes" vote is represented as a positive return value from an RPC, which is carried in an (unauthenticated) abort packet
[15:02:28] That is something we'd like to fix.
[15:02:35] so can't I fake out ubik and cause a DoS ?
[15:02:45] b/c that sounds like a big security problem
[15:03:04] or are votes ACKed from servers in the server's CellServDB file?
[15:04:10] and can't the KeyFile be used to sign / encrypt the votes? Or is this a protocol limitation with how ubik votes?
[15:05:05] As I said, a "yes" vote is a positive return value from an RPC. The Rx protocol carries non-zero return values in abort packets, which are not authenticated.
[15:05:50] so no "abort packets" are ever authenticated?
[15:05:59] or just not in this case?
[15:06:15] (your answer would probably make more sense if I actually knew how rx works)
[15:06:15] However, you cannot just randomly cast a vote; you can only cast a vote in response to a candidate making a VOTE_Beacon() RPC to you.
The RPC is authenticated, so constructing a reply with the correct call number is going to be tricky, though not impossible, and you cannot vote "no"
[15:06:31] ah, ok
[15:06:49] so I can't just randomly send packets to servers in the CellServDB
[15:06:49] Correct. abort packets are never authenticated. Calls, and the output of successful calls, are authenticated. Failed calls return an error code, and that is not authenticated.
[15:08:12] I assume there isn't a way to authenticate errors, b/c one of the errors could be "unable to authenticate"
[15:09:39] That's a little simplistic, but yes. However, we could rev VOTE_Beacon() to move the vote response into an OUT parameter and have the call always succeed, in which case the response would be protected.
[15:12:25] would be incompatible with existing servers, correct?
[15:12:51] the new RPCs would be
[15:13:20] an attacker could always fake the old method, correct?
[15:13:39] or would there be a way to actually not use the less secure method?
[15:14:02] No, an attacker could not fake the old method.
[15:14:42] an attacker does not get to decide what call to make. You don't vote by making an RPC. You vote by returning a result from an RPC the candidate makes to you
[15:15:41] do you authenticate the candidate?
[15:16:10] I could pretend to be the candidate and make a fake request, right? Oh, I guess that doesn't affect anything at all though
[15:16:15] never mind. not thinking enough
[15:16:25] --- Jeffrey Altman has become available
[15:23:46] No, you cannot pretend to be the candidate.
[15:23:54] Not successfully, anyway.
[15:24:11] and yes, if you could, you could do bad things, like claim you were already sync site
[15:37:30] --- Simon Wilkinson has become available
[15:38:08] --- Simon Wilkinson has left
[16:00:57] --- deason has left
[16:01:08] --- deason has become available
[16:46:01] --- mdionne has become available
[16:53:38] please review /afs/athena.mit.edu/user/j/a/jaltman/Public/OpenAFS/cellconfig-gethostbyname-1.patch
[18:35:52] --- mdionne has left
[18:38:50] --- mdionne has become available
[18:53:02] --- mdionne has left
[19:14:53] --- Jeffrey Altman has left
[19:34:07] --- Jeffrey Altman has become available
[20:36:48] --- cclausen has left
[21:35:31] --- cclausen has become available
[22:15:51] --- deason has left
[22:48:05] --- Jeffrey Altman has left
[22:49:35] --- dev-zero@jabber.org has become available
[22:49:37] --- dev-zero@jabber.org has left: offline
[22:59:00] --- Jeffrey Altman has become available
[23:05:37] --- reuteras has become available
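Point (1) from the 14:58:49 message (a server can never become coordinator if the candidates with better priority collectively hold a majority of the votes) reduces to a small piece of arithmetic, sketched here. The function name `can_become_coordinator` and the input shape are hypothetical, votes are assumed to be whole numbers, and the existing extra half-vote for the lowest address is not modeled.

```python
def can_become_coordinator(servers):
    """Map each server name to whether it could ever win an election.

    `servers` is a list of (name, votes) ordered from best priority to
    worst. A server is ruled out when the servers ahead of it together
    hold strictly more than half of all votes.
    """
    total = sum(votes for _, votes in servers)
    result = {}
    better = 0  # votes held by servers with better priority
    for name, votes in servers:
        result[name] = not (2 * better > total)
        better += votes
    return result
```

With one vote each, this rules out the third server in a cell of three and the fourth and fifth in a cell of five, matching the examples given in the discussion; those are the servers that could be treated as non-candidate voters.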