[00:09:11] --- Derrick Brashear has left [01:27:54] --- Derrick Brashear has become available [01:27:58] first session [01:28:03] generic quota mechanism [01:28:05] --- Simon Wilkinson has become available [01:28:17] christof: at the moment we have only minquota/maxquota [01:28:49] perhaps we should create an array of tag-value pairs of elements [01:28:54] derrick: why fixed array? [01:29:06] christof: could be not fixed array [01:29:21] christof: want to track boundary and actual state [01:29:31] simon: want to generalize for more than osd [01:29:42] christof: can bind the tags to anything [01:29:57] simon: current scheme can be implemented in this [01:30:20] do tags need to be shipped with afs? [01:30:26] jeff: you need an enumeration rpc [01:30:59] simon: does it just return a human readable string? [01:31:08] jeff: localization desires [01:31:37] simon: would just list supported tags for that server or that volume? [01:31:39] jeff: yes [01:32:22] aside: http://www.ietf.org/id/draft-benjamin-extendedcallbackinfo-00.txt [01:32:39] jeff: this scheme gives clients unaware of tags the ability to dump raw data [01:32:47] simon->christof: is this an action item for you? [01:32:51] christof: ok. [01:32:59] simon: (summarizes the path forward) [01:35:10] simon: gut feeling to ignore the possibility that we need to move to 128bit ints when we do this [01:37:42] christof: partition usage also? [01:37:51] simon: i think this is more quota-focused [01:38:09] (zfs discussion) [01:40:14] simon: feeling as to whether this goes into FetchStatus? [01:40:31] derrick: if it can represent a potentially-boundless set of values, it's a new rpc [01:41:48] hartmut: question of whether a versioned union could be used in new rpcs [01:42:05] simon corrects: FetchVolumeStatus [01:42:19] so we should just rev that [01:42:28] and the volserver-similar struct [01:43:25] simon: VolGetStatus has to change. 
(ali may also care) [01:44:27] hartmut: volintinfo needs to then be able to encapsulate this data so it can move with them [01:45:00] then dumps need to have a new tag which encapsulates the array [01:45:20] tom: listmultivolumes rpc will be done for dafs; would be good to get everything into a new rpc [01:45:39] simon: old client and old rpc needs the information it can have, loses what it can't [01:46:08] ultimately informational, not used by cache manager [01:46:21] simon: tom, suggesting just rev'ing existing rpcs? [01:46:27] tom: yes. avoid more round trips [01:46:54] simon: should we discard the existing fields in the new rpc? [01:47:05] derrick: new rpc users know how to decode; yes [01:48:10] simon: do you even need enumerate tags? [01:48:18] or does the fileserver just not let you set things? [01:49:14] jeff: what happens with a single unknown tag in an array? [01:49:20] simon: reject whole rpc [01:49:26] jeff: how do you know which failed? [01:49:33] simon: maybe we do need to enumerate [01:50:20] question of max tag issue won't fix [01:50:24] we need enumerate tags [01:50:30] (maybe?) [01:51:12] simon: to provide meaningful error messages, need enumerate tags rpc [01:51:27] tom: looking for 3 tags [01:52:36] implementation type (for whole server) [01:53:27] current volume state (raw, mapped (online, offline, preonline, busy, salvaging)) [01:54:00] simon: are the values int32? [01:57:06] tom: ideally this belongs in the volint interface simon: seems reasonable except implementation type [01:57:34] simon: implement something in addition to simple capabilities bitmap? [01:58:08] simon: use a capability for now, and if we need to revisit, we can [01:58:58] tom: that's all we need [02:00:46] simon: tom, will you write that? [02:00:48] tom: yes [02:01:16] simon: do it as part of the rpc refresh or re-rev rpcs after? [02:01:47] derrick: can we do this first? [02:01:53] simon: just don't want to block rpc refresh on this [02:02:17] tom: i-d form? 
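(aside: a minimal sketch of the tag-value idea from the quota discussion above. It assumes the behaviour floated in the room: the server rejects a whole set request if any tag is unknown, and an enumerate call tells the client which tags are supported so it can report which one failed. The tag numbers and the names SUPPORTED_TAGS, enumerate_tags and set_values are hypothetical, not any existing AFS RPC.)

```python
# Hypothetical tag numbering; the real assignment would come from the draft.
SUPPORTED_TAGS = {1: "minquota", 2: "maxquota"}


def enumerate_tags():
    """Return the tags this (hypothetical) server understands."""
    return sorted(SUPPORTED_TAGS)


def set_values(store, pairs):
    """Apply a list of (tag, value) pairs, rejecting the whole request
    if any tag is unknown, as suggested in the discussion."""
    unknown = [tag for tag, _ in pairs if tag not in SUPPORTED_TAGS]
    if unknown:
        # Nothing is applied; the caller can compare against
        # enumerate_tags() to produce a meaningful error message.
        raise ValueError(f"unsupported tags: {unknown}")
    store.update(pairs)
```

(a client that does not know a tag can still dump the raw (tag, value) pairs, which is the property jeff pointed out.)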
[02:02:19] jeff: yes [02:04:58] metadiscussion: new drafts to ietf should be draft-(individual)-afs3-(whatever's a draft) [02:06:35] moving on [02:06:40] rtt calculations [02:09:22] jeff: currently rx does rtt calculation using van jacobson's and phil karn's work [02:09:30] currently the values generated are inaccurate [02:09:32] 2 issues [02:09:39] 1) selection of packets as input is flawed [02:09:51] should exclude retransmits [02:10:10] (as documented by phil karn in 1987; we don't know which of the packets resulted in the reply) [02:10:47] currently we assume from the first packet, so the rtt could potentially be the real rtt+the retransmit interval [02:11:13] 2) not actually including all packets we could gather an rtt over, we filter out a large number for various reasons, so our measurement set is poor [02:11:25] we end up having skewed unrepresentative changes [02:11:44] side effect: very few retransmits, fewer than desirable to maintain performance [02:12:04] we hand out data, then block waiting far longer than we should rather than filling the pipe [02:12:47] operational experience shows more retransmits, a larger sample set and better performance [02:13:00] jake's work was 2 patches [02:13:19] 1) a way to export rx peer structures via api, previously only via rxdebug [02:13:26] which is integrated [02:13:46] 2) he has a later patch which uses the rtt values as an input to compute server rankings [02:13:56] this cannot be usefully used without the rtt fix [02:14:52] (jeff gives background of how rankings were previously calculated) [02:15:48] jake's cm implementation uses a background thread to periodically recompute server rankings [02:22:36] jeff: algorithms there are rough, they could use performance analysis and we should look at other ways to process the data [02:22:48] jake is trying to get an undergrad research grant to continue this [02:25:24] Derrick: Window size is easy. 
Nothing to negotiate [02:25:35] Hartmut, Jeff and Derrick discussed earlier [02:25:48] If you support a larger window size, and the other side supports it, it just works. [02:25:58] If the other side doesn't support it, then it will just throttle you. [02:26:17] hartmut: Window size is only one byte. Can't go any larger than that. [02:26:28] derrick: Is this a problem? [02:26:34] tom: 1gigbit will make you sad [02:26:44] hartmut: Problem is WAN with high RTT [02:27:01] (actually tom said transcontinental 100gig would make you sad) [02:29:05] hartmut: On a WAN, it would be nice to be able to have higher window sizes [02:29:24] derrick: Is it controversial to say we should push the window size as high as we can with our current implementation [02:29:43] derrick: We probably need to think a lot before we start considering reving the RX header [02:29:53] jeff: agrees, but uses more words to say so [02:30:25] derrick: OpenAFS should push it to at least 128, or to whatever the actual limit ends up being - 254, providing there's not an issue with it being a power of 2 [02:30:37] jeff: we need to test in an environment with delays in it [02:30:53] derrick: Will take as an action item that he will test this out, and move it forwards [02:31:04] Moving on ... [02:31:46] derrick: RX already has negotiation by using the ping/ack payloads [02:32:04] size is used to determine how to decode it [02:32:14] can add more options by changing the packet size [02:33:06] What things do we want to negotiate in RX, and what is the fallback procedure so that we don't throw away data [02:33:40] Providing we keep the original start to the payload, old clients will keep working, even with an extended packet size [02:34:07] If we want to do more calls, we also need to rev rxkad [02:34:51] New challenge has to be existing challenge with more stuff at the end [02:35:44] Derrick: Not sure what the path forwards is, because this isn't obviously AFS3 protocol, as it's RX. 
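(aside: a minimal sketch of the size-based negotiation Derrick describes above, where the payload length decides how much gets decoded and the original start of the payload is never changed. The option names and their order are illustrative assumptions, not the rx wire format; decode_trailer is a hypothetical helper.)

```python
import struct

# Illustrative option order only; each option is one big-endian 32-bit word.
KNOWN_OPTIONS = ("max_mtu", "if_mtu", "max_window", "max_packets")


def decode_trailer(payload: bytes) -> dict:
    """Decode as many known options as the payload length covers.

    A short payload from an old peer yields fewer options; a longer
    payload from a newer peer has its extra trailing words ignored.
    """
    count = min(len(payload) // 4, len(KNOWN_OPTIONS))
    values = struct.unpack(f">{count}I", payload[: 4 * count])
    return dict(zip(KNOWN_OPTIONS, values))
```

(because new options are only ever appended, an old decoder keeps working on an extended payload, which is the fallback property asked for.)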
[02:36:21] Simon: Thinks we should do this on the afs3-stds list [02:36:48] Jeff: Thinks we should directly approach people at Universities with a history of RX use/development to review [02:37:10] Derrick: Wonders if we will find people [02:37:57] Matt: Should talk about this on afs3-stds, because this is where we should talk about stds things [02:38:06] (discussion about where we discuss RX) [02:40:06] simon: afs3-stds is the best place for common discussion; private discussion may also take place or those others may choose to comment to the list [02:40:37] Derrick: Should there be a draft fall out of this? [02:40:42] Jeffrey: Yes [02:41:05] e.g. draft-brashear-rx-call-option-negotiation [02:41:42] Derrick will go away and write this up [02:41:55] derrick: Are there other things that we know that should be being negotiated? [02:42:07] matt: packet size? [02:43:07] (discussion of mtus) [02:43:37] Derrick: going to do path mtu discovery, but that doesn't cause protocol changes [02:45:04] Marcus worried about implications of path mtu discovery. [02:46:36] (discussion about avoiding it in cases where client is just doing short lived connections) [02:47:13] Goal is to get the most out of rx/udp for long running services. It's a trade off. Don't make short lived connections worse, but not focussing on making them better. [02:47:37] Derrick: Another thing is delivering large payload rx packets, that aren't jumbograms [02:47:53] Need to test and confirm that implications do the right thing, but no protocol implications. [02:47:59] s/implications/implementations/ [02:48:20] Current rx library should support it, but not necessarily well. [02:48:44] Derrick is going to go off and do this. [02:49:06] ... but no protocol/documentation issues? [02:49:17] Room seems to think we just want to see the code. [02:49:48] rx/udp discussion is pretty much done. We came out ahead [02:56:56] Hartmut & Christof leave for plane ... 
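(aside: a minimal sketch of the RTT sampling rule from the rtt discussion earlier in this session. It assumes Jacobson-style smoothing with the conventional 1/8 and 1/4 gains and applies Karn's rule, so a reply to a packet that was retransmitted is never used as a sample. RttEstimator, on_ack and the 1 second initial timeout are hypothetical; this is not the rx code.)

```python
class RttEstimator:
    """Smoothed RTT that skips samples from retransmitted packets."""

    def __init__(self, initial_timeout=1.0):
        self.srtt = None        # smoothed round-trip time, seconds
        self.rttvar = None      # smoothed mean deviation, seconds
        self.initial_timeout = initial_timeout  # assumed default

    def on_ack(self, sent_at, acked_at, was_retransmitted):
        """Feed one acknowledgement; return True if it was used."""
        # Karn's rule: we cannot tell which transmission the reply
        # answers, so a retransmitted packet yields no sample.
        if was_retransmitted:
            return False
        sample = acked_at - sent_at
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample)
            self.srtt = 0.875 * self.srtt + 0.125 * sample
        return True

    def timeout(self):
        """Retransmit timeout derived from the current estimate."""
        if self.srtt is None:
            return self.initial_timeout
        return self.srtt + 4 * self.rttvar
```

(measuring from the first transmission of a retransmitted packet would add the retransmit interval to the sample, which is the skew jeff describes.)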
[03:14:24] next up: rxgk [03:15:26] simon: splits into 2 parts [03:15:33] negotiation/key establishment [03:15:36] data encryption [03:15:40] there's a key negotiation rpc [03:15:48] --- tkeiser has become available [03:16:04] via an unencrypted rx connection [03:16:35] in the rx specific part, it's defined as application-dependent who you negotiate with [03:16:45] in afs it's proposed to be against the vlserver [03:16:52] avoids upcalls, gss in the kernel [03:17:13] GssNegotiate RPC, this is slightly stateful as multiple round trips are possible [03:17:21] the opaque in/out tokens allow state [03:17:47] for security reasons they need to prevent hijacking another in-process connection, however the opaques are implementation defined [03:18:16] the client sends startparams: enctype, levels, lifetime, bytelife, nametag and a nonce [03:18:44] levels include the obvious ones, plus bind, which allows e.g. transport-specific security to be bound; not proposed to be implemented yet [03:18:58] s/transport/network protocol/ (e.g. 
ipsec) [03:19:36] after you've done enough round trips and you've finished negotiating, you get a clientinfo blob [03:19:57] errorcode, flags, enctype, level, lifetime, bytelife, expiration, a gss mic, ticket and the server nonce [03:20:23] the ticket is an opaque identifier [03:20:37] (previously was defined in the draft but it should be implementation-dependent) [03:20:52] an afs proposed ticket will be made but not part of this [03:21:05] by the time you get this info block, there's a gss context established [03:21:16] the mic provided is calculated over the provided start params [03:21:23] so no downgrade attack is possible [03:21:43] gsswrap encrypts this block that comes back, tying in the server nonce [03:22:01] server never gives the client a readable key [03:22:34] the prf negotiated from the sec context, uses the 2 nonces from each side and embeds the calculations for the prf into the ticket [03:22:53] uses gss pseudo random with the client nonce||server nonce [03:23:14] the K output length is the key gen seed length specified in 3961 [03:23:28] the gssapi layer is not tied to kerberos in spite of using 3961 for this table [03:23:44] in the simplest form you've negotiated a key to use for the rest of the session [03:24:06] ticket is used as part of security class and is used for establishment of the security class [03:24:17] tk, the transport key, is derived from the overall [03:24:27] key, using the prf+ operation [03:24:39] and the random-to-key operation [03:24:56] the client asserts a timestamp as the input to this key [03:25:04] the challenge is simply a version and a nonce [03:25:10] xdr-encoded, sent to client [03:25:28] response: version, start time, token, authenticator [03:26:03] authenticator: limited by rx max calls. 
[03:26:10] needs to be variable length [03:26:31] encrypted with transport key [03:26:39] every enc operation uses key derivation [03:26:48] one set of derivation values for every kind of operation we do [03:27:02] authenticator encrypted in the tk; the [03:27:17] nonce, epoch, cid, call numbers decrypted and if it matches a sec context is established [03:27:25] 3 security layers: [03:27:40] encryption. has pseudoheader: call, seq, data len, service id [03:27:48] encrypted with 3961-style enc function [03:27:55] 2 derivation values per direction [03:28:45] integrity adds the header to the mic generation, then ships without the header [03:29:17] (fire alarm) [03:29:49] auth-only sticks payload straight onto the wire [03:29:57] that's the simplest way to run it [03:30:06] more complex mechanism avoids cache poisoning [03:30:13] can also solve migration problems [03:30:19] in this mode: we have a token [03:30:34] the traffic should be authenticated as coming from both user and cache manager [03:31:11] the attack this precludes is a user spoofing the server to the cache manager in order to inject data into the cache of a client, which will then be executed by the system or another user [03:32:20] combinetokens allows a keyed cache manager to have input to the token such that both the user and the cache manager have proven they are involved [03:32:35] combinetokens takes 2 tokens, combines, gives you one. 
implementation-defined [03:32:50] actual openafs implementation will be more complicated [03:33:22] token0 and token1 are used to get key0 and key1, and then key combination is used to create keyN to return in tokenN [03:33:34] in the afs case it includes the user and the cm identity in the token [03:33:55] key combination algorithm is from the kerberos working group [03:34:17] KRB-FX-CF2, includes 2 pepper strings, used similarly but not quite like salts [03:34:22] not quite hmacs [03:34:43] outsource security implications to krb wg [03:35:09] some are keen to keep rx part of the description away from "how it works with afs"; this document splits into those parts [03:35:21] with afs, combinetokens will be more complex [03:35:31] we want to allow for "departmental fileservers" and mixed cells [03:35:46] (kad, rxk5 and rxgk servers deployed) [03:36:36] for afs, the extended CombineTokens RPC includes a target host and service: it will give you the right kind of token for the desired service, maybe includes the afs global key, maybe server specific, and maybe none, in which case you use another mechanism [03:36:41] questions? [03:37:01] elizabeth: is this at the volume level? [03:37:13] jeff: at the server (connection) level; not specific to afs [03:37:31] simon: i will implement the non-afs-specific version of combinetokens, but afs will not use it [03:38:22] jeff: you get the same crap as yesterday [03:38:34] how do you solve the first packet problem? use the solution from yesterday [03:38:42] same for binding the client uuid to the authenticator [03:38:48] simon: can't be a uuid [03:38:58] application-opaque [03:39:07] there may be api issues with that [03:39:39] jeff: implementation note for afs: we will need to extend vl_registerrpc so types of authentication can be registered in order for the afs combinetokens to be useful [03:40:07] simon: we also need for dept fileservers a repository of private keys [03:40:12] derrick: so vlserver is a kdc? 
[03:40:22] marcus: it's the logical place to put it [03:40:32] simon: separate data store [03:40:53] marcus: you could use the ubik "table-like" feature [03:41:11] simon: we'll come back to departmental fileservers as 2nd phase of implementation [03:41:41] derrick: use a second vlserver "rx service"? [03:41:57] marcus: reuse vlserver connections? [03:42:11] simon: combinetokens probably needs an rxgk-protected connection [03:42:20] protected with the single token as a result of negotiate [03:43:12] side discussion of connection overhead [03:43:35] simon: server combinetokens can be overloaded/DoS'd by combinetokens in the clear [03:43:42] rxgk protection removes that [03:43:55] marcus: what if combinetokens was done on the client? [03:44:02] simon: client can't decrypt [03:44:07] it's opaque [03:44:16] --- tkeiser has left [03:44:20] new token is not simply result of smashing 2 tokens together [03:44:39] --- tkeiser has become available [03:46:21] side discussion of what is known about the token by whom [03:47:12] jeff: said in talk, missing in document [03:47:32] simon: the thing i said for the afs draft will be in the afs draft [03:47:35] not this [03:47:49] jeff: bytelife [03:48:12] it's advisory which is the right way to go in jeff's opinion [03:48:19] server can issue a challenge whenever it wishes [03:48:24] rxkad dtrt already [03:48:34] need to define a mech to request a challenge from a client [03:49:06] jeff: derrick, can it be done in a ping payload? [03:49:24] derrick: i think that's difficult as far as protecting payload [03:49:33] simon: should be application-specific [03:49:43] for example can be done now by establishing a new connection [03:50:03] jeff: the client as part of a response could be send me a challenge every (10mb? whatever) [03:50:18] simon: rxk5 wouldn't need this but in general it's more global in scope than rxgk [03:50:36] jeff: could include this bytelife the security layer data header? 
[03:50:42] simon: bytelife negotiated [03:51:11] derrick: just issue a challenge that often? [03:51:24] simon: use the shortest byte life [03:51:34] marcus: disagree [03:51:44] jeff: even if the fallback is breaking connections? [03:51:55] marcus: what happens if 2 calls happen at the same time? [03:52:08] and the server bytelife is exceeded [03:52:13] and there are outstanding calls [03:52:34] simon: bytelife is soft, won't interrupt in-flight calls [03:53:01] marcus: you'll have to come up with an exact number [03:53:02] --- tkeiser has left [03:53:18] --- tkeiser has become available [03:53:45] simon: it's an advisory request [03:54:01] matt: server should enforce bytelife [03:54:18] simon: next new call would be on a new connection to solve this [03:55:44] marcus: does a challenge cause a new transport key to be selected? [03:55:46] simon: yes [03:55:56] marcus: how do you handle in-flight data? [03:56:02] simon: track which key per call [03:56:30] tom: what if it's an enormous call [03:56:47] simon: no good answer [03:56:55] tom: time isn't the best thing either [03:57:03] simon: no. it comes from the gss layer though [03:58:00] marcus: skew time is important to calculate the start time [03:58:10] a one second in the future ticket was a problem [03:58:20] simon: gss layer handles time, should we honour expiration or not? [03:58:22] bytelife is harder [03:58:55] only choice for bytelife is to make it optional and make it the client's problem [03:59:21] when the client starts a new call it could start a new connection [03:59:58] tom: version ordinal key operations for connection? 
[04:00:28] simon: if we add a crypto header, part could be a key index [04:01:04] tom: rx could be used for an async rep mechanism, not efficient to shut it down [04:01:15] marcus: bytelife has 3 components [04:01:22] when client should think about getting a new key [04:01:26] when server should issue a challenge [04:01:38] when server should stop allowing valid key [04:02:00] jeff: what if the challenge could simply be replied to again? [04:02:03] --- matt has become available [04:02:12] derrick: add an epoch to the challenge response? [04:02:17] simon: you'd not get a new key [04:02:41] jeff: we don't need to; just say the next call to pseudo random is the next key [04:02:57] when a new packet arrives with a new key id, you call that and you're using a new key [04:03:06] simon: server calls and gets new key [04:03:11] jeff: client mech is trivial [04:03:28] simon: packet with key 2 arriving before a packet with key1, hold on to it [04:03:32] jeff: for a window size [04:03:42] simon: we already need to hold packets for decrypt-in-order [04:03:48] --- tkeiser has left [04:04:07] --- tkeiser has become available [04:04:14] jeff: in the rxgk response, make it a 64 bit time with 100ns granularity [04:04:17] same as yesterday [04:04:25] (rxgk response) [04:04:41] probably the same things for lifetime and bytelife and expiration time [04:05:16] simon: bytelife becomes a 32 bit log2 of the number of octets [04:06:35] jeff: inconsistent descriptions of security levels in the document between 7.6 and the place where 4 are listed [04:08:06] (jeff: explains bind layer; if we don't do crypto over tls we are vulnerable to mitm) [04:08:14] jeff: ivecs? [04:08:16] simon: no [04:09:20] simon: jeff suggested making ivec be pseudo header; i prefer to just have them be decrypted in order [04:09:26] marcus: is sequence number per call? [04:09:28] jeff: yes [04:09:49] marcus: how are the channels ordered? how do you know what order? 
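(aside: a minimal sketch of simon's "bytelife becomes a 32 bit log2 of the number of octets" from the exchange above. The rounding direction is an assumption: this version rounds down, so the advertised limit is never larger than the requested one. encode_bytelife and decode_bytelife are hypothetical names, not taken from the rxgk draft.)

```python
def encode_bytelife(octets: int) -> int:
    """Return floor(log2(octets)) for a positive octet count."""
    if octets < 1:
        raise ValueError("bytelife must be at least one octet")
    # bit_length() - 1 is floor(log2(n)) for any positive integer n.
    return octets.bit_length() - 1


def decode_bytelife(exponent: int) -> int:
    """Return the octet count an encoded exponent stands for."""
    return 1 << exponent
```

(for example, a requested limit of 10 MB encodes as exponent 23, which decodes to 8388608 octets.)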
[04:09:56] simon: maybe we need an ivec [04:10:10] jeff: with window size growing, is queueing packets going to eat memory? [04:10:15] simon: maybe we need an ivec [04:10:21] marcus: chaining them together? [04:11:20] discussion of whether we need per-channel keys [04:12:07] simon: we won't do chaining, either use ivec of 0 with a safe crypto system or use pseudoheader as ivec [04:12:11] jeff: the latter is safer [04:12:19] kerberos likes ivec 0 but why? [04:12:27] marcus: key as ivec burned kerberos [04:12:56] jeff: peer sent mic, needs to mention padding and length of the output [04:13:06] simon: you already know it, it's a property of the encryption type [04:13:07] jeff: ok [04:13:14] jeff; acknowledgements are wrong [04:13:18] simon: i know i need to fix it [04:13:43] marcus: link up usage and [04:13:54] jeff: wait, can you add marcus' diagrams? [04:14:00] matt: can add pdfs [04:14:05] simon: text is canonical [04:14:15] marcus: i admire your ascii format but... [04:14:19] jeff: i have this tool jave [04:14:24] simon: lunch! [04:14:34] marcus: you define seclevels, etc [04:14:44] the numbers you pick are not rxkad [04:14:59] i discovered i needed to map rxkad and my own levels. fix yours too [04:15:01] simon: ok [04:15:10] (make them match rxkad to start with) [04:15:19] marcus: byte limit questions already addressed [04:15:28] will verify revised wording [04:15:40] reservations about combine tokens [04:15:51] derrick: generic, afs or both? [04:15:52] marcus: yes [04:15:58] want a local-only version [04:16:01] even if server only [04:16:05] describe in more detail [04:16:11] simon: what do you need? [04:16:16] marcus: what's in the token? [04:16:27] simon: implementation-dependent, will go in the afs document [04:16:37] needs to be written up in more detail [04:16:41] extension also need to be defined [04:17:07] marcus: start time could be chosen randomly [04:17:15] how is it different from client nonce? 
[04:17:23] simon: it's just another nonce, used elsewhere [04:17:40] marcus: your authenticator defn came from the afs3.0 spec? [04:17:43] simon: not sure [04:17:49] marcus: read the citi paper about this [04:18:00] simon: i suspect it came from arla [04:18:16] will look at it [04:18:44] marcus: define usage; you use key to check validity of response. say what key,response [04:19:04] pseudo-header. is this part of the payload? [04:19:23] simon: only when encrypting; otherwise not when only integrity [04:19:34] only point it's shipped it's shipped encrypted [04:19:44] jeff: other payload was outside security classes [04:19:55] marcus: just make sure the first packet coverage is mentioned here [04:20:16] the encrypted header call/seq/etc matches data you can calculate other [04:20:31] you don't need this to prove that data [04:20:37] simon: you could just use a checksum [04:20:57] marcus: you don't need this so it's not really a pseudoheader [04:21:28] simon: 3961 says it's hard to include something in the checksum that's not then in the encrypted payload [04:21:47] jeff: can't do iv-based enc/dec where the pseudoheader and the buffer are provided [04:21:57] simon: other option: include just a cksum of the pseudoheader [04:22:06] most checksums are larger than this 96 bytes [04:22:28] marcus: i don't like it anyway; i think maybe avoid using the 3961 routine directly? 
[04:22:32] --- tkeiser has left [04:22:41] simon: prefer using 3961 for the standards benefits [04:23:00] --- tkeiser has become available [04:23:09] marcus: love suggested i use as crypto in rxk5 in smaller chunks, and that api would lend itself to this [04:23:22] simon: esp with integrity we'd get benefits with in-place work [04:23:30] but it's not limited to this [04:24:14] simon: love's general point was don't use 3961 at all because he dislikes the crypto implementations but it's a huge win to not do it yourself [04:24:24] we don't need to do work; we take the ietf's work [04:24:55] jeff: can you (marcus) give this feedback to ietf that their framework caused you (these) problems when it was used outside kerberos? [04:25:19] marcus: i haven't done so yet [04:26:57] marcus: is the pseudoheader being used to generate the ivec now? [04:27:06] simon: previously no. now, not sure, need to revisit [04:27:51] simon: use rxk5 pseudoheader? [04:27:54] marcus: sure [04:28:56] simon: clear is a bad idea [04:29:01] marcus: i compile with it in rxk5 [04:30:07] jeff: nrl wants auth clear [04:30:32] i think rxk5 and rxgk should default to minlevel to auth and let you hurt yourself if you want less [04:32:20] simon: crypto will bite on poor-performance hardware [04:34:26] marcus: an afs integration paper? [04:34:37] simon: next month [04:34:52] marcus: how long until there's sample code? [04:35:02] esp with combine tokens working [04:35:17] simon: running code within 6 months [04:35:40] complete implementation not integrated within 12 [04:35:59] integration? dunno [04:36:05] jeff: depends on the standards process [04:36:19] simon: we'll see what standardization process does [04:37:01] marcus: how hard to get this working in the cache manager? 
[04:37:12] simon: dead simple because gssapi done at aklog time [04:37:30] jeff: using rxk5 settoken mech [04:38:12] simon: we're not doing gss negotiate against every fileserver to avoid needing to pull every gss backend into the kernel [04:38:21] upcalls are hard [04:42:26] --- matt has left [04:42:59] --- Derrick Brashear has left [04:44:28] --- Simon Wilkinson has left [04:46:07] --- tkeiser has left [06:33:45] --- matt has become available [06:34:05] --- Simon Wilkinson has become available [06:34:12] miniosi [06:34:20] How is it different from libosi? [06:34:49] Disabled some macros, stubbing out osi_trace [06:34:51] --- Derrick Brashear has become available [06:35:48] Major subsystems are: [06:36:24] build environment [06:36:55] (pthreads, lwp, kernel; datamodel; standard portable types) [06:37:02] also compiler detection [06:37:45] PLATFORM/datamodel.h [06:39:08] simon: will this work in compiler-selectable environment? [06:39:15] tom: uses macros to detect compilers [06:39:36] other classes [06:39:37] syncs [06:39:39] threads [06:39:42] data structures [06:39:50] data structures can probably be forgotten for now [06:40:00] the other api: mem [06:40:16] provides allocators, deallocators, implements slabs [06:40:56] simon: linux kernel api version? [06:41:05] tom: relies heavily on solaris model [06:41:13] only consumer is currently libosi [06:42:16] provides a rx queue like, but provides element offset [06:42:31] (lots of merging needs to happen, this exists elsewhere) [06:43:42] jeff: any reason not to use the os versions? [06:43:49] derrick: provide common api wrapping [06:43:58] jeff: windows malloc sucks [06:45:06] tom: most platforms fall back to kmalloc [06:45:26] sync primitives [06:45:29] mutex [06:45:32] condvar [06:45:35] rwlock [06:45:42] derrick: semaphore? [06:45:44] tom: no [06:45:51] shared locks (ala cm) [06:45:56] spinlock [06:45:59] spin_rwlock [06:46:08] simon: shared locks in cm? 
[06:46:33] tom: backed by existing implementation [06:46:46] simon: not stealable as-is for his purpose [06:46:53] tom: lock api looks like pthreads [06:47:08] matt: except trylock [06:47:13] tom: fixed privately [06:47:34] tom: osi_result [06:47:39] influenced by nspr [06:47:47] 32 bit signed int result code [06:47:53] half is errors, half is notifications [06:48:18] negative is error code, positive is results [06:49:31] discussion of why error code handling is fragile and needs to be treaded lightly upon [06:52:41] discussion centering on keeping osi_result internal to libosi [06:53:31] jeff suggests an optional field to pass in to collect richer semantics for warning-on-success if it's handled [06:54:11] tom: hardware introspection apis [06:54:16] cpu, cpu_cache [06:54:31] hi perf data structures, checking alignment sizes for caches [06:55:20] in some cases it guesses, telling you it does [06:56:03] atomic [06:56:06] matt: still have it? [06:56:09] tom: unsure [06:56:15] matt: we should not publish yet [06:56:17] tom: needs work [06:57:17] simon: other thing about atomics, some of our kernels support atomics internally; should we wrap the common operations? [07:02:06] arguments about what to do about atomics [07:03:04] simon: having an atomic type and functions that use it, abstractly and implemented correctly, do it, but if the platform supplies the implementation, wrap theirs [07:05:58] metadiscussion [07:06:19] derrick: can we review this in small parts of the api? pull in, implement? [07:06:27] (consensus: yes?) [07:06:29] tom: threads [07:06:34] probably contentious [07:06:41] probably doesn't work for windows [07:06:49] thread id versus thread handle [07:07:08] has one interface: create. may also have join, but it's unused [07:07:38] simon: meta question: how many non pthreaded platforms will we support? [07:08:02] what's the benefit of having a non-pthread api to this? 
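(aside: a minimal sketch of the osi_result convention tom describes above: a 32 bit signed result code where negative values are errors and positive values are non-error results. Treating zero as plain success is an assumption, since the log does not say; the function names are hypothetical and are not the libosi API.)

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def result_is_error(code: int) -> bool:
    """True if a 32 bit signed result code is in the error half."""
    if not INT32_MIN <= code <= INT32_MAX:
        raise ValueError("result code does not fit in 32 bits")
    return code < 0


def result_is_success(code: int) -> bool:
    """True for zero (assumed plain success) and positive results."""
    return not result_is_error(code)
```

(the sign test keeps the common check cheap, while a caller that cares can still inspect a positive value for warning-on-success information.)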
[07:08:25] tom: want to add event handlers for create/delete threads for example [07:08:42] get rid of pthread_once and use events to do it [07:09:05] linked lists of thread-local objects could be used via auxiliary parameters [07:11:44] (aside: matt objects on principle to mandating use of os-provided atomic interfaces given mcas existing) [07:12:44] metadiscussion: handling of versioning of data structures in libosi, comments wrt sonames, etc and how much care must be used to avoid symbol conflicts and different requirements [07:14:09] simon: problem with wrapping pthreads is it imposes overhead to developers if we aren't using the additions [07:15:58] consensus we will not pull in these changes until used [07:16:05] tom: initialization/finalization [07:16:31] takes osi_program_type_t and an opaque object to allow overriding certain things [07:17:00] every main() needs to call this [07:17:32] discussion of whether implicit use of this now needs an init function [07:18:37] tom: some configuration values are guesses until you've init()ed. some things break [07:20:50] not needed for sync and base types [07:21:06] consensus: punt for now [07:21:41] time [07:22:13] tom: provides osi_time_t osit_timeval_t osi_timespec_t 32 and 64 and provides get functions for all of them [07:22:27] similar to lwp fasttime [07:22:53] for solaris userspace uses a fasttrap system call to get whether you need to increment [07:23:29] unique versus nonunique version of these apis for when you need unique times [07:23:37] matt: this is the most useful part to me [07:25:52] string [07:25:55] tom: needs work [07:26:07] jeff: secure string api from microsoft may be implemented [07:26:30] implement strl* because tom needed it [07:30:23] osi_inline [07:33:06] tom: used in time [07:33:14] derrick: can we disable [07:33:19] simon: use static_inline for now [07:33:32] osi_lib_init osi_lib_fini [07:33:56] portable _init and _fini [07:35:48] simon: do we have what matt needs? 
[07:35:53] matt: time covers it
[07:36:06] tom: osi_proc
[07:36:18] tom says mickey thinks it may be a problem for windows
[07:36:47] struct proc, or pid
[07:36:54] likewise, ppid when you need parent
[07:37:25] derrick: unix-only feature
[07:37:47] api subset
[07:42:25] simon: name kernel to kosi?
[07:42:30] tom: osi_kernel
[07:43:24] tom: osi_signal
[07:43:34] (we'll kill osi_kernel)
[07:43:54] signal is posix-only, it's like sigaction, sigwait api, imports softsig
[07:47:27] The first import for libosi should be
[07:47:34] Phase 1: buildenv, compiler, types
[07:47:45] Phase 2: platform/datamodel.h
[07:47:48] Phase 3: Time
[07:48:10] Each phase should update the rest of the tree so that the imported pieces don't duplicate code that's in the rest of the tree
[07:49:20] Phase 0: Figure out how to actually make this build
[07:52:07] Tom / Matt will move this forward - Tom will do it, but Matt may push him to make it faster.
[08:05:39] --- Derrick Brashear has left
[08:05:51] --- matt has left
[08:08:20] Now discussing directory RPC proposal.
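Looking back at the osi_result description from 06:47 (32-bit signed result code, negative for errors, positive for results or notifications), the convention might be sketched in C roughly as follows. Everything named sketch_* is hypothetical and not taken from libosi, the specific codes are invented for illustration, and treating zero as plain success is an assumption the log does not confirm.

```c
#include <stdint.h>

/* Hypothetical stand-in for the result type described in the log;
 * this is NOT the real libosi definition. */
typedef int32_t sketch_result;

/* Invented example codes, for illustration only. */
#define SKETCH_OK           ((sketch_result)0)   /* assumption: zero is plain success */
#define SKETCH_FAIL_NOMEM   ((sketch_result)-1)  /* negative half: errors */
#define SKETCH_NOTE_GUESSED ((sketch_result)1)   /* positive half: success plus a notification */

enum sketch_class {
    SKETCH_CLASS_ERROR = -1,
    SKETCH_CLASS_SUCCESS = 0,
    SKETCH_CLASS_NOTICE = 1
};

/* Any negative value is an error. */
static inline int sketch_is_error(sketch_result r)
{
    return r < 0;
}

/* Zero and every positive value count as success. */
static inline int sketch_is_ok(sketch_result r)
{
    return r >= 0;
}

/* Map a raw code onto the three classes implied by the sign convention. */
static inline enum sketch_class sketch_classify(sketch_result r)
{
    if (r < 0)
        return SKETCH_CLASS_ERROR;
    if (r > 0)
        return SKETCH_CLASS_NOTICE;
    return SKETCH_CLASS_SUCCESS;
}
```

Under this sketch, a caller that only tests sketch_is_error() keeps working if a callee later starts returning positive notifications, which is one way to read the warning-on-success idea raised at 06:53.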
[08:08:34] Matt's not sure what's in his document (posted to afs-stds today) is right
[08:08:47] Jeff is discussing use cases - dump versus iterate
[08:09:49] --- Derrick Brashear has become available
[08:10:18] referring to http://michigan-openafs-lists.central.org/archives/afs3-standardization/2009-September/000423.html
[08:10:55] tom: what if we used a btree and could establish a canonical sort order on the wire
[08:11:01] jeff: jhutz and i agree there
[08:11:06] 's no one canonical sort order
[08:11:20] could be maintained in a b+ tree tho, deliver results by walking the leaves
[08:15:58] Still sorting about sort order
[08:18:14] s/sorting/talking/
[08:18:37] Going to drop sort order from the draft.
[08:18:45] matt: do we need an iterator in a split interface?
[08:19:24] jeff: What is the iterator iterating over?
[08:20:12] Jeff: We should focus on use cases, rather than fighting over the API.
[08:20:43] Matt's use case is to dump the whole directory, reading a split RPC
[08:21:21] The use case is to acquire all of the objects for all of the elements in the directory.
[08:21:40] consistency worries
[08:22:58] First potential use case is the Windows CM - so you can fetch everything and build a B-Tree from it.
[08:23:03] (Maybe)
[08:27:50] Use case is to abstract the directory format between the client and the server - so the client doesn't need to know how a server stores a file
[08:27:57] sorry, stores a directory
[08:28:50] Discussing whether an iterator needs to stay there
[08:43:38] --- matt has become available
[08:43:49] dafs
[08:44:08] state is serialized to disk, avoids initcbstate3 following restart, lowers resource use burden
[08:44:26] of course changing the fs parameters can invalidate the saved data
[08:45:02] want a way to force new capabilities and require not initcbstate3
[08:45:07] not require that is
[08:45:29] jeff: revised tellmeaboutyourself that takes my capabilities as an input
[08:47:38] all we can do now is match up what we talk to and its uuid
[08:48:02] the challenge is multiple servers on a single ip, with port asymmetric routes
[08:49:38] .
[08:52:54] prdb extensions: derrick showing the afsig.se document: http://web.archive.org/web/20060211111127/http://www.afsig.se/snipsnap/space/prdb+extensions
[08:53:10] Marcus would like a way of creating multiple names in the same RPC
[08:53:32] Jeffrey agrees.
[08:53:41] Tom: we should make it a vector
[08:53:57] Derrick: Is there any reason to remove more than one authname at a time?
[08:55:19] Marcus: Should add a RenameAuthName
[08:55:37] Can we rename between two different authentication types?
[08:56:07] (discussion about how replace/rename should work semantically)
[08:57:03] Rename will take a vector of renames, but you have to keep the same type
[08:58:37] (Rename is a vector of triples of type, old_opaque, new_opaque)
[08:59:40] You can have more than one thing of each type
[08:59:47] Derrick will go and make this happen.
[10:16:06] --- matt has left
[11:34:13] --- Derrick Brashear has left: Disconnected
[11:40:48] --- Simon Wilkinson has left
[14:53:30] --- Derrick Brashear has become available
[22:18:50] --- Derrick Brashear has left: Disconnected
[22:54:09] --- Derrick Brashear has become available
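The rename semantics agreed at 08:57-08:59 (a vector of (type, old_opaque, new_opaque) triples, with the type fixed across each rename and more than one name allowed per type) could be sketched in C roughly as below. This is only an illustration against an in-memory table: the sketch_* types and function are hypothetical, not the prdb RPC definition, and stopping at the first rename that matches nothing is an assumed error policy that the log does not specify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opaque byte string; the log only calls these "opaque". */
struct sketch_opaque {
    size_t len;
    const unsigned char *data;
};

/* One stored authname: an authentication type plus an opaque name. */
struct sketch_authname {
    int type;
    struct sketch_opaque name;
};

/* One rename triple. A single type field covers both old and new names,
 * so a rename cannot move a name between authentication types. */
struct sketch_rename {
    int type;
    struct sketch_opaque old_name;
    struct sketch_opaque new_name;
};

static int sketch_opaque_eq(const struct sketch_opaque *a,
                            const struct sketch_opaque *b)
{
    return a->len == b->len &&
           (a->len == 0 || memcmp(a->data, b->data, a->len) == 0);
}

/* Apply a vector of renames in order and return how many were applied.
 * Assumed policy: stop at the first triple with no matching entry. */
static size_t sketch_apply_renames(struct sketch_authname *table, size_t ntable,
                                   const struct sketch_rename *renames,
                                   size_t nrenames)
{
    size_t applied = 0;

    for (size_t i = 0; i < nrenames; i++) {
        size_t j;

        for (j = 0; j < ntable; j++) {
            if (table[j].type == renames[i].type &&
                sketch_opaque_eq(&table[j].name, &renames[i].old_name)) {
                table[j].name = renames[i].new_name; /* type stays unchanged */
                break;
            }
        }
        if (j == ntable)
            break; /* nothing matched (type, old_name) */
        applied++;
    }
    return applied;
}
```

Because the table is searched by (type, old name) rather than by type alone, several names of the same type can coexist and be renamed independently.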