[00:38:24] --- dev-zero@jabber.org has left
[01:16:40] --- dev-zero@jabber.org has become available
[01:23:57] --- Russ has left: Disconnected
[02:15:13] --- haba has become available
[03:52:00] --- mmeffie has left
[06:46:53] --- deason has become available
[06:59:12] --- reuteras has left
[07:08:56] --- matt has become available
[09:04:03] a question for anyone familiar with the FE/CB/etc structures in callback.c...
[09:04:47] is there any way to traverse and remove just some CBs from an FE without needing to hold H_LOCK throughout the entire traversal? some background:
[09:05:59] for XCBs, on e.g. volume releases, we need to break callbacks for particular fids for XCB-enabled hosts, and break the whole-volume callback for non-XCB hosts...
[09:06:44] so we first tell the fileserver to break callbacks on a set of fids, then say to break all callbacks for a volume for non-XCB hosts
[09:07:36] the problem is, traditionally, BreakLaterCallBacks will unchain the entire FE and all CBs for it when we detect any CB for the FE that is marked as _LATER/_DELAYED
[09:08:34] what we do for the non-XCB breaks is just mark the FEs/CBs for non-XCB-enabled hosts as _LATER/_DELAYED... so some CBs are _DELAYED and some aren't
[09:09:35] we could traverse all of the CBs for each _LATER FE to determine if we need to unchain it, but that means we're holding H_LOCK while traversing all of the CBs for an RO volume, so we're holding it for O(clients) time
[09:10:01] which seems longer than we'd want to hold something like H_LOCK... is there any way around this?
[09:23:42] although, hmm, I suppose BreakVolumeCallBacksLater already holds it for that long (but without FSYNC_LOCK), but still; it'd be nice to avoid the additional time holding the lock if we can avoid it
[09:55:13] --- mmeffie has become available
[10:15:58] --- dev-zero@jabber.org has left
[10:24:33] delayed, since i was eating, and then debugging, and helping someone set up afs, and...
[10:25:05] anyway, i think we'd need to change how we do locking to accommodate split handling for XCB and non-XCB later/delayed
[10:29:10] --- Russ has become available
[10:29:28] We just got another file server crash with 1.4.11, this time with a core file.
[10:29:45] does the core have a useful backtrace, i hope?
[10:30:10] segfault in malloc.
[10:30:23] #0  0xb7defa78 in _int_malloc () from /lib/i686/cmov/libc.so.6
#1  0xb7df1655 in malloc () from /lib/i686/cmov/libc.so.6
#2  0x0806673a in h_GetHost_r (tcon=0x71174890) at ../viced/host.c:1557
#3  0x0806860b in h_FindClient_r (tcon=0x71174890) at ../viced/host.c:2187
#4  0x0804fbd5 in CallPreamble (acall=, activecall=0, tconn=0xabed8150, ahostp=0xabed814c) at ../viced/afsfileprocs.c:314
#5  0x08050f38 in SRXAFS_GetCapabilities (acall=0x8808160, capabilities=0xabed8194) at ../viced/afsfileprocs.c:6338
[10:30:30] "don't run out of memory". seriously, though, i bet it's heap corruption
[10:34:51] Running out of memory causes malloc to return 0, not crash.
[10:35:33] How long does this take to break? Is using a debugging malloc feasible?
[10:36:34] given the last report, probably "a while"
[10:36:37] and "no"
[10:36:50] what OS?
[10:36:51] 28 days in this case.
[10:36:52] Debian.
[10:37:56] env MALLOC_CHECK_=2 ?
[10:38:14] I don't know how much that affects performance, but...
[10:49:20] fwiw, the snowleopard klog problem is Andrew_StringToKey. I don't know why yet
[10:57:20] derrick: do you have any suggestions for splitting locking in the short term?
[10:57:34] none.
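To make the cost deason describes concrete, here is a minimal sketch of the per-FE callback scan. It assumes viced-internal names (struct FileEntry, struct CallBack, itocb(), firstcb, cnext, status, CB_DELAYED) that approximate but may not exactly match callback.c; it is an illustration, not code from the tree. The scan is cheap per FE, but it has to run under H_LOCK, and summed over every FE of an RO volume the total work is O(clients).

    /*
     * Illustrative sketch only; NOT code from callback.c.  Assumes the
     * viced-internal declarations for FileEntry/CallBack are in scope, and
     * the field and macro names here may not match the real ones exactly.
     */
    static int
    AllCallBacksDelayed_r(struct FileEntry *fe)     /* caller holds H_LOCK */
    {
        afs_uint32 cbi;
        struct CallBack *cb;

        for (cbi = fe->firstcb; cbi != 0; cbi = cb->cnext) {
            cb = itocb(cbi);
            if (cb->status != CB_DELAYED)
                return 0;   /* a live (non-deferred) CB remains; keep the FE */
        }
        return 1;           /* every CB is delayed; the whole FE can be unchained */
    }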
[10:57:35] --- dev-zero@jabber.org has become available
[11:01:33] Jeff: I'm fixing the Windows man page thing for restorevol. Sorry about that.
[11:09:42] jhutz> env MALLOC_CHECK_=2
That has been the default on some RHEL for some time now; since I did not have the time to debug it, I have since run bosserver with MALLOC_CHECK_=1, as it would blow up otherwise. I should really try MALLOC_CHECK_=2 again some day.
[11:12:53] On another topic.
[11:13:01] I'm interested in prototyping a new RPC mechanism for getting directory listings.
[11:13:08] Really, 2 is the default? Wow. But if you're already running with 1, setting it to 2 should not affect performance any.
[11:13:13] jhutz> BreakCallBack(client->host, &fileFid, 0);
I don't know if the bug is _there_, but there definitely is a bug somewhere such that a callback is missing in some "rename" cases. And the not-being-able-to-run-blogbench bug might bite some user of ours who MIGHT be doing a mv-clobber from his parallel code.
[11:13:13] The obvious motivation is to remove exposure of internal directory formats, especially since ours must be revised.
[11:13:51] I've run an initial idea by a couple of folks. I've been asked to consider the following problems, and I'd like to spark discussion:
[11:13:59] yes, "2" as the default was sneaked into their libc some years ago.
[11:14:06] 1. character encoding for filenames, how to do it, and, if relevant, multiple encodings
[11:14:18] 2. generalization of directory listing to stream files, or similar future problems
[11:14:19] I don't see any performance issues with that.
[11:14:29] 3. entry hash information--I'm skeptical about some versions of this, but, if the client were interested in a computed hash a particular server happens to have precomputed, and there is no requirement for legacy computation, I haven't found a problem with it
[11:14:49] Well, there's nothing we can do about the current internal format being exposed. If we change the internal format, we're going to have to support conversion to the old one for the old FetchData RPC. But I don't think I have a problem with defining a new RPC for fetching directory contents in a format that is explicitly not the same as the fileserver's internal representation.
[11:14:51] http://pastebin.ca/1561194
[11:15:02] (jhutz: right)
[11:17:49] I'm not particularly concerned about 3; I was just thinking, the hash table is something we lose by not using the existing format
[11:17:49] > character encoding for filenames
Messy, but basically, UTF-8 or unspecified local 8-bit encoding, or an arbitrary mixture (but only one encoding per entry!). Matching is normalization-insensitive, with the added rule that two bit-for-bit identical strings match even if they are not valid UTF-8.
[11:17:57] I don't know if it has any real noticeable impact
[11:18:56] What I don't wish to do is put a future server in the business of precomputing hashes it doesn't use
[11:19:12] But maybe it belongs in a legacy entry format
[11:19:32] I agree on it being optional
[11:20:44] jhutz: ok. what are your thoughts on AFSNAMEMAX regarded as UTF-8?
[11:21:08] The "interesting" problem for a $USER is if they somehow want to fetch files and don't quite see or know how they are encoded. Some additional fs commands which can list all fids in a dir and then get the file by number may be a solution.
[11:21:58] I'm not sure what the question is, but... realistically, the reason for having maximums is not to have some sort of policy about how many characters long something is; it's to be able to allocate storage. So, the number the code actually needs to know is the maximum number of octets something can be.
[11:23:43] haba, that should never happen, because... (1) if a filename is unicode, and you know it, then it doesn't matter how you normalize it, because we use normalization-insensitive comparison. (2) even if you don't know the filename, or it's not unicode, you can always list the directory, and then use a filename bit-for-bit as it appears in the directory listing.
[11:23:59] Isn't AFSNAMEMAX the present effective maximum name for a file?
[11:25:00] or, the maximum length of a file name
[11:25:54] I'm not sure which constant that is, and I don't have time to dig into code right now. But, there is certainly a constant which is the maximum length of an AFS filename _in octets_, as such names appear in directory entries, and another constant (which may be the same) which is the maximum length permitted in RPC filename arguments
[11:27:20] Ok, sure. For future purposes of the protocol, we care about the RPC filename argument constant, I think
[11:28:37] For protocol purposes, yes. Note that the way the RPC protocol works means that that's actually just a parser-enforced limit, and you can increase the maximum length on a decoder without breaking interop with older encoders.
[11:29:19] jhutz (2): If my terminal can represent what is there, I can see it. So was the town named H?l? now Hålö or Hälö or Hölä?
[11:30:46] (someone else made the filename for you and you may not even have öäå on your keyboard and cut-and-paste might not work....)
[11:31:53] If your terminal doesn't suck, then the fact that there's no font position available to render that character won't change the fact that copying and pasting will work. If your terminal does suck, there are ways around that, like using shell globbing (hint: H?l? will match all of those). And if you're not using a terminal, then you probably don't have the copy-and-type problem (but if you do, the GUI is probably at least as likely to work as a terminal)
[11:31:54] ls -li . ; find . -inum $inodenumber -exec rm -rf {} \;
[11:33:04] or "less H?l?"
[11:33:34] find -inum is a nice trick. I remind you that both "å" and "ö" are nouns in Swedish, so you might find two files named "?".
[11:33:56] Yeah, then you have a problem. :-)
[11:34:05] å = small river, ö = island
[11:34:24] that's a time saver
[11:34:26] well, if they're indistinguishable on the screen, I don't see how we could solve that
[11:35:04] Keep often-used words short :)
[11:36:08] I was going to ask why a language would waste scarce single-vowel words on common nouns rather than heavily used syntax like English "a" or "I" or Spanish "y", but I guess those are appropriate for Swedish
[11:36:38] Really, I don't think the UI problems are ours to solve. They are not specific to OpenAFS, and neither should a solution be
[11:37:33] Our job is to give you a reasonable chance, and software a 100% chance, of being able to give us a working name for the file you want.
[11:38:00] I just want to improve the percentage of reasonable a bit :)
[11:38:11] It's some other layer's job to give you ways of picking files whose names you don't actually know or can't type.
[11:38:32] --- Jeffrey Altman has become available
[11:39:13] Well, 256 was a roomy file name length in the 90s. Is it still?
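As an aside on the matching rule jhutz describes above (normalization-insensitive comparison, with bit-for-bit identity as the fallback for names that are not valid UTF-8), here is a minimal sketch. afs_NormalizeName() is a hypothetical helper standing in for "the client's normalization"; no such function exists in the tree.

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: assumed to return 0 and fill 'out' with a
     * normalized form when 'in' is valid UTF-8, and nonzero otherwise. */
    extern int afs_NormalizeName(const char *in, char *out, size_t outlen);

    static int
    NamesMatch(const char *a, const char *b)
    {
        char na[1024], nb[1024];

        /* Rule 1: bit-for-bit identical names always match, even if they
         * are not valid UTF-8. */
        if (strcmp(a, b) == 0)
            return 1;

        /* Rule 2: otherwise two names match if both are valid UTF-8 and
         * their normalized forms are identical. */
        if (afs_NormalizeName(a, na, sizeof(na)) == 0
            && afs_NormalizeName(b, nb, sizeof(nb)) == 0)
            return strcmp(na, nb) == 0;

        return 0;
    }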
[11:40:45] deason, jhutz: I think the combination of ls -li and od -c and find . -maxdepth 1 -inum works.
[11:41:21] just need to stay on posix
[11:41:41] or s/od -c/cat -v/
[11:42:54] I did _not_ think about the inum trick. Have to relay that to our support desk, who get questions about this from time to time. I usually have used wildcards to resolve the problem.
[11:43:54] sorry I wasn't here for the start of the "Unicode normalization is hard" problem discussion
[11:44:13] and the directory format discussion
[11:44:25] * haba will now go and find sushi for dinner.
[11:44:32] hopefully still going on
[11:45:03] Before you go off and redesign what was done many years ago at a hackathon and commit resources where there is already funding, you should be aware that the SBIR grant covers this work
[11:45:33] maybe with "ål" [eel].
[11:46:07] --- haba has left
[11:46:36] the only way to ensure that a Unicode file name entered on a Mac is the same as the Unicode file name entered on Windows or Linux is to perform a normalized comparison.
[11:47:55] I don't know what was proposed, since you're the first one to mention it, and I've spoken to a few prominent folks now. But obviously it didn't follow the afs3-standardization process, nor get implemented, it seems
[11:48:18] no it didn't
[11:48:28] But I'd like to learn from it.
[11:48:29] jaltman: I thought such a comparison is what was said
[11:48:44] it was
[11:49:11] the only portion I have in scrollback is jhutz commenting that utf8 comparisons are unnormalized
[11:49:36] all of the SBIR grant work will go through the standardization process
[11:50:09] all of it will be made publicly available in a separate git repository and gerrit instance for the community to comment on as it is implemented
[11:50:59] the directory extension work was designed by jhutz, lha, myself and some of the Arla folks at a hackathon in Sweden. Perhaps 2005.
[11:51:15] if you look at the OpenAFS roadmap this is one of the items on the list.
[11:51:26] there should be links from there to the design effort
[11:51:55] I have heard of directory extensions; however, work to this point has, I believe, mixed the issue of extensions to AFS formats with extensions to AFS protocol formats. It is legit for people to talk about future RPCs.
[11:53:00] there may have been RPCs proposed. i don't remember. snipsnap is down, i suspect, but istr it's cached by archive.org
[11:53:01] what matters is the format the RPC sends the data to the client in. You want that to be a B+ tree
[11:54:12] and the contents of the B+ tree need to support case-sensitive and case-insensitive lookups as well as DOS 8.3 compatible file names
[11:54:40] The in-memory B+ tree implementation within the Windows client will provide you with the required info.
[11:56:22] Both display forms and normalized forms should be included in the directory, but the keys should be normalized.
[11:57:19] Isn't there a logical distinction between what clients do with sorting and what servers need to produce?
[11:59:18] Wouldn't its overall efficiency depend on the representations in use at both ends?
[11:59:44] The Windows client has shown that being given a B+ tree by the file server is unimportant to performance. Therefore, it would be possible for the protocol to avoid providing any hash table or B+ tree and simply permit the clients to build them on the fly. The downside of this is that small devices have to provide the additional storage for both the data pages containing the network-byte-order entries as well as the host-ordered B+ tree.
[12:00:46] On Windows, the cost of performing the network-to-host byte order translations far exceeds the cost of maintaining the local B+ tree, provided that the local B+ tree can be updated as a side effect of local operations or trusted input provided by extended callbacks.
[12:02:21] the secondary benefit of this approach is that the Unix client can build a lookup model that is appropriate for its environment.
[12:02:58] So a B+ tree is a sorting/fast-lookup strategy, like the hashing suggested by the existing format.
[12:03:09] No, that's not what I said. What I said is that we should be doing normalization-insensitive comparisons.
[12:03:57] The existing hash table approach is useless when you need to support both case-insensitive and case-sensitive lookups
[12:04:35] Which is why it seems plausible for it to be optional or somehow selectable.
[12:04:36] for case-insensitive lookups the hash table must be built off of a normalized representation of the string. For example, "all lower case"
[12:05:18] Essentially what that means is that each client needs to take the directory contents, normalize to its own normalization rules, case fold if it needs to do case-insensitive lookups, and construct a B+ tree (or whatever data structure it finds most efficient) using the resulting names as keys.
[12:05:31] On a platform such as MacOS X which permits mounts to be either case-sensitive or case-insensitive (actually Windows does this as well), you want the directory lookups to match the mount mode
[12:05:49] Right.
[12:05:59] The Windows client does this today.
[12:06:20] Of course, it's messier when you introduce unicode.
[12:06:39] which the Windows client deals with today as well
[12:07:59] Because you need to be able to support lookups of both unicode (even when the client doing the lookup does not use the same normalization rules as the server or the client that created the file) and of the bitstring exactly as it appears in the directory (because it might not _be_ unicode).
[12:09:14] Filenames actually stored into directories by the server and returned to clients should be exactly the name provided by the client, without any kind of normalization done by the server. Though the server may also wish to store normalized forms, for use in cases where the server itself needs to do a directory lookup.
[12:09:55] With the current interfaces that are used in the Windows SMB protocol, all names are sent as Unicode no matter what. If the name in the AFS directory cannot be converted from "utf8" to "ucs2" and back without data loss, the client needs to maintain a mapping record. If the name cannot be converted to "ucs2" at all, a random name must be created to permit that file to be displayed to the user
[12:10:28] Agreed.
[12:10:47] We have done all of this hashing out in the past.
[12:10:55] ZFS gets this right.
[12:11:00] At least I think it does.
[12:11:23] It permits the Unicode enforcement to be specified on a per-file-system basis
[12:11:55] If a file system is marked Unicode only, then all strings are stored in utf8 with both display and normalized forms.
[12:12:21] any attempt to store a name that cannot be converted to utf8 is rejected.
[12:12:34] I think the AFS volume should get the same.
[12:12:46] type of property option
[12:13:18] What if you want to implement a new volume format altogether? I'm just interested in the protocol level, frankly.
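A sketch of the client-side step jhutz outlines above: normalize with the client's own rules, case-fold only if the mount is case-insensitive, and use the result as the key in whatever local index the client prefers, while keeping the server's raw octets as the value so they can go back on the wire verbatim. All helper names here are hypothetical, not existing OpenAFS code.

    #include <stddef.h>
    #include <stdio.h>

    typedef unsigned int afs_uint32;    /* stand-in for the real AFS typedef */

    struct dir_index;                   /* the client's private lookup structure */

    /* Hypothetical helpers; none of these exist in the tree. */
    extern int  afs_NormalizeName(const char *in, char *out, size_t outlen);
    extern void afs_CaseFoldName(char *name);
    extern void index_insert(struct dir_index *idx, const char *key,
                             const char *raw_name, afs_uint32 vnode,
                             afs_uint32 unique);

    static void
    IndexDirEntry(struct dir_index *idx, int case_insensitive_mount,
                  const char *raw_name, afs_uint32 vnode, afs_uint32 unique)
    {
        char key[1024];

        /* Normalize with *this client's* rules; the server never normalizes. */
        if (afs_NormalizeName(raw_name, key, sizeof(key)) != 0) {
            /* Not valid UTF-8: index the raw octets so exact lookups still work. */
            snprintf(key, sizeof(key), "%s", raw_name);
        }
        if (case_insensitive_mount)
            afs_CaseFoldName(key);  /* fold only when the mount is case-insensitive */

        index_insert(idx, key, raw_name, vnode, unique);
    }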
[12:13:19] --- abo has left
[12:13:44] --- deason has left
[12:13:47] --- stevenjenkins has left
[12:13:53] --- deason has become available
[12:13:58] which part of this discussion does not apply to the protocol?
[12:14:16] --- abo has become available
[12:14:35] --- deason has left
[12:14:38] --- deason has become available
[12:15:41] I'm trying to shoehorn a complex discussion into a smaller space. I realized that directory listing is a "smallish" (relative to some others, complex as character encoding in particular becomes) problem that is a bit on the critical path to experimenting with new volume types.
[12:16:17] you can experiment with new volume types using the existing RPCs. I've done it.
[12:16:34] you just need to construct the old directory format for use with old clients.
[12:16:45] it's been done before
[12:16:49] OpenAFS at least will need to do that for quite some time
[12:16:53] um. nasd-afs.
[12:17:02] yup
[12:17:04] It may; still, it's not unreasonable to do so.
[12:17:16] whose code, alas, fell down a hole because Panasas' founders wouldn't have been served by it being released, i'm guessing
[12:17:35] --- stevenjenkins has become available
[12:19:00] I'm aware I can construct the old format, but it's not intrinsically interesting to do so.
[12:20:15] if you construct the old format then you don't need to modify clients. If you want to modify the clients, then create a new wire format to play with
[12:20:30] Right, that's what I'm experimenting with. Hence, pastebin.
[12:20:51] you said you wanted to keep the effort focused. if the focus is on a new volume format, then focus on that
[12:21:19] It's a matter of effort. I can get an unsorted, split directory listing rpc working in a day.
[12:21:41] It is clearly not the RPC which makes life easy for small devices.
[12:21:43] well, that's true, but just because you can doesn't necessarily mean it's what we want to standardize
[12:22:15] derrick: of course
[12:22:32] now, for doing testing, go nuts
[12:22:45] If you think about it, chatting about it here, -after- talking to other folks, isn't that exactly what I -should- do?
[12:22:49] the benefit of the precomputed hash tables is that the client running on a micro device doesn't have to spend cpu cycles constructing them.
[12:23:04] I see that.
[12:23:30] er. bind "that". which "that" is exactly what you should do?
[12:23:38] I was thinking of it as tied to the legacy format. Obviously, a B+ tree avoids that problem. It seems plausible to allow selection.
[12:23:54] that being, precomputed hash tables or equivalent
[12:24:18] one of the big costs is network byte order representation. clients should be able to specify the order that is useful to them
[12:24:21] I'm not trying to close the space.
[12:24:41] jeff: that may be true, although, isn't that wired into XDR too?
[12:24:57] only if you are xdr encoding "stuff" and not "here's a blob"
[12:25:12] --- dev-zero@jabber.org has left
[12:26:09] I don't think you want to lose the existing model of directory pages, since it permits a client to search without being forced to read the entire directory contents
[12:27:33] I guess so, yet the "page" model we currently have seems too tied to internals.
[12:27:45] you can separate that
[12:27:51] it's just a format
[12:28:12] the fact that it is how things are represented internally as well is just a convenience
[12:28:36] fair enough.
[12:29:12] So is the RPC from Sweden one I'm going to like? I'd like to see what I'm talking about.
[12:29:41] hang on
[12:29:44] there wasn't a specific rpc specified. what was specified was the directory contents
[12:29:57] that's fine.
[12:30:21] and it is not sufficient to standardize in its current form
[12:30:29] that's fine too.
[12:30:59] the development of the I-D will be performed according to the grant schedule (unless someone decides to save us money and do the work for us)
[12:31:27] * Jeffrey Altman really needs to get the announcement letter finished
[12:32:01] --- Jeffrey Altman has left
[12:32:06] --- stevenjenkins has left
[12:32:47] --- abo has left
[12:33:31] --- abo has become available
[12:35:09] --- haba has become available
[12:35:33] http://web.archive.org/web/20061009073933/http://www.afsig.se/afsig/space/AFS+directory+format+extensions
[12:35:38] --- stevenjenkins has become available
[12:36:10] note, no revised directory lookup RPC suggested
[12:37:10] Clearly, such an RPC or RPCs need a format or formats. I thought I was being clever by dodging pages. Maybe I was being the opposite.
[12:38:57] ZFS gets this right for unicode, if you turn that on. I'm not sure it actually contemplates a filesystem that contains a mixture of unicode and non-unicode 8-bit filenames, such that it can be configured to work correctly for both at the same time.
[12:42:38] For ASCII at least, aren't those equivalent?
[12:43:40] I don't think we want to do what we designed in Sweden. At the time, we were talking about shoehorning new information into the existing directory format. In retrospect, and especially with clients doing their own normalization and index construction, I think it's more useful to have a new FetchDirectory RPC of some sort, and just have the existing FetchData RPC always return the old directory format, as best as we can.
[12:44:12] ASCII is incredibly boring
[12:44:51] The phrase "8-bit filename" implies a filename containing at least one character which requires more than 7 bits to represent. There are no such characters in ASCII.
[12:45:18] All right, sorry. I'm afraid I'm not an expert in character encoding matters.
[12:48:53] --- Jeffrey Altman has become available
[12:49:39] > I don't think you want to lose the existing model of directory pages
> since it permits a client to search without being forced to read the
> entire directory contents
is this useful beyond getting a particular dir entry when a user requests a specific filename?
[12:49:41] --- dev-zero@jabber.org has become available
[12:50:11] because having a different rpc for specifically grabbing a known entry would be useful in that it allows you to give access to a directory without granting access to list all entries in the parent
[12:54:22] Is RXAFS_Lookup() what you want?
[12:54:52] ah right, there's that too
[12:54:53] It's useful because it allows the client to fetch and process a _large_ directory a page at a time, instead of slurping the whole thing into memory at once. The fileserver can't do a lookup for a client, because it doesn't know the client's normalization rules.
[12:55:46] We don't implement RXAFS_Lookup()
[12:55:53] which "we"?
[12:56:02] OpenAFS
[12:56:05] (i know what you mean but i want you to be clear)
[12:56:07] I wonder if both approaches (paged and unpaged) would be useful to support.
[12:56:12] no, that's not (strictly) true
[12:56:44] It returns EINVAL (not even UAEINVAL or RXGEN_OPCODE)
[12:56:45] iso-8859-1 is an example of an 8-bit one. åäö are encoded with the 8th bit set.
[12:57:37] viced does not currently usefully implement it
[12:57:42] i will grant that much
[12:58:09] Well, since viced is the only RXAFS provider in OpenAFS...
[12:58:20] haba: right
[12:58:36] sure, but you can use the openafs client with a server that provides it
[12:59:43] all I wanted to know is whether or not the functionality that RXAFS_Lookup would provide is what he was thinking would be desirable.
[12:59:53] he == Andrew
[13:00:42] * Jeffrey Altman getting lunch
[13:04:53] --- Rrrrred has left
[13:06:06] --- Rrrrred has become available
[13:14:08] jaltman: yes, I believe so
[13:15:12] jhutz: if the rpc for reading dirents was split, it's still possible for a client to read piecewise, isn't it?
[13:16:08] It was intended to solve that problem, by me. But it's not random access.
[13:16:12] --- pod has left: Lost connection
[13:16:47] yes, but if we had a Lookup, what would we need random access for?
[13:18:47] And, the RPC -could- take a starting position, effectively becoming paged.
[13:19:56] Sure, you can use a split RPC that way, if you're willing to tie up a server thread while the client does its processing. That's not OK.
[13:20:11] > if we had a Lookup, what would we need random access for?
Because the server doesn't know the client's normalization rules.
[13:21:06] Just to avoid the topic resurrecting--is that equivalent to the statement that the server cannot sort?
[13:21:15] (jhutz)
[13:23:21] No. The server doesn't know the client's sorting rules, either, but that's a different problem.
[13:23:25] I don't see how that prevents it; you just may need different parameters than the existing RXAFS_Lookup
[13:23:56] I mean, for creating a file, the server checks if the dir ent already exists, right?
[13:24:09] Yes, but the parameter you need is "here is a chunk of code that implements my normalization algorithm"
[13:25:27] I don't see how it's different than specifying an entry for creating files or directories; the server needs to look up the entry there, too, so it can ENOENT
[13:25:38] The server checks that the entry doesn't already exist according to the server's rules. That could actually be as simple as exact-match, and possibly should be. The same for other operations -- you require the client to handle the lookup and provide the entry it wants verbatim. Note that the client doing such an operation has already done the lookup locally.
[13:26:43] jhutz: Can you speak more to the problem of sorting?
[13:29:03] sorting rules are determined by the character set the data is being interpreted as
[13:29:11] jaltman: WRONG
[13:29:34] and case folding or not. and ...
[13:29:37] sorting rules are determined by things like the language in which the data is written.
[13:29:38] it certainly is true for Unicode versus not
[13:29:54] German and Swedish sort ä differently.
[13:31:36] So sorting is not actually a problem for FetchDirectory()?
[13:31:49] (but a problem at the client?)
[13:32:08] Correct.
[13:32:16] It's not a problem for the AFS client, either.
[13:32:39] It's entirely a UI problem. Sorting is done by 'ls', or Explorer, or whatever
[13:33:46] So what seems most broken about http://pastebin.ca/1561448 ?
[13:34:28] I thought we may have been talking about sorting for the purposes of some kind of structure for speeding up lookups (some kind of tree or something)
[13:34:33] if we were going to transfer that over the wire
[13:34:54] I thought we weren't, but we were going to look at byte ordering.
[13:40:53] it would be beneficial if there were a flag field for every dir entry that can be used, for example, to indicate whether or not the name is expected to be valid utf8
[13:41:02] ok
[13:41:54] in order to properly support Windows shortnames, the shortname should be stored in the directory entry and returned with a single entry instead of computing the short name based on the FID
[13:42:51] From a listing viewpoint, that means the shortname is another name?
[13:43:10] a shortname is a second name for the entry
[13:43:19] ok
[13:43:53] it's different from a hard link in that moving either name causes the alternate form to be moved as well
[13:43:56] same for deletion
[13:44:10] although a short name does not have to exist
[13:45:30] In terms of your discussion about journaling, you might want to look at the Microsoft USN Journal. One of its features is that it embeds into the directory entry a journal index value of the journal entry that refers to the most recent change.
[13:45:57] This permits the journal to be quickly accessed without requiring the ability to Fetch the data or status of the file
[13:47:05] Ok.
[13:47:41] EOB (end of battery)
[13:47:42] Wrt long and short names, in some reasonable implementation, there would be one name, and other names might be attributes, say.
[13:47:56] --- haba has left
[13:48:42] on a separate topic. for those who do not follow the Windows commits. it appears that none of the OpenAFS code validates the output of the pioctl operations. If the data is not of the expected form, the recipient can write beyond the end of buffers, read beyond the end of the pioctl output buffer, etc.
[13:49:46] No one seems to care about protocol limits on filename length. 256 bytes must be sufficient.
[13:49:53] Attributes are another topic. the question is whether they are part of the direntry or stored outside the direntry in an extended attribute or alternate stream.
[13:50:07] iirc, some platforms have a limit of 256 anyway
[13:50:09] one of Steven Jenkins's coworkers gave us code to start sanitizing that, but it wasn't ready for prime time
[13:50:22] it's in RT
[13:51:01] 1024 characters (not bytes) is the longest I have ever seen.
[13:51:17] What would you want to write in the xg file so it doesn't come up again?
[13:51:44] Windows supports 1024 ucs2 characters when UNC format is used
[13:51:57] Is that in fact 2048 bytes?
[13:52:41] it isn't, because UCS2, when converted to utf8 (which is what we store), can take up to 4 bytes per character
[13:53:51] wow
[13:54:38] and UCS4, which is starting to be supported, can require up to 6 bytes
[13:55:20] I thought ZFS was breaking all limits, but they have a limit of 255 bytes for a filename
[13:56:34] that will be changing
[13:57:04] reference?
[13:57:26] none
[13:58:35] or at least, none that I can share with you
[14:35:36] --- pod has become available
[14:54:18] A directory is a mapping from name to vnode. Period. It may contain metadata about the mapping, such as a charset or language tag for the name, or an associated shortname. It does not contain metadata about the vnode. So no, the question is whether attributes are part of the vnode or stored outside the vnode in an extended stream. They are not part of the directory entry.
[14:56:22] > UCS2 when converted to utf8 which is what we store can take up to 4
> bytes per character
Nope. UCS2 has no characters wider than 16 bits, all of which take at most three octets in UTF-8. See the table on page 4 of RFC 3629
[14:56:35] jhutz: Yes.
[14:56:48] So I have two questions.
[14:57:20] 1. are 2 names sufficient, in which case I might propose name1 and name2
[14:57:55] 2. is 256 bytes enough, is 512 bytes enough, or would you propose a more complex representation at an associated cost
[14:58:26] > UCS4 which is starting to be supported can require up to 6 bytes
I'm not sure where you're getting that. Even if you store them in 4 bytes, Unicode code points are never larger than 0x10FFFF. There is currently no UTF-8 representation for values larger than 0x10FFFF.
[15:00:38] 3. (forgive me if I'm dense) am I understanding you that you feel comfortable with a FetchDirectory() rpc that has no 'sorted' flag, no sorting rule concept, and returns results, therefore, in whatever page-consistent order the server finds convenient
[15:01:04] > 1. are 2 names sufficient, in which case I might propose name1 and name2
I don't know. We could plausibly want to store a unicode name, a name in some other 8-bit format, and a Windows short name. Also note that while I agree with Jeff that short names need to be stored so that they remain deterministic and do need to be provided to clients along with the rest of the directory entry, I don't think the fileserver needs to respond to short names in RPC arguments.
[15:01:42] jhutz: needs to respond: ok, but it would hand them back in the entry
[15:02:14] jhutz: so that would be 3 names--each of 256 bytes, more, or different max lengths?
[15:03:06] Sorting has never been the server's problem, in this or AFAIK any other filesystem protocol. Anyone who wants to change that will have to make a fairly strong argument including a real use case.
[15:03:19] check
[15:04:28] > 2. is 256 bytes enough
I don't know. I don't think our limits should be dependent on the limits set by some other filesystem or some operating system's API, whether in octets, code points, or characters.
[15:05:25] An operating system that accepts an 8K filename and then breaks when the filesystem can't handle filenames longer than 13 characters is... broken.
[15:05:40] Jeff says ZFS is leaving the 256-byte limit behind, but cannot comment. The other yardsticks he suggested seem notorious B-I-G.
[15:07:18] Our limits should be expressed in octets, though it's fine to think in terms of wanting to be able to represent a certain number of Unicode characters with a certain distribution of code values (it matters; in most languages (but perhaps not most text), characters with smaller UTF-8 representations are much more common)
[15:07:33] jhutz: 13: that could be. we need to develop an intuition about what -we- want to support.
[15:07:42] jhutz: octets: does that mean string is the wrong type?
[15:08:43] iiuc, it's an octet sequence, though not in name
[15:09:14] No, string<> is still the correct type, because it has very special semantics you can't get any other way. If filenames were binary data we'd have to go to opaque<> or something, which would be a lot more annoying, but we can and do prohibit NULs, which makes the in-memory representation of a string<> as a C string usable.
[15:09:27] check
[15:17:04] So is this getting closer? http://pastebin.ca/1561567
[15:20:32] is there any reason to not make the filename limit, say, 4k?
[15:21:21] I didn't think overestimating it had bad consequences unless you went extremely high
[15:24:20] Certainly it
[15:24:33] is in the spirit of transcending all limits we knew before.
[15:24:48] If the limit is 4K, then a client making an RXAFS_CreateFile call can cause the fileserver to allocate 4K of memory before the RPC handler even starts. A client calling RXAFS_Rename can make the fileserver allocate 8K.
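Pulling together the points above (a per-entry flag field, a stored shortname, possibly a separate 8-bit name, and jhutz's rule that the entry carries metadata about the mapping only), here is one hypothetical wire-entry layout. The names are invented for illustration; this is not the pastebin proposal, the afsig.se design, or anything in the OpenAFS tree.

    /*
     * Hypothetical sketch only.  Per the discussion, the entry describes the
     * name-to-vnode mapping and carries no metadata about the vnode itself.
     */
    typedef unsigned int afs_uint32;        /* stand-in for the real AFS typedef */

    #define NEWDIR_NAME_UTF8       0x01     /* 'name' is expected to be valid UTF-8 */
    #define NEWDIR_HAS_SHORTNAME   0x02     /* a Windows-style short name is present */

    struct NewDirEntry {
        afs_uint32 vnode;       /* vnode this name maps to */
        afs_uint32 unique;      /* uniquifier */
        afs_uint32 nameflags;   /* NEWDIR_* bits above */
        char *name;             /* primary name, exactly the octets the creator
                                 * supplied; bounded like a string<N>, N in octets */
        char *shortname;        /* optional second name; renamed and deleted
                                 * together with 'name', unlike a hard link */
    };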
[15:25:34] Matt, I really don't have time at the moment to review documents, so I'm not even looking. Just participating in the conversation is using more time than I should be spending.
[15:33:45] jhutz: got it. I wasn't expecting to finalize anything; I wasn't expecting to hear that this was any different from other rpc refresh topics, which I understood would go to afs3 for discussion; I wasn't seeing this as an "I-D" unto itself.
[16:10:12] --- deason has left
[16:20:46] --- deason has become available
[16:44:07] --- matt has left
[17:29:03] --- Simon Wilkinson has left
[18:05:13] --- Russ has left: Disconnected
[18:27:01] --- Russ has become available
[18:42:14] --- dev-zero@jabber.org has left: Replaced by new connection
[18:42:16] --- dev-zero@jabber.org has become available
[19:14:21] --- mmeffie has left
[21:38:21] --- deason has left