[00:17:11] --- manfred furuholmen has left [00:18:54] --- manfred furuholmen has become available [00:25:33] --- Simon Wilkinson has become available [00:32:12] --- manfred furuholmen has left [00:34:00] --- dlc has left [00:34:15] --- manfred furuholmen has become available [00:57:04] --- Simon Wilkinson has left [01:02:27] --- abo has left [01:08:42] --- abo has become available [01:25:15] --- dev-zero@jabber.org has left [01:29:15] --- dev-zero@jabber.org has become available [01:53:45] --- dev-zero@jabber.org has left [02:04:43] --- reuteras has left [02:04:49] --- reuteras has become available [02:38:35] --- dev-zero@jabber.org has become available [02:52:04] --- dlc has become available [03:39:17] --- abo has left [03:41:39] --- SecureEndpoints has become available [04:20:32] --- manfred furuholmen has left [04:37:19] --- abo has become available [06:26:16] --- matt has become available [06:56:58] --- manfred furuholmen has become available [07:03:31] --- manfred furuholmen has left [07:14:37] --- Simon Wilkinson has become available [07:26:02] --- reuteras has left [07:54:57] --- Rrrrred has left [07:55:23] --- Rrrrred has become available [08:00:11] --- Simon Wilkinson has left [08:37:20] --- Moose has become available [08:38:08] hmm [08:38:29] Kula, SCS & I just invented a new game for the next workshop: Bad AFS Jokes [08:38:41] Your mama's so fat she needs a whole partition for her volume [08:38:55] Your mama's so ugly no fileserver will accept her callbacks [08:45:59] callbacks: ouch [08:49:42] --- manfred has become available [08:50:31] Your mama's so corrupt she crashed the salvager. 
[09:04:09] --- jhutz@jis.mit.edu/owl has left: Disconnected [09:04:15] --- jhutz@jis.mit.edu/owl has become available [09:12:05] --- dev-zero@jabber.org has left [09:14:38] --- Russ has become available [10:02:51] --- manfred has left [10:16:35] --- dev-zero@jabber.org has become available [12:47:22] --- manfred has become available [13:11:14] holy crap on a crapstick, the afs loader just gave me a "You should not be seeing this message" error [13:11:31] then cover your eyes :-) [13:11:33] (on a linux client. with a new kernel. Hooboy, I bet i need to redo the client. well, i was gonna anyway) [13:11:54] What kernel, what AFS? [13:21:38] --- manfred has left [13:33:14] it doesn't matter [13:33:35] it's a homegrown kernel-from-hell, and i upgraded it to new-hell [13:33:49] and the error message isn't even from afs itself. hahahah. [13:33:55] we needed 1.4.8 client transcripts anyway [13:40:01] --- manfred has become available [13:47:38] --- manfred has left [13:53:55] --- manfred has become available [15:11:31] --- manfred has left [15:40:58] well, we've known for a while now that linux nfs and sun nfs don't play nice [15:41:57] Uh, context? [15:42:34] that whole conversation earlier about the nfs translator and the sun rpc symbols and the gpl [15:46:18] So, I don't know what you mean by "linux nfs and sun nfs don't play nice", or what relevance it has to that conversation. There certainly used to be some pretty serious interop problems a long time ago, but that was a long time ago, and is completely unrelated to Linux unexporting symbols that a kernel module might need to implement an ONC RPC based protocol. [16:09:36] was thinking about your comments that the translator (aka nfs) has to support sun inetboot - and the two don't play nice, and that when sun inetbooting for me, I have to in the middle of it set NFS_CLIENT_VERSMAX=3 o.w. the sun can't inetboot off of the linux server [16:32:24] I'm surprised even that works for you. 
The problem is not that sun NFS doesn't "play nice". The problem is that the PROM's IP stack (not inetboot, not NFS) cannot handle fragmented IP datagrams unless the fragments arrive in order, and Linux sends fragments in reverse order because that is most convenient for it. [16:32:36] (again, Linux's IP stack; it has nothing to do with NFS) [17:21:49] fragments shouldn't have to arrive in any particular order [17:22:04] they should be able to arrive in random order and still be assembled correctly [17:22:52] but, from what you say, I guess that's why nfsv4 seems to work fine when booted from a regular os boot as compared to the inetboot [17:24:01] That is true, but the amount of memory available in the PROM for data and especially code is quite small, so they took a shortcut knowing that their servers would send them in order and that ethernet switches generally do not reorder packets (and when this code was originally written, it only supported booting from a server on the same network anyway) [17:24:30] so perhaps nfsv3 on linux is sending smaller packets that don't need to be fragmented [17:24:38] Well, that's the other thing. I wasn't aware inetboot supported nfsv4 at all. [17:24:46] nfsv4 is the default [17:24:55] Uh, I doubt that. [17:25:08] It's a completely different protocol. [17:25:19] ok, perhaps 3 is the default unless the server claims to support nfsv4 [17:25:56] what I can tell you is that the inetboot can't nfsmount what linux is serving out unless I change, on solaris client inetbooting, NFS_CLIENT_VERSMAX from 4 to 3 [17:26:18] even though the linux nfs server is indeed doing nfsv4 [17:26:39] What do you mean, "on the solaris client"? Where do you make this change? [17:27:46] you make it in the inetbooted environment at: /etc/default/nfs [17:28:19] Ah, in a place that has nothing to do with inetboot. 
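The in-order shortcut described at 17:24:01 can be contrasted with general reassembly in a small sketch. This is not the PROM's code; the `Fragment` type and both function names are hypothetical, and offsets are plain byte offsets rather than the 8-byte units real IP headers use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    offset: int           # byte offset of the payload within the datagram (simplified)
    payload: bytes
    more_fragments: bool  # mirrors the IP "MF" flag: False only on the last fragment


def reassemble_in_order(fragments):
    """Accept fragments only in ascending offset order.

    Needs just one running buffer and no per-fragment bookkeeping, which is
    the sort of saving a memory-constrained boot PROM might be after.
    Returns None as soon as a fragment shows up out of order.
    """
    buf = bytearray()
    for frag in fragments:
        if frag.offset != len(buf):
            return None  # out-of-order or missing fragment: give up
        buf += frag.payload
        if not frag.more_fragments:
            return bytes(buf)
    return None  # the last fragment never arrived


def reassemble_any_order(fragments):
    """Buffer every fragment, then stitch them together by offset."""
    by_offset = {}
    total = None
    for frag in fragments:
        by_offset[frag.offset] = frag.payload
        if not frag.more_fragments:
            total = frag.offset + len(frag.payload)
    if total is None:
        return None  # never saw the final fragment
    buf = bytearray()
    while len(buf) < total:
        chunk = by_offset.get(len(buf))
        if not chunk:
            return None  # hole in the datagram
        buf += chunk
    return bytes(buf)
```

Feeding the same fragments in reverse order (as Linux is said to send them) makes the first function give up while the second still succeeds, at the cost of holding every fragment in memory.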
[17:28:47] well, you do a : boot net - install on solaris [17:28:56] that starts an inetboot [17:29:08] it dumps you to a shell after a bunch of bogus warning messages [17:29:23] edit the default/nfs file and set NFS_CLIENT_VERSMAX=3 [17:29:29] exit the shell, the installer continues [17:29:41] --- Dale Ghent has become available [17:29:44] at some point it asks you where the install media is [17:29:50] and you point it to the nfs linux server [17:30:10] No, "inetboot" != "a network boot". inetboot is the name of a particular software component which is used during a particular phase of the network boot process. It handles fetching the kernel itself and a few key modules (as requested by the kernel) needed to get the system up and running. By the time anything can read /etc/default/nfs, the real OS is up and running and both inetboot and the PROM's IP stack are long gone. [17:30:13] which only works if you set that clientmax to 3, o.w. it gives a non-detailed error message about not being able to mount things [17:30:50] k, then, it's a different problem and something else is broken [17:31:19] inetboot essentially serves the same function as ufs boot blocks [17:31:32] tho, I should add that the /etc/default/nfs is just a ramdisk running [17:32:05] er? [17:32:06] Right; if you're having problems with a net-booted Solaris system mounting a volume from a Linux nfsv4 server, that is completely unrelated to the problem that makes netboot not work at all when the NFS server you're booting from is Linux. [17:32:26] yeah, the initial fetching is via tftp, I think ... nfs not involved [17:32:46] it is [17:32:46] Yes, inetboot serves the same function as UFS boot blocks, but for network boots. [17:32:58] Only the fetching of inetboot is via TFTP. [17:33:29] --- abo has left [17:34:10] --- abo has become available [17:34:16] well, whatever... 
still a PITA :) [17:34:40] --- stevenjenkins has left [17:35:08] dale, i will inject your afs_conn patch into rt and get it dealt with later tonight, unless you want to stick it in rt [17:35:41] 'boot net' issued, OBP gets its IP info (DHCP or bootp), contacts the install server specified in the DHCP/BOOTP response via TFTP to get the inetboot image. It then executes the retrieved inetboot with whatever options were passed to 'boot' including the install server info, inetboot then makes an NFS mount to the install server and boots the kernel it's expecting to find... and also passes along any flags to the kernel that were passed to it [17:36:07] derrick, does it pass muster (in terms of the rename?) [17:36:33] Correct. where "any flags" is actually "only specific flags, plus the kernel image name and command-line arguments", because each stage parses the flags word into a bitmask and then re-emits it before passing it on to the next phase. :-( [17:37:03] I'll go ahead and stick it in RT for ya [17:37:56] i need to verify it further but it seems reasonable. the struct conn rename isn't unreasonable: it's not public [17:37:57] --- stevenjenkins has become available [17:37:59] wonder if inetboot just uses nfsv3... [17:38:09] let [17:38:15] let's look at the source! [17:38:33] http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/psm/stand/boot/sparc/common/inetboot.c [17:38:43] --- Moose has left [17:40:51] let's look at a beer. another beer, i mean. the last one was too small. maybe a much larger one [17:43:25] 1 Liter! [17:43:33] 5 liters is 1 too many [17:43:36] I assume it does, though I haven't looked at the source lately. [17:43:46] I have a dogfish head black & blue that will be opened shortly [17:43:47] Note that wanboot may work differently. [17:43:51] 5 liters is 3 too many [17:44:07] I disagree. 4 liters was just about right. [17:44:14] ouch [17:44:21] mmm... 
Hofbrau House [17:44:29] that 5th liter was definitely the problem [17:44:37] hofbrauhaus sadly not opening on 1/20 like planned. i have to wait til 2/24 [17:44:49] shadow - which one? Philly? [17:45:08] pittsburgh. philly? hah! they don't get on [17:45:11] one [17:45:22] thought it was Philly, not Pitts for some reason [17:45:41] 3 blocks awy [17:45:43] way [17:45:47] whatever [17:45:56] suits me fine... Pitts has the Church Brewery [17:46:29] damn... I keep forgetting to stop down at Ithaca Beer to try some of their new stuff [17:49:01] church brew not 3 blocks away [17:53:39] Where, exactly, is it? I haven't been able to figure that out. [17:54:03] is what? [17:57:23] the hofbrauhaus [17:57:50] bug submitted [17:57:52] south water st at south 27 [17:57:56] th [18:04:10] here's a question [18:04:43] is there a tool that'll let you take a volume dump and output a tar file ? (no, the afs permissions don't matter) ? [18:05:02] not directly [18:05:11] one could be written [18:05:50] either wrapping dumpscan library or tweaking dumptolo [18:05:56] dumptool [18:06:27] ah good [18:06:48] thanks (was asking that for a fellow cow-orker) [18:30:42] I am very tempted to add the ability to mount a volume dump under the Freelance root.afs volume as a feature for a future Windows release. [18:36:47] --- Dale Ghent has left [18:37:14] Yeah, emitting a tarball from a volume dump seems doable. It seems like the hard part would be marrying dumpscan with something that understands tar [18:37:48] SecureEndpoints: i would like to see that feature; I'd like to see more centralized features that worked similarly [18:38:57] you mean a dumpserver which can serve the afs protocol? [18:39:06] what do you mean by centralized? [18:39:19] a server which can serve afs dumps via the afs protocols? [18:39:27] no, just an extended ability to have clones online, navigable from the location database [18:39:41] sure, so would the rest of us [18:39:48] well, I'm just saying... 
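For the "something that understands tar" half of the dump-to-tarball idea (18:37:14), a minimal sketch using Python's standard `tarfile` module is below. The dump-parsing side is not shown: `entries` is a hypothetical stand-in for whatever a dumpscan-based reader would yield, and `write_tar` is an illustrative name, not an existing tool. AFS ACLs are dropped, as the question at 18:04:43 allows.

```python
import io
import tarfile


def write_tar(entries, out):
    """Write (path, data, mtime) tuples to `out` as an uncompressed tar stream.

    `entries` is assumed to come from a volume-dump parser (not shown here);
    each tuple is a file path, its contents as bytes, and a Unix mtime.
    """
    with tarfile.open(fileobj=out, mode="w") as tar:
        for path, data, mtime in entries:
            info = tarfile.TarInfo(name=path)
            info.size = len(data)   # tar needs the size before the data
            info.mtime = mtime
            info.mode = 0o644       # fixed mode: AFS permissions are ignored
            tar.addfile(info, io.BytesIO(data))
```

Because each member is added as soon as it is produced, the parser never has to hold the whole volume in memory, only one file at a time.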
[18:39:53] how do you do it without extending the vldb? [18:40:01] or using all the volume ids? [18:40:45] Extend the volserver and/or fscm interfaces, maybe, so if the CM knows where the RW site is for a volume, it can ask that server what other clones it has. [18:41:18] that's not a bad idea [18:44:19] Yeah, it needs some fleshing out, but I think with a new FS RPC, a bit of extension to the volume package, and some clever CM enhancements, you could get something pretty nice. [18:44:35] That has a very appealing flavor [18:45:28] Incidentally, I can totally imagine how a (read-only) dumpserver would work, built almost entirely on code I already have. [18:45:38] sure [18:51:29] i know how to extend the volume package ~easily [18:52:01] I would like to avoid wasting the volume id space on clones. combining a volume id and a timestamp might work well. Especially for the existing user interfaces we would be back ending [18:52:09] Well, it's really just "tell me all the clones of this RW", which probably already exists in some form. [18:52:37] Given the current fscm interface, you can't avoid it. [18:53:55] is changing the interface out of the question? I think it's ok for existing clients to not be able to see snapshot clones while new clients can [18:54:04] Inventing a whole new RPC interface where files are identified by more than just a fid, and volumes by more than just a volume ID, and so on, is a much larger and messier project than adding minor extensions to the fileserver and cache manager to allow the latter to present a useful view of the clones people are already creating. [18:55:13] Changing the interface is not out of the question, but it is a much larger change, both protocol-wise and in implementation. 
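The "volume id plus timestamp" idea from 18:52:01 and the "tell me all the clones of this RW" query from 18:52:09 might look roughly like the sketch below. It is purely illustrative: `CloneIndex` and its methods are hypothetical names, not part of the OpenAFS volume package or any existing RPC, and timestamps are plain integers.

```python
from bisect import bisect_right
from collections import defaultdict


class CloneIndex:
    """Tracks snapshot clones per RW volume without allocating new volume IDs.

    A clone is identified by the pair (rw_volume_id, timestamp).
    """

    def __init__(self):
        # rw_volume_id -> sorted list of clone timestamps
        self._clones = defaultdict(list)

    def add_clone(self, rw_volume_id, timestamp):
        stamps = self._clones[rw_volume_id]
        pos = bisect_right(stamps, timestamp)
        # Skip duplicates so each (volume, timestamp) pair stays unique.
        if pos == 0 or stamps[pos - 1] != timestamp:
            stamps.insert(pos, timestamp)

    def list_clones(self, rw_volume_id):
        """All clones of this RW volume, oldest first."""
        return [(rw_volume_id, ts) for ts in self._clones.get(rw_volume_id, [])]

    def clone_as_of(self, rw_volume_id, when):
        """Newest clone taken at or before `when`, or None if there is none."""
        stamps = self._clones.get(rw_volume_id, [])
        pos = bisect_right(stamps, when)
        return (rw_volume_id, stamps[pos - 1]) if pos else None
```

Keeping the timestamps sorted per volume means thousands of clones cost no extra volume IDs, and a "what did this look like at time T" lookup is a binary search.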
[18:55:33] I expect that if snapshot clones become useful that we are going to have to be able to support thousands of clones [18:55:42] yes [18:56:24] (it will require changing how both the FS and CM track volumes and vnodes, and all the internal interfaces related to that, and all the fsint and cbint calls related to that, and so on) [18:57:04] no argument. [18:57:20] whereas I think you can do something to find existing clones between now and the workshop [18:57:31] s/you can/someone could/ [19:00:52] that is a surprising amount of agreement [19:04:11] the short-path approach doesn't seem out of the question to you, SecureEndpoints? [19:10:23] I'm agreeing that doing the right thing is a lot of work [19:13:17] that's true. it's clear that using clones as they exist today can work for at least a umich-scale environment, jhutz's proposed design is not un-right, in a meaningful sense [19:14:08] while I believe that the short term approach would be useful to some sites I do not believe it is a general purpose snapshot mechanism nor do I believe it would be worthwhile integrating into the MacOS X or Windows environments [19:14:30] it could be in macos if it could look like timemachine [19:14:39] but i don't have faith it could [19:14:47] being forced to support the short term approach may very well compromise the ability to support the broader approach on the same systems [19:15:26] it feels premature to jump there [19:15:30] i'd have to see what the short term approach is to agree or not. 
you may well be correct [19:18:21] efforts that can deliver results with smaller investment of time and resources are in my experience usually worth pursuing [19:18:29] even if they are replaced [19:18:51] the problem is that once we deploy an interface we must support it [19:19:00] deprecating interfaces is extremely hard [19:19:22] perhaps there would be a way in this case to separate the interface from the user visible function [19:19:34] it is perfectly fine for sites to deploy their own extensions that we do not deploy as part of the openafs distribution [19:21:35] in fact the protocol standardization process leaves the door open for assignments in order to permit this to occur [19:21:37] sure look at the stupids that are still in rx from pre 3.4a. doesn't mean we can't see if we can get that far but i wouldn't promise we could [19:32:27] i would happily review a fscm extension. i would not wow that hurts [19:32:34] i need asprint [19:32:42] aspirin [19:33:20] anyway, i would not write it, and i would not promise i will like any version. but i will look and if it's something reasonable i would try to suppose it [19:33:24] support it [19:44:10] --- matt has left [20:00:06] Jeff, it sounds like you're arguing against doing something good because it will make it harder to do something perfect later, or make it harder to convince people to adopt the perfect thing when the good thing already does what they want. [20:06:17] a major version number change sounds like a good place to deprecate and change interfaces, etc [20:16:22] and when it is politically unacceptable to get rid of functionality you end up never changing the major version number [20:19:06] Jeff, for the people that would be happy with the restricted number of clones that approach would support, they would not want to migrate if they have already built an infrastructure around it. For the folks that want to be able to implement TimeMachine style functionality on top of AFS, it would not suffice. 
[20:22:04] we have this long list of features that are non-controversial that can be worked on. why doesn't anyone do so? [20:22:34] matt's gsoc student was working on per file acls. not completed. why not finish it? [20:23:14] fixing the file locking in the file server so that denial of service cannot be performed? or adding byte range locking support? [20:23:46] working on the rx/tcp code that is in the tree? [20:24:54] all those features would be nice :) [20:25:02] fixing the bugs in DAFS that prevent it from creating volumes [20:25:22] where "it" is any build of the 1.5.x file server [20:25:50] O_DIRECT and O_SYNC implementations [20:26:25] real time performance analysis and automated cache tuning [20:27:27] the HostAFSD backend [20:27:35] just to name a few [20:29:34] pts aliasing [20:29:59] there is this long list of stuff that was specified at hackathons years ago [20:30:31] would you say a significant amt of progress came out of this past year's GSOC? [20:31:04] it was a worthwhile experience. it didn't produce a lot of usable code [20:31:25] we got Jake out of it [20:31:35] took too long for the coders to get comfortable with what was what? [20:32:01] and perhaps Vamshi and Kiran who are now working on an msft grant on a disconnected implementation on Windows [20:32:22] hopefully that would be done so that it can be ported to the other archs [20:32:58] speaking of windows, how's the IFS stuff coming along? [20:33:10] openafs is an overwhelming code base. students do not learn C these days in school. and design and development of distributed systems is just hard. [20:33:16] 3.5 months is not enough time [20:34:00] the ifs stuff is mostly done. there are some areas I'm still not happy with and there are user interface issues that need to be dealt with [20:34:02] I've noticed that schools (Cornell included) seem to be going Java for their comp sci stuff. C is only learned if it is taken as an extra class. 
[20:34:46] even when I went to school students were taught Pascal. If you wanted to learn C you did it on your own [20:35:06] I specifically didn't go to Georgia Tech b.c. they were going from C to Pascal [20:35:32] the real problem with Java is that students who use it never learn anything about how to deal with systems that do not perform automatic garbage collection. [20:35:34] tho, Cornell's philosophy is to teach you such that you can easily learn any particular language [20:35:39] and they have no idea what a pointer is [20:35:56] tho, not teaching stuff in C seems to make me think something might be left out [20:39:08] Oh, it's worse than that. Java is an awful teaching language; it hides details you really don't need to know about but forces you to understand all sorts of complex machinery in order to build even the simplest program, so that introductory CS classes become more about learning Java than about learning about data structures or algorithms. [20:39:08] --- dev-zero@jabber.org has left: Lost connection [20:40:21] C is actually really good for learning about data structures, though something a bit higher level would work well, too. Pascal, for example, which was actually designed to be a teaching language by someone who knew something about computer science pedagogy. [20:42:43] OTOH, to a certain extent, this stuff is just Hard. All through college I was constantly amazed at the number of people around me who just didn't "get" various things, specifically in CS. And the people around me were students in Carnegie Mellon's computer science program! [20:46:42] --- dev-zero@jabber.org has become available [20:53:25] --- matt has become available [20:56:49] of course, part of it depends on how it's taught [21:53:04] --- matt has left [23:04:25] --- reuteras has become available [23:55:12] --- Russ has left: Disconnected