[00:32:38] --- lars.malinowsky has become available
[00:55:16] --- Russ has become available
[02:28:18] --- Russ has left: Disconnected
[03:09:23] aamaadmi: I'm around now...
[03:21:24] --- Simon Wilkinson has left
[03:22:20] --- Simon Wilkinson has become available
[03:23:19] --- aamaadmi has left
[03:23:28] --- aamaadmi has become available
[03:28:40] Simon Wilkinson: Yeah.
[03:28:57] So, the quick answer to your question is "not easily".
[03:29:12] Basically, filesystems tend to deal in names as little as possible.
[03:29:34] So, even the cache manager doesn't know the name of a file when it's doing a read or a write - it just knows the inode number.
[03:30:07] (Even if the cache manager _did_ know the name, it doesn't send that information to the fileserver. Making it do so would be a major protocol change)
[03:30:38] Okay, so let me change my question to the underlying problem, and then maybe you can suggest what I should do.
[03:31:05] I basically want to know which "directories" got dirtied between point of time A and point of time B.
[03:32:00] If I get the file paths, then the task becomes trivial; if I don't, I will need to play the volume at hand, I guess?
[03:34:24] I saw some mention of "volutils" by Kim Kimball, but unfortunately could not locate the code (if it is public).
[03:35:19] You're probably best asking on the OpenAFS mailing list about volutils. I'm pretty sure that it is publicly available, just not sure where from.
[03:35:59] Can you think of any other online way of doing it?
[03:46:04] It really depends on the volume of data you are going to be dealing with.
[03:46:26] The problem is that the "directory" is a pretty artificial unit of organisation when you get down to the way that filesystems are implemented.
[03:47:47] Erm, I do not follow completely. Can't we assume Linux-like directory access and traversal?
[03:49:13] The problem is that when you get down to how the majority of filesystems are implemented, directories are just an overlay.
[03:49:34] A filesystem, ultimately, is a way of mapping an identifier (typically a 32-bit number - the inode number) to a blob of data.
[03:49:54] FIDs are AFS's version of inode numbers. So, what the AFS RPCs do is map FIDs to blobs of data.
[03:50:27] On top of that, you have directories. A directory is a blob of data which contains a list of names, some metadata about those names, and the inode number corresponding to each of those names.
[03:51:22] The 'root' directory is stored at a known inode number (typically 1). By fetching the blob of data representing the root, looking up the name you are looking for, fetching the blob of data corresponding to that name (which may be another directory), and so on, you can traverse a whole directory tree.
[03:51:26] Doing this in reverse is hard.
[03:54:36] Yes, I understand the point now.
[03:56:19] And I don't think the OpenAFS community finds it enticing enough to make protocol changes just to log path names.
[03:59:09] It would be a very intrusive change. And the cache manager doesn't really have the information that's required either.
[03:59:27] Bear in mind that the same 'name' can point to multiple inodes.
[03:59:46] On some operating systems you have the information necessary to work it out. On others it's harder.
[04:00:15] The way to achieve what you seem to be trying to do with AFS is to do it on a volume level. AFS provides you with information about whether a particular volume is dirty or not.
[04:01:25] What sort of information is that?
[04:06:59] If you do a vos examine, you can see when data was last written to the volume.
[04:17:03] Hmm, but I guess I will be best off using something like a directory traversal.
[04:17:13] Though brute force, that looks more like it.
[04:18:02] Or write a hack which could work on my file system,
[04:18:57] I guess I could extend the vaList that is passed to audit.c via the CacheManager, to append the file information.
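The identifier-to-blob model described at 03:49 to 03:51 can be illustrated with a small in-memory sketch. Everything in it is hypothetical: the `store` dict stands in for the fileserver, small integers stand in for FIDs/inode numbers, and a directory is modelled as a plain dict of name to identifier. It is not OpenAFS code; it only shows why forward lookup is a simple walk while the reverse direction needs an index built by a full traversal.

```python
# Toy model: a "filesystem" maps an identifier to a blob of data.
# Integers stand in for FIDs/inode numbers (hypothetical, not real AFS FIDs).
ROOT = 1

store = {
    1: {"home": 2, "README": 3},   # root directory: name -> identifier
    2: {"notes.txt": 4},           # the directory /home
    3: b"top-level readme",        # regular file
    4: b"some notes",              # regular file
}

def resolve(path):
    """Forward lookup: walk from the root, one directory blob per component."""
    current = ROOT
    for name in [p for p in path.split("/") if p]:
        blob = store[current]
        if not isinstance(blob, dict):
            raise NotADirectoryError(path)
        current = blob[name]       # KeyError if the name does not exist
    return current

def build_reverse_index():
    """One-time full traversal producing identifier -> (parent, name)."""
    index = {}
    stack = [ROOT]
    while stack:
        parent = stack.pop()
        for name, child in store[parent].items():
            index[child] = (parent, name)
            if isinstance(store[child], dict):
                stack.append(child)
    return index

def path_of(fid, index):
    """Reverse lookup; only possible once the index has been built."""
    parts = []
    while fid != ROOT:
        fid, name = index[fid]
        parts.append(name)
    return "/" + "/".join(reversed(parts))
```

The sketch assumes every object has exactly one parent and one name. Hard links break that assumption, which is one reason the reverse mapping is harder on some systems than on others.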
[04:19:07] You could certainly produce and maintain a hash table that goes from FIDs to names as part of your audit log handler.
[04:19:41] That vaList only contains the information that's known to the fileserver. The fileserver doesn't know the name of the file - all it is asked to do is to perform an action on a particular FID.
[04:20:55] (the only time the fileserver sees a "name" is when it is asked to perform an operation on a directory, such as creating a new object, renaming an existing object, deleting an object, etc)
[04:22:02] In fact, all I actually want to know is which directories are dirty. I am not concerned with the "files" as such.
[04:22:41] Are you interested in which directories have had new objects created in them, or in directories where files have changed>
[04:22:42] ?
[04:23:22] All directories where files have changed (write/create/append/del)
[04:23:29] s/del/rm
[04:25:52] Okay, so you do need to know about each file. But your hash table could then contain for each FID the name of its parent directory.
[04:26:04] I still think you should take a look at whether using volumes will work for you.
[04:26:59] I do not understand what you mean by using volumes. I thought that's what was meant by creating a hash table?
[04:27:26] Using AFS volumes. A volume is a unit of organisation in AFS - it contains one or more directories and any number of files.
[04:27:55] Volumes already store information about whether they are dirty or not, when they were last modified, and so on.
[04:28:15] Anyway, back later.
[04:28:27] Ahh, okay I will catch up with you about that.
[06:16:13] ok, i'm dumb. i should have had some tea. sorry.
[08:06:24] aamaadmi: as far as I am aware NASA never signed off on the public release of volutils.
[08:07:18] jaltman/FrogsLeap: Darn, that's bad news for me I guess.
[08:09:26] why not just enable auditlogs, direct them to a named pipe, and filter them?
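The "named pipe, then filter" suggestion at 08:09 could look roughly like the sketch below on a POSIX system. The pipe path, the helper names `make_fifo` and `filter_audit_lines`, and the substrings being matched are all placeholders; the actual text of OpenAFS audit records is not shown in this log and would have to be checked against real output. How the fileserver is pointed at the pipe is not covered here.

```python
import os

def make_fifo(path):
    """Create the named pipe if it does not already exist (POSIX only)."""
    if not os.path.exists(path):
        os.mkfifo(path, 0o600)

def filter_audit_lines(fifo_path, wanted):
    """Yield lines read from the pipe that contain any of the wanted substrings.

    `wanted` holds placeholder event markers; real audit records would need
    matching rules derived from actual fileserver output.
    """
    # Opening a FIFO for reading blocks until a writer opens the other end.
    with open(fifo_path, "r") as fifo:
        for line in fifo:
            if any(token in line for token in wanted):
                yield line.rstrip("\n")
```

The generator ends when the writer closes the pipe, so a long-running filter would reopen the pipe in a loop.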
[08:16:32] shadow@gmail.com/barnowlA9683AD7: I do not completely follow. Can you elaborate a bit please.
[08:16:57] What sort of pipe are we talking about here?
[08:17:28] "a named pipe" is exactly descriptive of the type of pipe we are talking about.
[08:17:51] and the very first google hit for named pipe describes it
[08:19:06] No, but how do you extract file names from a FID? That's what I am not understanding
[08:21:14] you need to start with a list, which, well, you have to generate once. once you have it, you don't extract: you get an audit event when a file is created, renamed or destroyed. if one is created, you get a fid and a filename. there's nothing to discover: you have the exact information you need. if it's renamed, you find that out. if it's destroyed, same
[08:21:43] as far as generating a list, the afs dump scanner in src/tests can certainly do it
[08:22:13] there is no premade script which does these things. it's not as simple as "here, download this package". all the pieces are available, but assembly is required.
[08:24:30] The thing is, this hash map can be really large
[08:25:17] Hash map of the FIDs and the file names (I know I will have to live with it anyhow :) )
[08:26:45] One thing you missed out: in the audit log we do not get the file path/name.
[08:27:17] In fact, I discussed this with Simon: the FID, which is a map from AFS space to the inode, which is file system space, is all you get.
[08:28:36] i never said hash. i have many tools in my arsenal.
[08:28:46] > One thing you missed out, in the Audit log we do not get the file path/name.
[08:28:56] i missed nothing. i told you precisely how to collect that information.
[10:38:32] aamaadmi: What you are missing about volutils is how it is used. volutils is used to scan all of the contents of AFS in order to produce a database that maintains a listing of every (volume, relative path, FileID). This information is then augmented with the audit log data that is processed by another tool in real time.
(the filter reads the audit log data via a named pipe that the file server writes to.)
[10:40:01] The audit log data contains every access and every change to the file system. It includes the addition of all new files and the removal of existing ones. It includes renames. It includes data store operations. With all of that information you can answer the question of which files have been altered.
[10:52:49] --- Russ has become available
[12:58:24] --- jaltman/FrogsLeap has left: Disconnected
[13:17:06] oh fun. more hcrypto pain.
[14:43:58] --- jaltman/FrogsLeap has become available
[17:17:20] --- jaltman/FrogsLeap has left: Disconnected
[17:17:36] --- jaltman/FrogsLeap has become available
[17:46:40] hmm, do i have the wrong ?
[17:47:43] roken.h is generated from src/external/heimdal/roken/roken.h.in
[18:19:13] --- jaltman/FrogsLeap has left: Replaced by new connection
[18:19:14] --- jaltman/FrogsLeap has become available
[18:39:52] turned out i hadn't re-run ./configure recently enough
[18:40:56] huh, afsd segfaults if CellServDB doesn't have a '>' in the '>cell.tld #cell' line of the only cell in the file
[18:43:06] Oof. Apparently afsd could use some error-checking love.
[18:48:05] --- jaltman/FrogsLeap has left: Disconnected
[18:48:17] --- jaltman/FrogsLeap has become available
[19:07:25] no big shock
[19:07:34] > aamaadmi: What you are missing about volutils is how it is used.
[19:07:55] it's also not needed... you can generate the same information with afsdump_scan over one-time dumps
[19:29:24] --- jaltman/FrogsLeap has left
[19:36:42] --- jaltman/FrogsLeap has become available
[19:57:37] --- jaltman/FrogsLeap has left: Disconnected
[20:00:17] --- jaltman/FrogsLeap has become available
[20:01:05] --- jaltman/FrogsLeap has left
[22:20:34] --- reuteras has become available
[23:11:13] --- lars.malinowsky has left
[23:55:29] --- reuteras has left
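The approach described at 08:21 and 10:38 to 10:40 (start from a one-time listing, then keep it current from audit events) can be sketched as follows for the original question of which directories were dirtied. The event dicts and their keys (`kind`, `fid`, `dir`, `name`, `new_dir`, `new_name`) are invented for illustration and assume the audit records have already been parsed; they do not reflect the real audit log layout.

```python
def apply_event(event, parents, dirty):
    """Update the FID -> (parent FID, name) map and the dirty-directory set.

    `event` is a hypothetical, already-parsed dict; real audit records
    would need their own parser and may carry different fields.
    """
    kind = event["kind"]
    if kind == "create":
        # Directory operations carry a name, so the map can be extended.
        parents[event["fid"]] = (event["dir"], event["name"])
        dirty.add(event["dir"])
    elif kind == "remove":
        parents.pop(event["fid"], None)
        dirty.add(event["dir"])
    elif kind == "rename":
        old_dir, _ = parents.get(event["fid"], (None, None))
        parents[event["fid"]] = (event["new_dir"], event["new_name"])
        dirty.add(event["new_dir"])
        if old_dir is not None:
            dirty.add(old_dir)
    elif kind == "store":
        # A data write only identifies the FID; the map supplies the parent.
        entry = parents.get(event["fid"])
        if entry is not None:
            dirty.add(entry[0])

def dirty_directories(events, parents):
    """Return the set of directory FIDs touched by the given events."""
    dirty = set()
    for event in events:
        apply_event(event, parents, dirty)
    return dirty
```

`parents` would be seeded from the initial scan and kept between runs; writes to FIDs missing from the map are silently skipped in this sketch, which a real tool would probably want to log.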