[00:16:38] --- meffie has left
[00:16:38] --- jblaine has left
[00:19:09] --- jblaine has become available
[00:19:22] --- meffie has become available
[00:39:35] --- pod has become available
[06:55:40] --- lama has become available
[07:13:37] What's the consensus re: syslog vs. /usr/afs/logs going forward?
[07:15:10] there's a consensus?
[07:15:40] for simple installs, syslog is probably too annoying. for enterprise, syslog is probably the answer
[07:16:29] There are 'simple' OpenAFS installs?
[07:16:47] More simple than dealing with syslog already on every UNIX box in the world?
[07:17:20] on a stock linux install, you never have to touch syslog, it just works
[07:17:29] And this would be different how?
[07:17:44] log to daemon.*
[07:17:47] done
[07:18:03] enjoy. i won't be doing that
[07:18:10] What's the issue?
[07:18:12] making afs write to syslog most likely requires editing syslog.conf, and either cluttering /var/log/messages (ew), or creating a new destination, dealing with rotation, etc
[07:18:26] no it doesn't
[07:18:29] at all, on all counts
[07:18:51] didn't work on my fileservers
[07:23:01] /usr/afs/logs can be managed using the existing bos getlog command. I can't do that with syslog.
[07:23:19] Yay, custom junk for no reason!
[07:23:44] Just makes no sense to me.
[07:23:59] Let's implement and maintain something that already exists.
[07:24:06] once again it's a question of ripping out the rug from existing deployments which is something we care about. bos already exists
[07:24:15] --enable-syslog
[07:24:38] I don't know what kind of environments people deal with, but logs sitting out on server disks are useless
[07:24:45] there's no standardized remote daemon management tool for servers. if you have something standard to replace bosserver which happened to integrate syslog support including log config and fetching, then yay.
otherwise, you enjoy your syslog and i will do what i am doing and be happy
[07:24:54] I don't comprehend environments without a syslog master host where all significant information is kept
[07:25:12] well, they exist outside your comprehension then :)
[07:25:14] not only that, but something like this makes coarse metric analysis a no brainer
[07:25:41] Splunk, logstash, loggly.com, etc = simple time-series based graphing
[07:25:56] at interval, syslog(some_AFS_stat)
[07:26:13] but that data is not what is written to the /usr/afs/logs.
[07:26:14] sure. bos getlog = simple instant log fetching too. give me that, authenticated, without a web frontend i have to screw with and we'll talk
[07:26:15] you are assuming something useful is logged :)
[07:27:09] you act like setting up OpenAFS isn't already ridiculously involved
[07:27:12] afs doesn't log anything useful for analysis other than watching for gross errors unless you have debugging turned up. then I would have to question syslog's ability to deal with that volume of messages
[07:27:19] the data that would be interesting to me in syslog is audit data. for that we provide the ability to audit log to a pipe which you can redirect to syslog or whatever else you require
[07:27:22] that configuring syslog.conf would be suuuuch an undertaking
[07:27:27] setting up afs is very simple if you have a kdc.
[07:27:56] drop some binaries in place, make a directory, touch a file, run like 5 commands, done
[07:28:09] phalenor: because it doesn't have it NOW is really of no importance to me
[07:28:14] most of those being "bos create"
[07:28:34] there's a lot of status quo mindset in OpenAFS IMO
[07:28:57] even considering the "don't disrupt" issues
[07:29:22] phalenor: And syslog can absolutely handle the load
[07:29:22] take a step back. what data do you want in syslog?
[07:29:58] oh, i don't object to you having syslog.
just don't cram it down my throat
[07:30:00] I want errors, warnings, restart notices, and eventually metric data
[07:30:22] all of which can be alerted on simply with any modern tool that processes log data, like Splunk
[07:30:30] or graphed
[07:30:39] first you have to generate that data
[07:30:43] sure
[07:30:50] and you are aware that OpenAFS does not
[07:30:56] the metrics, sure
[07:30:56] --- deason/gmail has become available
[07:30:59] well, in some cases it does
[07:31:05] yes
[07:31:08] not even the categorized errors, warnings, restarts, etc
[07:31:30] restarts can be generated easily. install a bosserver notifier. i have been distributing one for about 10 years
[07:31:34] openafs logging is effectively debug out
[07:31:57] and it will remain that way until people grasp what COULD BE
[07:32:36] thing is, i don't think collecting metrics and logging are necessarily the same thing
[07:33:03] I'd rather collect metrics with a few commands than have to parse a log file.
[07:33:03] what is a metric? basic information reported somehow.
[07:33:16] phalenor: You don't *parse* the log file
[07:33:20] it is parsed for you
[07:33:22] sure. why does that somehow have to be syslog
[07:33:31] or any log
[07:33:32] it doesn't
[07:33:57] metrics collection is not the same as logging. I don't want to waste my network bandwidth writing log entries for every RPC a server performs
[07:34:09] I said coarse.
[07:34:27] ok. then focus on the real problem and not on "omg syslog", imo. because syslog is just more of the same
[07:34:34] and the same that i have is already fine
[07:34:44] metrics collection is something that is performed by a service and made available by an RPC interface for querying
[07:34:47] Just like every other "monitoring tool" in existence gathering metrics with RRD and showing you an average of the last few minutes and not every RPC
[07:35:24] so you want to collect with xstat tools. fine.
[07:36:00] and do any of those tools get their data from log files?
no. snmp, openldap's back-monitor, xstat*, etc.
[07:37:32] phalenor: you're making the argument that Established Monitoring Methods do not work off of syslog data, historically, and I am making the case that syslogging basic metric data every 5 minutes for a Very Basic View Into Your Shit is not a stretch
[07:38:09] I have never suggested here (perhaps you guys keep missing the "coarse"), "Guys let's do high-end real-time performance monitoring with syslog()!"
[07:38:15] so syslog it from the agent that collects it from (xstat, bos, etc). do it. you don't need anyone's help. distribute it. it will get improvements.
[07:38:42] sure, and you can do that. write a tool that probes all of your services that you wish to monitor at the interval you care about and write the data wherever you wish. however, I wouldn't write it to syslog. I would put it into a database for RRD
[07:41:51] I will look into it.
[07:42:19] I just feel like 9 different people over the last 20 years have been homebrewing their own stuff
[07:42:28] I can't imagine what Tom K went through
[07:42:48] publishing metrics is not new tech
[07:43:01] sure. if openafs distributes something we could solve that. we could have a separate git module and a separate release cycle but official packages
[07:43:54] and I hate to say it, but what Tom K went through resulted in yet another set of code/parsers/reporter code
[07:44:04] when, IMO, there is a fundamental flaw
[07:44:24] Tom as in Keiser? he got a portion of that data from my cell without me having to do a darn thing.
[07:44:40] yes, he queried your cell
[07:44:49] I am talking about ALLLL of the work he did with that data afterward
[07:45:07] parsing xstat output, etc
[07:45:07] which he would still have to do even with syslog 'coarse' metrics
[07:45:10] well, that he can query an unmodified cell has value
[07:45:12] OMG
[07:45:12] so ask him to open source it.
what he did with the data afterwards has nothing to do with data collection
[07:45:45] phalenor: There are existing tools for graphing any field in syslog data over time.
[07:45:54] phalenor: It's not 1990 anymore.
[07:46:31] that would be a lot of data going into syslog
[07:46:46] syslog can handle it
[07:47:15] uh yeah, syslog can handle 5 messages from 20 servers every 5 minutes
[07:47:24] Anyway.
[07:47:36] I'll start something on openafs-devel
[07:50:44] I'd be curious, purely for academics' sake, to compare the resource requirements for SNMP agent polling vs. syslog pushing
[07:51:21] I'd put $20 on SNMP being grossly more costly
[07:51:46] apples and oranges, for sure, but like I said -- purely academic
[07:54:23] --- jaltman/FrogsLeap has left: Replaced by new connection
[07:54:24] --- jaltman/FrogsLeap has become available
[08:17:51] --- jaltman/FrogsLeap has left: Replaced by new connection
[08:17:52] --- jaltman/FrogsLeap has become available
[08:20:22] --- jaltman/FrogsLeap has left: Replaced by new connection
[08:20:23] --- jaltman/FrogsLeap has become available
[08:23:10] --- jaltman/FrogsLeap has left: Replaced by new connection
[08:23:11] --- jaltman/FrogsLeap has become available
[08:44:10] --- meffie has left
[08:45:51] --- jaltman/FrogsLeap has left: Disconnected
[08:53:57] --- jaltman/FrogsLeap has become available
[09:00:38] --- jaltman/FrogsLeap has left: Replaced by new connection
[09:00:39] --- jaltman/FrogsLeap has become available
[09:24:24] --- jaltman/FrogsLeap has left: Disconnected
[09:26:38] --- jaltman/FrogsLeap has become available
[09:35:16] --- jaltman/FrogsLeap has left: Disconnected
[09:35:23] --- jaltman/FrogsLeap has become available
[10:03:58] --- jaltman/FrogsLeap has left: Replaced by new connection
[10:03:59] --- jaltman/FrogsLeap has become available
[10:07:09] --- jaltman/FrogsLeap has left: Disconnected
[10:07:16] --- jaltman/FrogsLeap has become available
[10:46:41] --- jaltman/FrogsLeap has left: Disconnected
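For reference, the "at interval, syslog(some_AFS_stat)" idea from the morning discussion could look roughly like the C sketch below. It is only an illustration under stated assumptions: the `afs_metric` line layout, the function names, and the sample metric name are all hypothetical and not part of OpenAFS. The only facts taken from the log are the use of syslog() and the daemon facility ("log to daemon.*").

```c
#include <assert.h>
#include <stdio.h>
#include <syslog.h>

/*
 * Build one key=value line for a single coarse metric.
 * The "afs_metric server=... name=... value=..." layout is an assumption
 * made for this sketch, not an OpenAFS format.
 * Returns the number of characters written, or -1 if the line did not fit.
 */
int format_metric_line(char *buf, size_t len, const char *server,
                       const char *metric, long value)
{
    assert(buf != NULL && server != NULL && metric != NULL);
    int n = snprintf(buf, len, "afs_metric server=%s name=%s value=%ld",
                     server, metric, value);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}

/* Hand one metric to the local syslog daemon at daemon.info. */
int log_metric(const char *server, const char *metric, long value)
{
    char line[256];
    if (format_metric_line(line, sizeof(line), server, metric, value) < 0)
        return -1;
    syslog(LOG_DAEMON | LOG_INFO, "%s", line);
    return 0;
}
```

A collector would call log_metric() once per metric per polling interval; at the volume mentioned above (5 messages from 20 servers every 5 minutes) that is about 100 lines per interval. Keeping the formatting separate from the syslog() call means the same line could also be printed or written elsewhere.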
[10:51:33] --- jaltman/FrogsLeap has become available
[10:59:24] So if I submitted a patch to xstat_fs_test that adds -syslog as an option which syslogs the collected data instead of printing it to screen, it would be rejected?
[11:01:02] (on principle)
[11:02:30] I would suggest that xstat_fs is the wrong tool for syslog output. You should write a tool that collects data and sends it to syslog
[11:03:04] instead of using the existing code that collects data and sends it to screen
[11:04:36] you realize that xstat_*_test was written as a debugging tool. It doesn't exactly have the behavior I would want in a general purpose data collection service
[11:08:28] I realize that. I am trying to make use of what exists, which, aside from spitting data to the screen in a non-format, does something more useful for myself and others who have a modern syslog aggregation+analytics world
[11:10:04] I think it's realistic to prophesize that xstat_*_test, regardless of their original intentions, are what will exist for xstat data collection for many years
[11:10:08] I would argue that what you are proposing is a hack. Much like the many hacks that you have complained in recent weeks about us being forced to support into the future because people depend upon it. If you want to commit us to supporting something, don't hack it. Design it
[11:10:28] we don't ship them
[11:10:43] or at least not on all platforms
[11:11:19] what, xstat_*_test?
[11:12:01] you want a library to collect xstat statistics information, don't you? (and then output it to syslog or wherever)
[11:12:10] they build with 'make' from the top level
[11:12:16] then use the xstat library, not a program that links to xstat and prints stuff to stdout
[11:12:19] we don't use shipped packages
[11:12:35] there are a lot of things that build with make
[11:12:51] if it builds with 'make' from the top-level, it shipped with it
[11:13:11] different definition of 'shipped' then.
[11:14:20] (if it builds with make from the top-level and doesn't require --build-unsupported-too or somesuch, that is)
[11:15:59] deason: Yes, that's the "discussion". Everything other than printf() calls every frequency-seconds is already done. My view is 'allow a flag to alter the destination of the output'
[11:17:25] and there's your flip-side argument to make a new binary because the output destination is not stdout
[11:17:46] ah, no, for some reason I thought you were talking about wrapping xstat_*_test, interpreting the output and printing it to syslog
[11:17:47] and there's Jeff's argument to start an N-year project to implement something grand
[11:17:55] oh HELL NO
[11:18:23] I'm saying call syslog() instead of printf(), and with a more concise format for the syslog line(s)
[11:18:29] via --syslog
[11:18:34] err, sorry, -syslog
[11:19:05] I think the view of many is that we don't want xstat_*_test to become the statistics-gathering tools that are considered the de facto "best" way to collect statistics, but I think that ship may have sailed at this point
[11:19:14] exactly
[11:19:33] they are intended to be "test" programs; that is, examples of how to use the xstat framework, so they don't work very well in many ways
[11:19:48] they work far better than nothing
[11:20:01] and resources are in short supply, clearly
[11:20:15] so it is very obvious to me that the "most correct" way forward is to write something different, whereas you want something that is quick and works
[11:20:49] quick, works for me, works for others, and is not disruptive
[11:20:51] the point of view I hear from jeff and probably others is that we're getting a bit sick and tired of quick'n'dirty solutions, since it seems like that's all we have these days, and we're stuck with the results afterwards
[11:21:09] but that's just my impression
[11:21:17] I can completely understand that
[11:24:49] > instead of using the existing code link the collection part and replace the print part.
multiple source files
[11:26:15] and so, here's the rub -- if I copy xstat_fs_test.c, make my "own" 'xstat_logger' which defaults to printing to the screen in a much more sane format than the godawful mess that _test is, allows -csv to stdout where reasonable for a collID, and allows -syslog ... would THAT be rejected on principle as well?
[11:27:24] not copy. split.
[11:27:54] take the xstat collection out of the test program. link it into that. link it into your new thing
[11:28:24] put the stuff that's common between them into xstat_fs_common.c, then make xstat_fs_whatever.c, and link xstat_fs_common.o and xstat_fs_whatever.o into your new thing
[11:28:38] old thing is xstat_fs_test.o and xstat_fs_common.o linked
[11:28:42] yup
[11:28:44] okay, I'll take a look
[11:28:47] I follow
[11:51:15] This doesn't look very extractable :)
[12:08:58] --- Jeffrey Altman has become available
[12:19:58] * jblaine sees everyone from openafs-info is diving into the Tiny Simple Tasks...
[12:57:50] --- meffie has become available
[13:03:54] jblaine: i have something maybe useful for you; /afs/sinenomine.net/user/mmeffie/public/patches/xstat-fs-test-format-2100430.diff
[13:05:43] (not exactly what you want, but)
[13:06:08] I have no global AFS cell access
[13:06:18] can you email it to jblaine@kickflop.net perchance?
[13:06:24] also, note the afsmonitor (yuck) uses xstat libs and writes to files.
[13:07:24] you can see that path on http://git.sinenomine.net/~mmeffie/patches
[13:08:25] if anything should log to syslog, maybe that could. you could even make a front end to the file that defines the stuff you want to monitor.
[13:11:39] gosh
[13:12:25] it's a shame that never made it into openafs.git
[13:12:34] thank you
[13:16:26] My thing is, I don't want to have patches around like this in our environment to apply with every release we roll out. It makes no sense for people to keep reinventing wheels like this in silo'd places.
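The "not copy. split." suggestion from 11:27 to 11:28 (shared collection code, separate output front ends, destination chosen by a flag such as -syslog) could be sketched in C as below. This is a hedged illustration only: choose_sink(), report_sample(), the emit_* functions, and the "server=... value=..." line are hypothetical names and formats invented here, and nothing in it calls the real xstat library. Only the -syslog flag name and the printf()-versus-syslog() choice come from the discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* An output front end: takes one already-formatted line of statistics. */
typedef void (*emit_fn)(const char *line);

/* Front end matching what the test program does today: print to stdout. */
void emit_stdout(const char *line)
{
    printf("%s\n", line);
}

/* Alternative front end: hand the same line to syslog at daemon.info. */
void emit_syslog(const char *line)
{
    syslog(LOG_DAEMON | LOG_INFO, "%s", line);
}

/* Pick the front end from the command line; stdout stays the default. */
emit_fn choose_sink(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-syslog") == 0)
            return emit_syslog;
    }
    return emit_stdout;
}

/*
 * Stand-in for the shared collection side (what would live in something
 * like xstat_fs_common.c): it formats a sample and passes it to whichever
 * front end it was given, without knowing where the output ends up.
 */
int report_sample(emit_fn emit, const char *server, long value)
{
    char line[128];
    assert(emit != NULL && server != NULL);
    int n = snprintf(line, sizeof(line), "server=%s value=%ld", server, value);
    if (n < 0 || (size_t)n >= sizeof(line))
        return -1;
    emit(line);
    return 0;
}
```

With this shape the existing test program and a new logger would share report_sample() and differ only in which emit_fn they pass, so a later -csv front end would be one more small function rather than another copy of the collection code.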
[13:16:57] So my take is, I'm either going to do something that's going to get merged, or I'll just do without.
[13:16:57] i didn't push it because, i suck, and i wanted to see if i could integrate with christov's work
[13:17:32] christov's work? Is that another fork of code sitting around somewhere for N years with cool stuff in it? :)
[13:17:42] based upon the date of the patch I assumed mike was trying to do something that would be consistent with the effort to add additional output types to vos, pts, etc.
[13:18:00] i started mine *just* before that. :)
[13:18:56] I suggest you go back and read the mail history. there are also patches for adding uuid printing sitting in gerrit that got blocked on the desire to have someone do something consistent.
[13:30:47] are initcmd()s restricted in signature?
[13:30:54] --- lama has left: Lost connection
[13:31:03] seems they'd have to be
[13:33:29] --- lama has become available
[13:35:51] ah, I see,
[14:22:16] --- lama has left
[14:41:41] sweet, I extracted the poller from xstat_fs_test.c
[14:41:41] --- jblaine has left
[14:43:25] --- jblaine has become available
[14:50:12] --- meffie has left
[15:43:23] --- deason/gmail has left
[15:45:12] --- lars.malinowsky has become available
[17:35:54] --- rra has left: Disconnected
[17:53:39] --- Russ has become available
[19:56:27] --- jakllsch has left
[20:00:09] [jblaine@dev openafs]$ git push
fatal: The remote end hung up unexpectedly
[jblaine@dev openafs]$
[20:02:39] oh, haha
[21:46:44] --- jaltman/FrogsLeap has left: Disconnected
[21:50:50] --- jaltman/FrogsLeap has become available
[22:15:10] --- jaltman/FrogsLeap has left: Disconnected
[22:32:13] --- Russ has left: Disconnected