On Sun, 9 Oct 2011, Matevz Tadel wrote:

>>> In our experience, without the very detailed I/O monitoring, we:
>>> 1) Don't get any monitoring for a client that crashes (disconnects
>>> without a close).
>> That information can be put in the summary record, if need be. I say need
>> be because it's a relatively rare event (yes, it does happen in spurts).
>
> There could be a separate "close on disconnect" trace type that is sent in
> this case and includes all the information usually associated with close.

Sorry, I didn't make myself clear. In the summary record we could supply how
many forced close events there were. What you say is along the lines of what I
was talking about, indicating whether the close was real or forced. The
"manual close" on disconnect is the appropriate action in the collector in any
case.

>>> 2) Don't get monitoring while a client is running. Example: it's been 5
>>> hours since a job has started; is this because it is getting 1 byte /
>>> second, or because the job takes 5 hours and 1 minute?
>> some more client input would make things more efficient.
>
> What do you mean? The client would also send monitoring information,
> either directly to the monitoring host or via the server?

I mean the client sending information to the collector through the server.
This is the best way of doing it because then you need not configure every
single client.

> I'd still vote for a single trace entry for a whole vector read. And then
> have a new option for full vector read unroll as it really pushes monitoring
> overhead to a new level. Now even I have enough ;)

Good, I was leaning that way myself. Detailed monitoring is a real bear and
you can easily get to the point that the monitoring itself equals (or even
exceeds) the data being transferred.

Andy
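
For illustration only, here is a minimal sketch of the "manual close" on disconnect discussed above, as a collector might do it. The class, method, and field names are all hypothetical and are not part of any real monitoring API; the sketch assumes the collector sees open, read, close, and disconnect events keyed by a client id and a file id.

```python
from dataclasses import dataclass, field


@dataclass
class OpenFile:
    """State the collector keeps for a file that is currently open (hypothetical)."""
    path: str
    bytes_read: int = 0


@dataclass
class Collector:
    """Toy collector: tracks open files and marks closes as real or forced."""
    open_files: dict = field(default_factory=dict)  # (client_id, file_id) -> OpenFile
    closes: list = field(default_factory=list)      # one record per closed file
    forced_closes: int = 0                          # count reported in the summary

    def on_open(self, client_id, file_id, path):
        self.open_files[(client_id, file_id)] = OpenFile(path)

    def on_read(self, client_id, file_id, nbytes):
        self.open_files[(client_id, file_id)].bytes_read += nbytes

    def on_close(self, client_id, file_id, forced=False):
        f = self.open_files.pop((client_id, file_id))
        self.closes.append(
            {"path": f.path, "bytes_read": f.bytes_read, "forced": forced}
        )
        if forced:
            self.forced_closes += 1

    def on_disconnect(self, client_id):
        # The client went away without closing: close its files on its behalf.
        leftover = [key for key in self.open_files if key[0] == client_id]
        for key in leftover:
            self.on_close(*key, forced=True)

    def summary(self):
        return {"closes": len(self.closes), "forced_closes": self.forced_closes}
```

With this shape, a client that opens two files, closes one, and then crashes would yield two close records, one of them flagged as forced, and the summary would report a single forced close.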