LISTSERV 16.5 - XROOTD-DEV Archives


On Sun, 9 Oct 2011, Matevz Tadel wrote:
>>> In our experience, without the very detailed I/O monitoring, we:
>>> 1) Don't get any monitoring for a client that crashes (disconnects without
>>> a close).
>> That information can be put in the summary record, if need be. I say need be
>> because it's a relatively rare event (yes, it does happen in spurts).
>
> There could be a separate "close on disconnect" trace type that is sent in 
> this case and includes all the information usually associated with close.
Sorry, I didn't make myself clear. In the summary record we could supply 
how many forced close events there were. What you say is along the lines 
of what I was talking about, indicating whether the close was real or 
forced. The "manual close" on disconnect is the appropriate action in the 
collector in any case.
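
A minimal sketch of the collector-side behavior described above, assuming a
hypothetical in-memory collector (the class, method, and field names are
illustrative and are not part of the xrootd monitoring stream). It shows a
disconnect triggering a "manual close" of every file still open, each close
record flagged as real or forced, and a running forced-close count of the kind
that could be reported in a summary record.

```python
from dataclasses import dataclass, field


@dataclass
class OpenFile:
    path: str
    bytes_read: int = 0


@dataclass
class Collector:
    # Hypothetical state: session id -> {file id -> OpenFile}
    sessions: dict = field(default_factory=dict)
    # Finished transfers as (path, bytes_read, forced) tuples
    closed: list = field(default_factory=list)
    # Counter a summary record could carry
    forced_closes: int = 0

    def on_open(self, sid, fid, path):
        self.sessions.setdefault(sid, {})[fid] = OpenFile(path)

    def on_read(self, sid, fid, nbytes):
        self.sessions[sid][fid].bytes_read += nbytes

    def on_close(self, sid, fid, forced=False):
        f = self.sessions[sid].pop(fid)
        self.closed.append((f.path, f.bytes_read, forced))
        if forced:
            self.forced_closes += 1

    def on_disconnect(self, sid):
        # "Manual close": close whatever the client left open, marked forced.
        for fid in list(self.sessions.get(sid, {})):
            self.on_close(sid, fid, forced=True)
        self.sessions.pop(sid, None)
```

With this shape, a client that opens two files, closes one, and then crashes
yields one real close record and one forced one, and the forced count is 1.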

>>> 2) Don't get monitoring while a client is running. Example: it's been 5 hours
>>> since a job has started; is this because it is getting 1 byte / second, or
>>> because the job takes 5 hours and 1 minute?
>> Some more client input would make things more efficient.
>
> What do you mean? That the client would also send monitoring information,
> either directly to the monitoring host or via the server?
I mean the client sending information to the collector through the server.
This is the best way of doing it because then you need not configure every 
single client.

> I'd still vote for a single trace entry for a whole vector read. And then 
> have a new option for full vector read unroll as it really pushes monitoring 
> overhead to a new level. Now even I have enough ;)
Good, I was leaning that way myself. Detailed monitoring is a real bear 
and you can easily get to the point that the monitoring itself equals (or 
even exceeds) the data being transferred.
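
A rough back-of-the-envelope sketch of that point, assuming a fixed trace
entry size (the 16-byte figure and the function names below are assumptions
made for illustration, not values taken from this thread). It compares one
trace entry per vector read against a full unroll with one entry per segment.

```python
TRACE_ENTRY_BYTES = 16  # assumed size of one trace entry, for illustration


def monitoring_bytes(num_segments, unroll):
    """Monitoring bytes produced for one vector read."""
    entries = num_segments if unroll else 1
    return entries * TRACE_ENTRY_BYTES


def overhead_ratio(segment_lengths, unroll):
    """Monitoring bytes divided by data bytes for one vector read."""
    data_bytes = sum(segment_lengths)
    return monitoring_bytes(len(segment_lengths), unroll) / data_bytes


# A vector read of 1024 segments of 16 bytes each (16 KiB of data):
small_segments = [16] * 1024
single = overhead_ratio(small_segments, unroll=False)   # one entry in total
unrolled = overhead_ratio(small_segments, unroll=True)  # one entry per segment
```

Under these assumptions the unrolled case sends as many monitoring bytes as
data bytes, and with segments shorter than one trace entry it sends more,
while the single-entry case stays well under a tenth of a percent.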

Andy