Hi Patrick,

There is an interaction between the interval specified to the script and 
the "int" parameter supplied to the receiver, which invariably causes time 
dilation. I don't have a good solution for that other than to say 7 
minutes is close enough. Note that the documentation

http://xrootd.org/doc/dev45/cms_config.htm#_Toc454223033

says "estimated" time when specifying the "int" value. I know that isn't 
comforting, but we have the issue of trying to co-ordinate two async 
processes that compute the load statistics, so the timing will invariably be 
off without a lot more effort.
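
To illustrate how two unsynchronized timers can stretch the effective interval, here is a toy model in Python. It is only a sketch under assumptions and not a description of how cmsd is implemented: the function name, the "overhead" parameter, and the idea that the receiver waits for the first sample at or after its own timer fires are all hypothetical.

```python
def report_times(script_every, receiver_every, overhead, count):
    """Toy model of two periodic loops that are not synchronized.

    Assumptions (hypothetical, not taken from cmsd):
      - the script emits a sample at script_every, 2*script_every, ... seconds
      - the receiver waits receiver_every + overhead seconds after its last
        report, then takes the first sample emitted at or after that moment
    Returns the times (in seconds) at which the receiver reports.
    """
    times = []
    now = 0
    for _ in range(count):
        wake = now + receiver_every + overhead
        # Ceiling division: index of the first sample at or after 'wake'.
        k = -(-wake // script_every)
        now = k * script_every
        times.append(now)
    return times


# Both intervals set to 300 seconds, no overhead: reports every 5 minutes.
print(report_times(300, 300, 0, 3))   # [300, 600, 900]

# Same intervals with one second of overhead: reports every 10 minutes.
print(report_times(300, 300, 1, 3))   # [600, 1200, 1800]
```

In this model a tiny amount of slack on one side is enough to make the receiver miss every other sample, which is the kind of effect that makes the "int" value an estimate.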

As for the redirector providing those statistics, that's controlled by the 
"ping" directive

http://xrootd.org/doc/dev45/cms_config.htm#_Toc454223053

and should be printed every 10 minutes or so (by default). So, if it is 
actually 110 minutes then something is amiss here. You can, of course, 
report that as a bug in github if this is causing problems.
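
Before reporting it, the actual spacing can be measured from the log itself. The following Python sketch assumes only the log layout shown in this thread (a leading "yymmdd hh:mm:ss" stamp); the helper names are made up for illustration, and the second sample line is the first one with the stamp shifted by the ten minutes described below, not real log output.

```python
from datetime import datetime


def line_time(line):
    """Parse the leading 'yymmdd hh:mm:ss' stamp of a log line."""
    return datetime.strptime(line.strip()[:15], "%y%m%d %H:%M:%S")


def gaps_in_minutes(lines, marker, node=None):
    """Gaps in minutes between successive lines that contain 'marker'
    (and 'node', when one is given)."""
    times = [
        line_time(line)
        for line in lines
        if marker in line and (node is None or node in line)
    ]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


# First line is from this thread; the second is an illustrative copy
# with the time stamp moved forward by ten minutes.
sample = [
    "161028 00:05:59 449 Report_Usage cpu=7 net=72 xeq=0 mem=99 pag=0 dsk=53 23414869",
    "161028 00:15:59 449 Report_Usage cpu=7 net=72 xeq=0 mem=99 pag=0 dsk=53 23414869",
]
print(gaps_in_minutes(sample, "Report_Usage"))   # [10.0]
```

For the redirector log, passing marker "Node:" and the host name as node (for example "storage-23-14.local") restricts the measurement to one data server.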

Andy

On Fri, 28 Oct 2016, Patrick McGuigan wrote:

> Hi Andy,
>
> I am looking at using something very similar in my setup and I was wondering 
> about the data that I am seeing in the log files.
>
> In my data servers I am using:
> cms.perf int 5m pgm /usr/share/xrootd/utils/XrdOlbMonPerf 300
>
> In my cmsd.log files I see lines like:
> 161028 00:05:59 449 Report_Usage cpu=7 net=72 xeq=0 mem=99 pag=0 dsk=53 23414869
>
> being generated every 10 minutes.  I would have expected to see this every 
> five minutes based on the interval of 5 minutes.  Am I missing something?
>
> Also, does the log line indicate data being collected from the performance 
> monitoring program, or data being forwarded to the redirector?
>
>
> In the redirector I see lines like:
> 161028 01:45:59 27500 Node: storage-23-14.local load=9; cpu=10 net=13 inq=2 mem=99 pag=0 dsk=23368366 utl=53 shr=[100 3 0]
>
> The period between these lines (for the same data server) appears to be 110 
> minutes.  Is the line above simply a periodic summary of the conditions that 
> were last reported from the dataservers?
>
>
> Thanks,
>
> Patrick
>
>
>
> On 10/18/2016 03:23 PM, Andrew Hanushevsky wrote:
>> Hi Max,
>> 
>> The load is calculated by the script you specify to run using cms.perf
>> 
>> http://xrootd.org/doc/dev44/cms_config.htm#_Toc454223033
>> 
>> if you haven't specified it then only the file system loads are reported and
>> everything else is zero. We provide a sample script in the "utils" directory of
>> the source repo called XrdOlbMonPerf and it should work for any Linux system.
>> However, do verify as we normally don't update this script unless someone
>> complains and offers a solution.
>> 
>> Andy
>> 
>> -----Original Message----- From: Fischer, Max (SCC)
>> Sent: Tuesday, October 18, 2016 2:45 AM
>> To: [log in to unmask]
>> Subject: empty usage statistics (for cms.sched)
>> 
>> Hi all,
>> 
>> I'm investigating some broken load balancing in one of our sub-clusters.
>> From a set of similar servers using a shared file system, about 1-2 show
>> considerably higher load (750 vs 30).
>> My attempt was to change `cms.sched` to prefer machines with low load, but there
>> was no effect from changing this policy.
>> 
>> After switching on logging of `cms.ping` [1] statistics, it turns out that *all*
>> load statistics from servers are reported as 0:
>>    161018 11:32:09 25026 Node: f01-101-136-e.gridka.de load=0; cpu=0 net=0 inq=0 mem=0 pag=0 dsk=871483960 utl=49 shr=[100 24 0]
>> 
>> Do I have to adjust the configuration of server or manager to make this work?
>> The manager is running v4.3, the servers are running v4.3 and v4.4.
>> 
>> Cheers,
>> Max
>> 
>> [1] cms.ping docs
>> http://xrootd.org/doc/dev43/cms_config.htm#_Toc436250534
>> ########################################################################
>> Use REPLY-ALL to reply to list
>> 
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>
