Hi Matevz,
Thank you very much for the detailed answers.

See my comments/answers inline
________________________________________
From: [log in to unmask] [[log in to unmask]] on behalf of Matevz Tadel [[log in to unmask]]
Sent: 20 July 2012 20:20
To: Julia Andreeva
Cc: Daniel Dieguez Arias; [log in to unmask]
Subject: Re: Getting data from USCMS XRootD Federation Monitoring in machine-readable format

Hi Julia,

On 07/20/12 02:00, Julia Andreeva wrote:
>
>
>      Hi Matevz,
>
> I asked Daniel to use the xrootd mailing list next time. I think I did not get
> confirmation that my subscription was approved, let's see whether it works.
>
> I have a couple of questions:
> 1)
> What is shown on the MonALISA display:

ML shows the summary monitoring information -- this is what xrootd servers
report themselves about their status and activity.

I also sent Daniel a link to the CHEP paper (currently under review) that describes
what we do:
http://uaf-2.t2.ucsd.edu/~matevz/CHEP2012-S-12-00218.pdf

> Only remote access?
> Mixture of remote and local?

This is a bit tricky :) as the summary monitoring information is completely
oblivious to individual connections ... it just tells how much data any given
server sent out ... and then we aggregate this by site to make dashboard plots.
This is a complementary view to what is provided by detailed monitoring and
file-access reports -- we see it more like a server/site performance/monitoring
database we can drill down into in case of trouble. Also, our alarm
system is based on it ... for almost a year now, the only thing that still
crops up every now and then is authentication problems (servers with bad certs,
LDAP or GUMS issues). But since the new client knows how to reconnect in case of
auth trouble, these errors do not show up as failing jobs.
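
For illustration, here is roughly what that site aggregation amounts to (a minimal
Python sketch; the record fields and the server-to-site mapping are made up for the
example, not the actual summary-record schema):

    # Roll per-server "bytes sent" summary reports up into per-site rates,
    # the kind of aggregation behind the dashboard plots.
    from collections import defaultdict

    # Hypothetical mapping from server hostname to site name.
    SERVER_TO_SITE = {
        "xrootd-1.hep.wisc.edu": "T2_US_Wisconsin",
        "red-xrootd.unl.edu": "T2_US_Nebraska",
    }

    def rates_by_site(summary_records):
        """summary_records: iterable of dicts with assumed keys
        'host', 'bytes_out', 'interval_s' (one entry per server per interval)."""
        rates = defaultdict(float)  # site -> MB/s over the reporting interval
        for rec in summary_records:
            site = SERVER_TO_SITE.get(rec["host"], "unknown")
            rates[site] += rec["bytes_out"] / rec["interval_s"] / 1e6
        return dict(rates)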

We have a couple of sites that also use xrootd for internal access, University
of Wisconsin being the major one (they have several clusters at the university and
use the HEP cluster to serve the data). UW then configured their internal access to
go via a separate redirector and servers (easy since they use Hadoop), and so we can
separate internal from external access in the summary monitoring as well.

Another site like this is MIT ... I'd say practically all of their traffic is
internal but we don't / can't separate this in the summary monitoring.

Now, a very special case is the Omaha cluster, which connects to Nebraska servers
via a private sub-net (I'm not exactly sure how the network is configured). This access
looks like it is local to UNL but in fact it travels about 100 km north. So, I'd
say, in this case the summary monitoring is more correct.


-------
OK, now I understand.  
-------
> 2)
> In order to decouple local access from remote access, as a first approximation,
> would it be enough to consider transfers which have the same domain for client
> and server to be local, and all other cases to be remote?

Yes, from detailed monitoring one can do that ... with the caveat for Omaha above.
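
For concreteness, the first-approximation rule could look something like this (a rough
Python sketch; the domain comparison is naive and the Omaha check is just a placeholder,
since I don't know of a hostname pattern that identifies those nodes):

    def domain(host):
        # Naive: keep the last two labels, e.g. "node01.hep.wisc.edu" -> "wisc.edu".
        return ".".join(host.lower().split(".")[-2:])

    def looks_like_omaha(host):
        # Placeholder predicate -- the real naming convention or subnet of the
        # Omaha cluster would be needed here.
        return False

    def classify(client_host, server_host):
        """Return 'local' or 'remote' for one file-access record."""
        # Special case: Omaha worker nodes read from UNL servers over a private
        # link, so a matching domain does not actually mean local access there.
        if looks_like_omaha(client_host):
            return "remote"
        return "local" if domain(client_host) == domain(server_host) else "remote"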

---------
Is there any other way to decouple the Omaha access (a certain pattern in the hostnames of the Omaha nodes, for example)?
---------
> 3)
> According to my knowledge, the MonALISA repository provides a way to subscribe to
> any information which is in the repository. We were thinking of using this
> possibility in order to retrieve data in machine-readable format and to make
> consistency checks with the data we get from ActiveMQ (when a file is closed) and
> aggregate on our side. But for this purpose we really need to understand what is
> shown as throughput on the ML display.

If you know how to do that, that makes the most sense, sure. We can also ask the ML
people for help.

About consistency ... we produce daily reports from the file-access reports (from
OSG Gratia, the same thing that also gets sent to CERN ActiveMQ) and the results are
consistent with what is shown in ML. In fact, the inconsistency of these two
led us to extend/improve the xrootd detailed monitoring about a year back (the major
thing was that vector-read requests were not reported in the detailed monitoring).
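
Roughly, that daily cross-check boils down to something like this (a Python sketch with
made-up field names; the real Gratia / ActiveMQ file-close records have their own schema):

    from collections import defaultdict

    def daily_totals(file_close_records):
        """Sum bytes read per (day, site) from file-close reports (assumed fields)."""
        totals = defaultdict(float)
        for rec in file_close_records:
            key = (rec["close_date"], rec["server_site"])
            # Include vector reads -- this was the part that used to be missing
            # from the detailed monitoring.
            totals[key] += rec["read_bytes"] + rec["readv_bytes"]
        return totals

    def report_mismatches(detailed, summary, tolerance=0.10):
        """Print (day, site) entries where the two totals differ by more than
        the given fraction."""
        for key, d in sorted(detailed.items()):
            s = summary.get(key, 0.0)
            if s and abs(d - s) / s > tolerance:
                print(key, "detailed:", d, "summary (ML):", s)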

Actually ... why do you want to repeat these cross-checks?

--------
The cross-checks are for our system. We need to make sure that what we show on the Dashboard UI is correct.
Daniel made good progress in integrating the xrootd transfers into the Global WLCG Transfer Dashboard.
It is our usual practice, before making a system available to pilot users, to run consistency checks against some
relevant information source. The only possibility we were thinking of is the UCSD ML repository.
Maybe we can also use the Gratia reports, is there any documentation on how we can get them?


Have a nice weekend

Cheers

Julia


Cheers,
Matevz

>
> Thank you
>
> Cheers
>
> Julia
>
> On Fri, 20 Jul 2012, daniel dieguez arias wrote:
>
>> Hi Matevz,
>>
>> Thanks for the information. I add Julia to the cc.
>>
>> Cheers,
>> Daniel.
>>
>> On 07/20/2012 01:04 AM, Matevz Tadel wrote:
>>> Hi Daniel,
>>>
>>> On 07/19/12 06:48, daniel dieguez arias wrote:
>>>> Dear Matevz,
>>>>
>>>>
>>>> I am Daniel Dieguez. Currently I am working on the WLCG Transfers Dashboard. In
>>>> order to make consistency checks between WLCG and USCMS XRootD Federation
>>>> Monitoring, it would be nice to get data from USCMS in machine-readable format.
>>>
>>> I suspect you get the data from UCSD monitoring via popularity, the thing
>>> we're doing with Domenico, right?
>>>
>>> Are you aware of the differences between XRootD summary monitoring (what is
>>> shown in MonALISA) and the XRootD detailed monitoring that sends out the
>>> file-access reports at file close-time? I mean ... are you aware that you
>>> will not get comparable results, since in MonALISA we separate internal transfers
>>> within Univ. of Wisconsin, which can be more than 1 GByte/s. We are trying to
>>> get other sites (MIT, most importantly) to do the same, but this isn't
>>> entirely trivial to set up, so T2 admins tend to avoid this topic.
>>>
>>>> Currently I am looking at the plots shown on
>>>> http://xrootd.t2.ucsd.edu/dashboard/
>>>>
>>>> Could you please guide me on how I could get the output in CSV, XML, or JSON?
>>>
>>> Hmmh, this comes from a Postgres DB of MonALISA ... I could give you a DB
>>> dump, it was ~10 GB the last time I checked whether backups were being made :)
>>>
>>> Cheers,
>>> Matevz
>>>
>>
>>

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
