Hi Julia,

On 7/21/12 2:25 AM, Julia Andreeva wrote:
> Hi Matevz,
> Thank you very much for the detailed answers.
>
> See my comments/answers inline
> ________________________________________
> From: [log in to unmask] [[log in to unmask]] on behalf of Matevz Tadel [[log in to unmask]]
> Sent: 20 July 2012 20:20
> To: Julia Andreeva
> Cc: Daniel Dieguez Arias; [log in to unmask]
> Subject: Re: Getting data from USCMS XRootD Federation Monitoring in machine-readable format
>
> Hi Julia,
>
> On 07/20/12 02:00, Julia Andreeva wrote:
>>
>>
>>       Hi Matevz,
>>
>> I asked Daniel to use the xrootd mailing list next time. I think I did not get
>> confirmation that my subscription was approved, let's see whether it works.
>>
>> I have a couple of questions:
>> 1)
>> What is shown on the Monalisa display:
>
> ML shows the summary monitoring information -- this is what xrootd servers
> report themselves about their status and activity.
>
> I also sent Daniel a link to the CHEP paper (being reviewed) that describes
> what we do:
> http://uaf-2.t2.ucsd.edu/~matevz/CHEP2012-S-12-00218.pdf
>
>> Only remote access?
>> Mixture of remote and local?
>
> This is a bit tricky :) as the summary monitoring information is completely
> oblivious about individual connections ... it just tells how much data any given
> server sent out ... and then we aggregate this by site to make dashboard plots.
> This is a complementary view to what is provided by detailed monitoring and
> file-access reports -- we see it more like a server/site performance/monitoring
> database we can use to drill down into in case of trouble. Also, our alarm
> system is based on that ... for almost a year now, the only thing that still
> happens every now and then is authentication problems (servers with bad certs,
> LDAP or GUMS issues). But since the new client knows how to reconnect in case of
> auth trouble, these errors do not show in failing jobs.
>
> We have a couple of sites that also use xrootd for internal access, University
> of Wisconsin being the major one (they have several clusters at the uni and they
> use the HEP cluster to serve the data). UW then configured their internal access to
> go via different redirector/servers (easy as they use hadoop) and so we can
> separate internal and external access also in the summary monitoring.
>
> Another site like this is MIT ... I'd say practically all of their traffic is
> internal but we don't / can't separate this in the summary monitoring.
>
> Now, a very special case is the Omaha cluster which connects to Nebraska servers
> via a private sub-net (not exactly sure how the network is configured). This access
> looks like it is local to UNL but in fact it travels about 100km north. So, I'd
> say, in this case the summary monitoring is more correct.
>
>
> -------
> OK, now I understand.
> -------
>> 2)
>> In order to decouple local access from the remote one, as a first approximation,
>> would it be enough to consider transfers which have the same domain for client
>> and server to be local , and all other cases to be remote?
>
> Yes, from detailed monitoring one can do that ... with the caveat for Omaha above.
>
> ---------
> Is there any other way to decouple Omaha access (certain pattern in the hostname for Omaha nodes for example?)
> ---------

At the moment, it is:
   client_host=128.111.134.10
   client_domain=unl.edu
As I said, I'm not exactly sure how the network is configured ... and I do find 
it a tad strange that all hosts report exactly the same ip address.
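
A minimal sketch of the same-domain rule from question 2, with the Omaha
caveat applied on top. The client_host and client_domain values are the ones
quoted above; the server_domain field name and the idea of treating that
single client_host as an Omaha marker are assumptions for illustration, not
something the detailed monitoring guarantees.

```python
from dataclasses import dataclass

# Value quoted in this mail. Using it as an "Omaha" marker is an assumption:
# it is not confirmed that only Omaha nodes report this address.
OMAHA_CLIENT_HOST = "128.111.134.10"


@dataclass(frozen=True)
class Transfer:
    client_host: str
    client_domain: str
    server_domain: str  # hypothetical field name for the serving site's domain


def classify(t: Transfer) -> str:
    """Return 'local' or 'remote' for one file-access record."""
    same_domain = t.client_domain.lower() == t.server_domain.lower()
    # Omaha looks local to unl.edu but the traffic leaves the site,
    # so count it as remote.
    if same_domain and t.client_host == OMAHA_CLIENT_HOST:
        return "remote"
    return "local" if same_domain else "remote"
```

If the shared address turns out to cover non-Omaha UNL nodes as well, this
override would misclassify them, so it should be checked against the actual
network setup first.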

>> 3)
>> According to my knowledge, Monalisa repository provides a way to subscribe to
>> any information which is in the repository, we were thinking to use this
>> possibility in order to retrieve data in the machine-readable format and to make
>> consistency checks with data we get from ActiveMQ (when file is closed) and
>> aggregate on our side. But for this purpose we really need to understand what is
>> shown as a throughput on the ML display.
>
> If you know how to do that, that makes the most sense, sure. We can also ask ML
> people for help.
>
> About consistency ... we produce daily reports from file-access reports (from
> OSG Gratia, same thing that also gets sent to CERN ActiveMQ) and the results are
> consistent with what is shown in ML. In fact, the inconsistency of these two
> led us to extend/improve xrootd detailed monitoring about a year back (the major
> thing was that vector-read requests were not reported in the detailed monitoring).
>
> Actually ... why do you want to repeat these cross-checks?
>
> --------
> The cross-checks are for our system. We need to make sure that what we show on the Dashboard UI is correct.
> Daniel has made good progress in integrating the xrootd transfers into the Global WLCG Transfer Dashboard.
> It is our usual practice, before making a system available to pilot users, to run consistency checks against some
> relevant information source. The only possibility we were thinking of is the UCSD ML repository.
> Maybe we can also use Gratia reports; is there any documentation on how we can get them?

I know about this page from Gratia:
 
http://rcf-gratia.unl.edu/gratia/xml/facility_transfer_volume?exclude-vo=NONE&protocol=xrootd
There is a "download as csv" option on this page. I can investigate how to 
extract other data (the schema is rather arcane). I'll send you an example of 
our (uscms) auto-generated daily mail report (it contains user information so 
it's not entirely public).
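
For the consistency check itself, a rough sketch of comparing per-site volumes
from two CSV exports (for example the Gratia download and a Dashboard dump).
The column names "site" and "bytes" and the 5% tolerance are placeholders, not
the real Gratia schema, and would need adjusting to whatever the exports
actually contain.

```python
import csv
import io


def site_totals(csv_text, site_col="site", bytes_col="bytes"):
    """Sum transferred bytes per site from CSV text (column names are placeholders)."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        site = row[site_col]
        totals[site] = totals.get(site, 0.0) + float(row[bytes_col])
    return totals


def mismatches(a, b, rel_tol=0.05):
    """Return sites whose totals differ by more than rel_tol (relative to the larger value)."""
    out = {}
    for site in sorted(set(a) | set(b)):
        x, y = a.get(site, 0.0), b.get(site, 0.0)
        ref = max(abs(x), abs(y))
        if ref > 0 and abs(x - y) / ref > rel_tol:
            out[site] = (x, y)
    return out
```

A site missing from one of the two exports is treated as zero volume there, so
it shows up as a mismatch rather than being silently skipped.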

A nice weekend to you, too ... well ... at least what's left of it :)

Cheers,
Matevz

>
> Have  a nice weekend
>
> Cheers
>
> Julia
>
>
> Cheers,
> Matevz
>
>>
>> Thank you
>>
>> Cheers
>>
>> Julia
>>
>> On Fri, 20 Jul 2012, daniel dieguez arias wrote:
>>
>>> Hi Matevz,
>>>
>>> Thanks for the information. I add Julia to the cc.
>>>
>>> Cheers,
>>> Daniel.
>>>
>>> On 07/20/2012 01:04 AM, Matevz Tadel wrote:
>>>> Hi Daniel,
>>>>
>>>> On 07/19/12 06:48, daniel dieguez arias wrote:
>>>>> Dear Matevz,
>>>>>
>>>>>
>>>>> I am Daniel Dieguez. Currently I am working on the WLCG Transfers Dashboard. In
>>>>> order to make consistency checks between WLCG and USCMS XRootD Federation
>>>>> Monitoring, it would be nice to get data from USCMS in machine-readable format.
>>>>
>>>> I suspect you get the data from UCSD monitoring via popularity, the thing
>>>> we're doing with Domenico, right?
>>>>
>>>> Are you aware of the differences between XRootD summary monitoring (what is
>>>> shown in MonALISA) and the XRootD detailed monitoring that sends out the
>>>> file-access reports at file close-time? I mean ... are you aware that you
>>>> will not get comparable results, as in MonALISA we separate internal transfers
>>>> within Univ. of Wisconsin, which can be more than 1 GByte/s? We are trying to
>>>> get other sites (MIT, most importantly) to do the same, but this isn't
>>>> entirely trivial to setup so T2 admins tend to avoid this topic.
>>>>
>>>>> Currently I am looking at the plots shown on
>>>>> http://xrootd.t2.ucsd.edu/dashboard/
>>>>>
>>>>> Could you please guide me on how I could get the output in csv, xml or json?
>>>>
>>>> Hmmh, this comes from a Postgres DB of MonALISA ... I could give you a DB
>>>> dump, I think it was ~10GB the last time I checked if backups are being made :)
>>>>
>>>> Cheers,
>>>> Matevz
>>>> .
>>>>
>>>
>>>
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the XROOTD-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>
