Hi Matevz,

Ah, I see you are as confused as Wilko (and perhaps me :-). We just spent 
some time understanding what is going on for you (yes, my explanation made 
some assumptions).

a) I assumed you first talked to a meta-redirector that asked several hadoop 
sites whether they had the file. You then got redirected to some xrootd and 
opened the file there. Indeed the "redirect" setting in this case is 
immaterial as the lookup has already been done.

b) If (a) was not true, and you instead went directly to a particular Hadoop 
cluster and opened the file on its local redirector, then the "redirect" 
setting makes a big difference. In this case, immed means no lookup is done 
and you get sent to some server, which will honor or fail your request.

Andy

-----Original Message----- 
From: Matevz Tadel
Sent: Tuesday, February 04, 2014 3:53 PM
To: Andrew Hanushevsky ; xrootd-dev
Subject: Re: cms.dfs question

Hi Andy,

Thanks for the explanation! To make sure I understand:

1. When redirector does not know if a file exists, it still has to perform
the lookup, as configured.

2. When using "lookup central", we are actually "measuring" the limit of
hdfs lookup on a single node (the redirector).

Now I understand that redirector has to perform the lookup when it doesn't
know if a file exists ... otherwise it can not report to meta-manager(s).
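
To put rough numbers on point 2, here is a back-of-the-envelope sketch in
Python. It is only a toy model under an assumption that is not established
anywhere in this thread: that a node handles lookups strictly one at a time,
so its rate is capped at 1 / (cost per lookup). The function names are made
up for illustration, and the only inputs are the ~20Hz and 300Hz figures
from the original test report.

```python
import math


def implied_lookup_cost_s(saturation_hz: float) -> float:
    """Per-lookup cost implied by a saturation rate.

    Assumes (hypothetically) that lookups are fully serialized on one node.
    """
    if saturation_hz <= 0:
        raise ValueError("saturation_hz must be positive")
    return 1.0 / saturation_hz


def min_nodes_for(target_hz: float, single_node_hz: float) -> int:
    """Smallest number of equally fast nodes needed to reach target_hz.

    Assumes (hypothetically) that lookups spread evenly over the nodes.
    """
    if target_hz <= 0 or single_node_hz <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_hz / single_node_hz)


# Figures reported in this thread: ~20Hz with "lookup central",
# beyond 300Hz with "lookup distrib".
central_cost_s = implied_lookup_cost_s(20.0)
nodes_for_300hz = min_nodes_for(300.0, 20.0)

print(f"implied cost per central lookup: {central_cost_s * 1000:.0f} ms")
print(f"nodes needed for 300Hz at that cost: {nodes_for_300hz}")
```

Under that assumption, ~20Hz corresponds to roughly 50 ms per hdfs lookup on
the redirector, and reaching 300Hz would take the equivalent of about 15
such nodes working in parallel. Real servers overlap requests, so treat this
as an order-of-magnitude illustration only.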

Would it make sense to have the equivalent of "lookup none" for open
requests? The client can then deal directly with a data server. It's true
that the redirector does not "learn" anything useful in this case, so it can
lead to more trouble down the road, especially with misbehaving
users/clients.

Matevz

On 2/4/14 3:20 PM, Andrew Hanushevsky wrote:
> Hi Matevz,
>
> You are getting caught in the "lookup" phase. Distributed lookup will
> always scale better than central lookup, when a lookup *has* to be
> performed. The redirect part is what to do when a lookup can be avoided
> because the information is already cached. Immed is always the best option
> if you have a true distributed file system underneath.
>
> Anyway, I can't say that I have convinced people that distributed normally
> has better scaling, and I have tried.  Unfortunately, the majority still
> seems to gravitate to centralized vertical design options because they are
> more comforting.
>
> Andy
>
> -----Original Message----- From: Matevz Tadel
> Sent: Tuesday, February 04, 2014 2:46 PM
> To: xrootd-dev
> Subject: cms.dfs question
>
> Hi,
>
> We (AAA) are doing redirection rate scaling tests and noticed a large
> difference between *hadoop* sites based on how cms.dfs is set up.
>
> This works great (scaling beyond 300Hz):
>    cms.dfs lookup distrib redirect immed
> and this saturates at ~20Hz:
>    cms.dfs lookup central redirect immed
>
> I'm puzzled, because I'd expect that "redirect immed" trumps whatever
> lookup setting one might choose. We were lucky -- we had two sites that
> chose different values for lookup :)
>
> Matevz
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the XROOTD-DEV list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1 
