Hi Brian,

On 2/4/14 6:54 PM, Brian Bockelman wrote:
> Hi Matevz,
>
> Not sure if I'm so convinced all is well.

What I meant was that the configuration change and the related change in
performance make (some) sense ... I agree we should understand what breaks in
the "lookup central" case for hdfs.

Matevz

> If I use the HDFS client (via FUSE) in a single thread to query the NN as fast
> as possible, I get a query rate of about 4.7 kHz. Why is the Xrootd
> single-client performance 2 orders of magnitude slower than FUSE?
>
> Brian
>
> On Feb 4, 2014, at 8:44 PM, Matevz Tadel <[log in to unmask]> wrote:
>
>> Hi Andy, everybody,
>>
>> The scaling issues indeed only showed when doing open requests through the
>> meta-manager ... so all is OK. Right? :)
>>
>> Sorry for all this noise ... I'll pay a round at the Federated Storage
>> workshop :)
>>
>> Matevz
>>
>> On 2/4/14 5:43 PM, Matevz Tadel wrote:
>>> Hi Andy,
>>>
>>> Before we go too far ... let me check with the guy who ran the tests if he
>>> indeed went through site redirectors directly and not through the meta manager.
>>>
>>> With meta manager in the game --- the results would make sense, right?
>>>
>>> Matevz
>>>
>>> On 2/4/14 5:29 PM, Matevz Tadel wrote:
>>>> Hi Andy,
>>>>
>>>> All versions are for the local redirectors.
>>>>
>>>> UCSD (3.3.3) had "lookup distrib redirect immed" and performs well (linear
>>>> scaling up to 300 Hz).
>>>>
>>>> Wisconsin (3.3.3), Nebraska (3.3.1, I think) and Purdue (3.3.2) had "lookup
>>>> central redirect immed" and performed poorly (clogged up at 10-20 Hz). Wisconsin
>>>> and Nebraska already made the change to UCSD settings and are now scaling ok,
>>>> too.
>>>>
>>>> Matevz
>>>>
>>>> On 2/4/14 4:58 PM, Andrew Hanushevsky wrote:
>>>>> Hi Matevz,
>>>>>
>>>>> OK, then I am actually confused as well. So, which site has which option and
>>>>> what release is each of them running?
>>>>>
>>>>> Andy
>>>>>
>>>>> -----Original Message-----
>>>>> From: Matevz Tadel
>>>>> Sent: Tuesday, February 04, 2014 4:49 PM
>>>>> To: Andrew Hanushevsky ; xrootd-dev
>>>>> Subject: Re: cms.dfs question
>>>>>
>>>>> Hi Andy,
>>>>>
>>>>> b) was actually the case, no meta-managers involved, just open request on local
>>>>> manager, for a file we know (believe) is available on the site. And both sites
>>>>> had redirect immed. It was lookup distrib vs. central that made the difference
>>>>> which I did not expect (I also thought that redirect immed is the only thing
>>>>> that matters).
>>>>>
>>>>> Matevz
>>>>>
>>>>> On 2/4/14 4:10 PM, Andrew Hanushevsky wrote:
>>>>>> Hi Matevz,
>>>>>>
>>>>>> Ah, I see you are as confused as Wilko (and perhaps me :-). We just spent some
>>>>>> time understanding what is going on for you (yes, my explanation made some
>>>>>> assumptions).
>>>>>>
>>>>>> a) I assumed you first talked to a meta-redirector that asked several hadoop
>>>>>> sites whether they had the file. You then got redirected to some xrootd and
>>>>>> opened the file there. Indeed the "redirect" setting in this case is immaterial
>>>>>> as the lookup has already been done.
>>>>>>
>>>>>> b) If (a) was not true and instead you went directly to a particular Hadoop
>>>>>> cluster and opened the file on its local redirector, then the "redirect" makes
>>>>>> a big difference. In this case, immed means no lookup is done and you get sent
>>>>>> to some server which will honor or fail your request.
>>>>>>
>>>>>> Andy
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Matevz Tadel
>>>>>> Sent: Tuesday, February 04, 2014 3:53 PM
>>>>>> To: Andrew Hanushevsky ; xrootd-dev
>>>>>> Subject: Re: cms.dfs question
>>>>>>
>>>>>> Hi Andy,
>>>>>>
>>>>>> Thanks for the explanation! To make sure I understand:
>>>>>>
>>>>>> 1. When redirector does not know if a file exists, it still has to perform the
>>>>>> lookup, as configured.
>>>>>>
>>>>>> 2. When using "lookup central", we are actually "measuring" the limit of hdfs
>>>>>> lookup on a single node (the redirector).
>>>>>>
>>>>>> Now I understand that redirector has to perform the lookup when it doesn't know
>>>>>> if a file exists ... otherwise it cannot report to meta-manager(s).
>>>>>>
>>>>>> Would it make sense to have the equivalent of "lookup none" for open requests?
>>>>>> The client can then deal directly with a data server. It's true that the
>>>>>> redirector does not "learn" anything useful in this case so it can lead to more
>>>>>> trouble down the road, especially with mis-behaving users/clients.
>>>>>>
>>>>>> Matevz
>>>>>>
>>>>>> On 2/4/14 3:20 PM, Andrew Hanushevsky wrote:
>>>>>>> Hi Matevz,
>>>>>>>
>>>>>>> You are getting caught in the "lookup" phase. Distributed lookup will always
>>>>>>> scale better than central lookup, when a lookup *has* to be performed. The
>>>>>>> redirect part is what to do when a lookup can be avoided because the
>>>>>>> information is already cached. Immed is always the best option if you have a
>>>>>>> true distributed file system underneath.
>>>>>>>
>>>>>>> Anyway, I can't say that I have convinced people that distributed normally has
>>>>>>> better scaling, and I have tried. Unfortunately, the majority still seems to
>>>>>>> gravitate to centralized vertical design options because they are more
>>>>>>> comforting.
>>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Matevz Tadel
>>>>>>> Sent: Tuesday, February 04, 2014 2:46 PM
>>>>>>> To: xrootd-dev
>>>>>>> Subject: cms.dfs question
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We (AAA) are doing redirection rate scaling tests and noticed a large
>>>>>>> difference between *hadoop* sites based on how cms.dfs is set up.
>>>>>>>
>>>>>>> This works great (scaling beyond 300 Hz):
>>>>>>>   cms.dfs lookup distrib redirect immed
>>>>>>> and this saturates at ~20 Hz:
>>>>>>>   cms.dfs lookup central redirect immed
>>>>>>>
>>>>>>> I'm puzzled, because I'd expect that "redirect immed" trumps whatever lookup
>>>>>>> setting one might choose. We were lucky -- we had two sites that chose
>>>>>>> different values for lookup :)
>>>>>>>
>>>>>>> Matevz
>>>>>>>
>>>>>>> ########################################################################
>>>>>>> Use REPLY-ALL to reply to list
>>>>>>>
>>>>>>> To unsubscribe from the XROOTD-DEV list, click the following link:
>>>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
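
For a sense of scale, the rates quoted in this thread can be compared directly. The sketch below is only arithmetic on those figures (about 4.7 kHz for the single-threaded HDFS client via FUSE, 10-20 Hz where the "lookup central" sites clogged up, 300 Hz reached with "lookup distrib"). The variable and function names are illustrative, and it assumes the quoted rates measure comparable operations, which the thread does not establish.

```python
import math

# Rates as quoted in the thread, in Hz.
fuse_hz = 4700.0            # single-threaded HDFS client via FUSE
central_hz = (10.0, 20.0)   # range where "lookup central" sites clogged up
distrib_hz = 300.0          # rate reached with "lookup distrib"


def orders_of_magnitude(fast_hz, slow_hz):
    """Return log10 of the ratio between two rates."""
    return math.log10(fast_hz / slow_hz)


# Gap between the FUSE client and the "lookup central" saturation range.
central_gap = [orders_of_magnitude(fuse_hz, r) for r in central_hz]

# Factor gained by switching from "lookup central" to "lookup distrib".
distrib_gain = [distrib_hz / r for r in central_hz]

print("FUSE vs. central, orders of magnitude:",
      [round(g, 2) for g in central_gap])
print("distrib vs. central, factor:", distrib_gain)
```

On these figures the FUSE client is between two and three orders of magnitude faster than the saturated "lookup central" redirectors, and "lookup distrib" reaches at least 15 to 30 times their rate.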