
> Also, how many worker nodes are we talking about?
Er, well, due to historical choices, about 600 ATM. :/  (We 
hyperconverged before it was a buzzword. :) )

> Now, it would be good if you send me your config files (i.e. the ones
> you use on the local workers and the redirector). I wouldn't suggest
> you post them unless they are fully public. 

Is your email vaguely like this: [log in to unmask] ?
I don't think there is anything secret in our config, though it may well be clueless.

Chad.


> 
> Andy
> 
> 
> On Mon, 16 Nov 2020, cwseys wrote:
> 
>  > Wow! I would never have guessed xrootd needed such optimizations!
>  > Happily, it looks like EL6 (CentOS 6) won't be getting updates anymore
>  > after Nov 30th. I suppose you all have got a lot of changes planned for
 >  > its death-day party. :)
>  >
>  > Thanks for all your expertise,
>  > Chad.
>  >
>  > On 11/14/20 6:00 AM, xrootd-dev wrote:
>  >> Hi Chad,
>  >>
 >  >> No, 64 is the hard limit for any server. You may wonder why. We made
 >  >> sure that only constant-time algorithms were used for any lookup
 >  >> processing. That effectively put a limit on how many nodes could be
 >  >> connected to a server, so that the constant-time overhead would not
 >  >> become a dominant factor as nodes were added. So, the solution is the
 >  >> tried and true divide-and-conquer technique. By arranging lookups into
 >  >> a B-tree we could make sure that (where k is the fixed overhead and n
 >  >> is the number of nodes) the k*log64(n) processing time stays far below
 >  >> k*n as n increases. This means that as you add more nodes, performance
 >  >> doesn't deteriorate.
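 >  >>
 >  >> To make that concrete, here is a back-of-the-envelope sketch (just the
 >  >> arithmetic, not the actual cmsd code) comparing a flat scan of n nodes
 >  >> with the depth of a 64-ary tree:
 >  >>
 >  >>    #include <cmath>
 >  >>    #include <cstdio>
 >  >>
 >  >>    int main() {
 >  >>        const double k = 1.0;  // fixed per-step overhead, arbitrary units
 >  >>        for (long n : {64L, 600L, 4096L, 262144L}) {
 >  >>            double flat = k * n;                                        // scan every node
 >  >>            double tree = k * std::ceil(std::log(n) / std::log(64.0));  // 64-ary tree depth
 >  >>            std::printf("n=%7ld  k*n=%9.0f  k*log64(n)=%2.0f\n", n, flat, tree);
 >  >>        }
 >  >>        return 0;
 >  >>    }
 >  >>
 >  >> For roughly 600 workers that depth works out to 2, i.e. one level of
 >  >> supervisors under the redirector.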
>  >>
 >  >> But why 64? Wouldn't larger log bases work better? Good question. It
 >  >> all comes down to the hardware. A 64-bit vector is the most efficient
 >  >> size for today's computers. Since most of the lookup processing involves
 >  >> vector bit slicing, a 64-bit vector is the most efficient way of doing
 >  >> bit slicing on a CPU today. Yes, GPUs give you much larger vector sizes,
 >  >> and some of our collaborators have suggested going down the GPU path.
 >  >> Unfortunately, the GPU environment is unpredictable, so we decided not
 >  >> to go down that path.
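 >  >>
 >  >> As a toy illustration of what that bit slicing looks like (made-up
 >  >> names, not XRootD's actual data structures): with one bit per connected
 >  >> node, a whole "which nodes have this file" answer fits in a single
 >  >> machine word and can be filtered with one AND:
 >  >>
 >  >>    #include <cstdint>
 >  >>    #include <cstdio>
 >  >>
 >  >>    int main() {
 >  >>        std::uint64_t haveFile = 0;
 >  >>        haveFile |= 1ULL << 5;     // node 5 reports the file
 >  >>        haveFile |= 1ULL << 42;    // node 42 reports the file
 >  >>
 >  >>        std::uint64_t online   = ~0ULL & ~(1ULL << 42);  // pretend node 42 dropped out
 >  >>        std::uint64_t eligible = haveFile & online;      // one AND selects candidates
 >  >>
 >  >>        while (eligible) {
 >  >>            int node = __builtin_ctzll(eligible);  // lowest set bit (GCC/Clang builtin)
 >  >>            std::printf("candidate node %d\n", node);
 >  >>            eligible &= eligible - 1;              // clear that bit
 >  >>        }
 >  >>        return 0;
 >  >>    }
 >  >>
 >  >> Each of those mask operations is a single instruction on a 64-bit word,
 >  >> which is where the 64 comes from.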
>  >>
 >  >> Of course, by arranging lookups in a distributed 64-ary B-tree we
 >  >> sacrificed specificity. It was a reasonable approach at the time to make
 >  >> lookups as fast as possible. Today's requirements are in conflict with
 >  >> that decision. You are not the only one who suffers from the lack of
 >  >> specificity.
>  >>
 >  >> All of that said, we are looking at ways to address this conflict. The
 >  >> outlook is positive. We were very much constrained by having to support
 >  >> RH6 platforms with very primitive atomics support. Once we can drop
 >  >> that support, we will be able to reduce the "k", which in turn lets us
 >  >> increase the size of the bit vector and still maintain good performance.
 >  >> That may not be sufficient for your site, but we are looking at all the
 >  >> alternatives.
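 >  >>
 >  >> As a rough sketch of what better atomics could buy (hypothetical, not a
 >  >> committed design): a wider node set kept as an array of 64-bit words,
 >  >> each updated lock-free, so growing the vector does not drag a lock into
 >  >> every update:
 >  >>
 >  >>    #include <atomic>
 >  >>    #include <cstdint>
 >  >>    #include <cstddef>
 >  >>
 >  >>    struct WideNodeSet {                       // illustrative only
 >  >>        static const std::size_t kWords = 4;   // 4 * 64 = 256 node slots
 >  >>        std::atomic<std::uint64_t> word[kWords];
 >  >>
 >  >>        WideNodeSet() { for (auto &w : word) w.store(0); }
 >  >>
 >  >>        void add(unsigned node) {              // lock-free set of one bit
 >  >>            word[node / 64].fetch_or(1ULL << (node % 64));
 >  >>        }
 >  >>        bool has(unsigned node) const {
 >  >>            return (word[node / 64].load() >> (node % 64)) & 1ULL;
 >  >>        }
 >  >>    };
 >  >>
 >  >>    int main() { WideNodeSet s; s.add(100); return s.has(100) ? 0 : 1; }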
>  >>
 >  >> I know I gave you far more information than you wanted. However, it's
 >  >> important to understand that constraints are not chosen at random or
 >  >> for convenience. We do try to optimize the server for the common use
 >  >> cases. But we also recognize those will change and we will have to
 >  >> readjust our optimization targets. That's the process we are in today,
 >  >> so bear with us.
>  >>
>  >> Andy
>  >>
>  >>
>  >>
>  >>
>  >> On Fri, 13 Nov 2020, cwseys wrote:
>  >>
 >  >> > Hi Andrew,
 >  >> > Thanks for taking a look. As you guessed, we do have supervisors.
 >  >> > Are supervisors an absolute requirement for more than 64 nodes, or
 >  >> > can a fast enough redirector handle more?
 >  >> > Also, I don't know enough about xrootd to answer this question: Are
 >  >> > requests divided amongst the redirector and the supervisors? If so,
 >  >> > and the redirector had this new heuristic, some of the file requests
 >  >> > would be served same-node. Something is better than nothing!
>  >> >
>  >> > Thanks again!
>  >> > Chad.
>  >> >
>  >> >
>  >>
>  >>
>  >
>  >
>  >
> 
> 


-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1306#issuecomment-729050818
########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1