  Hi Fons,

On Wed, Feb 23, 2005 at 09:35:20AM +0100, Fons Rademakers wrote:
>    was there always this limit of 64 data servers per olbd? How many data 
> servers do you have at SLAC? Anyway, in cases like Jerome's it would be 
> very important to have an automatic configuration system based just on a 
> single list of data servers and olbd's.

  The limit has been there since the beginning. The BaBar sites typically
have 10-40 data servers, so we deferred dealing with it. (Note that the 
limit was introduced intentionally by Andy to avoid the many murkier 
issues that might come up with unlimited cell size.)

  We agree that the self-organizing, scalable system is the next thing we
would like (hence Andy has been working on it recently) as we (BaBar) would 
also like to do some tests with this "data mesh" concept where the lines 
between worker node and server are blurred. Andy's recent work to optimize 
the memory footprint will also help with this. 

  Bottom line: we all want it, so expect to hear more about this soon.

                                   Pete

> Peter Elmer wrote:
> >   Hi Jerome,
> > 
> > On Tue, Feb 22, 2005 at 09:11:26PM -0500, Jerome LAURET wrote:
> > 
> >>	So, we started xrootd and olbd on 190 nodes with one
> >>redirector (+olbd -m) and of course, bumped into the 64 connection
> >>limit for olbd. We now have questions:
> >>
> >>- Is there a way to circumvent the 64 connection limit via a
> >>  parameter of some kind we may have missed ??
> > 
> > 
> >   No, there is no parameter to do this. 
> > 
> > 
> >>- If a redirector + olbd manager can connect to only 64 nodes,
> >>  what would happen to the "rest" of the nodes in our configuration
> >>  i.e. the 190-64 other nodes ??
> > 
> > 
> >   IIRC, they don't succeed in subscribing to the redirector (with messages
> > in all log files) and thus don't succeed in serving data.
> > 
> > 
> >>  * Does this imply that the redirector cannot do load balancing
> >>    with those nodes ??
> > 
> > 
> >   With the existing olbd structure, you would have to create a hierarchy
> > of redirectors, each with a group of 64 data servers. (But see below.)
> > 
> > 
> >>  * Does it mean that a single redirector cannot handle more
> >>    than 64 nodes and their files ??
> > 
> > 
> > >   At the moment you can build a hierarchy where each group of 64 has
> > > a redirector and then those subscribe to a meta-redirector. What Andy
> > > has been adding lately is something that saves you from having to specify
> > > all of that configuration in a hardcoded way, by allowing the machines
> > > to self-configure. (As I understand it you basically just configure all of the
> > > data servers _and_ at least 1 redirector per 64 data servers to connect
> > > to the main redirector and they self-configure into a hierarchical system.)
> > > Andy should comment on when he thinks a test version of this might be
> > > ready... 
> > 
> > 
> >>  * If I use DNS RR, what happens then ?? Does every redirector
> >>    get a set of 64 nodes ?? Do the redirectors exchange information
> >>    about each other's known files (and node/load selection) ??
> > 
> > 
> > >   At SLAC there are two redirectors which are DNS aliased to the same
> > > address, but this is done simply for redundancy, not to create this
> > > type of hierarchy.
> > 
> > >   Probably the best bet is for you to use the self-organizing servers
> > > feature which Andy has been building. (I also wanted to do some testing
> > > of that.) I didn't realize that you had so many servers. Or are these
> > > the worker nodes you are trying to allow to serve data from their local
> > > disks?
> > 
> >   (I was travelling and thus still need to catch up on various postings to
> > this list from the last 4-5 days. Expect additional response over the next 
> > day or so as I read all of that!)
> > 
> >                                    Pete
> > 
> > 
> >>	In case this is needed, our configuration is as described
> >>by a previous Email by Pavel in
> >>http://www.slac.stanford.edu/cgi-bin/lwgate/XROOTD-L/archives/xrootd-l.200502/date/article-55.html
> >>
> >>
> >>
> >>-- 
> >>             ,,,,,
> >>            ( o o )
> >>         --m---U---m--
> >>             Jerome
> >>
> > 
> > 
> > 
> > 
> > -------------------------------------------------------------------------
> > Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
> > Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
> > -------------------------------------------------------------------------
> 
> -- 
> Org:    CERN, European Laboratory for Particle Physics.
> Mail:   1211 Geneve 23, Switzerland
> E-Mail: [log in to unmask]              Phone: +41 22 7679248
> WWW:    http://www.rademakers.org/fons/      Fax:   +41 22 7679480



-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------
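
For reference, the manual hierarchy described in the thread (groups of at most 64 data servers, each behind its own redirector, all subscribing to a meta-redirector) comes down to simple partitioning arithmetic. The sketch below is only an illustration: the function name `plan_hierarchy` and the `nodeNNN` host names are hypothetical and not part of xrootd or olbd, and it assumes the 64-subscriber limit applies uniformly to every olbd. It generates no actual configuration.

```python
CELL_LIMIT = 64  # per-olbd subscriber limit discussed in the thread


def plan_hierarchy(servers, cell_limit=CELL_LIMIT):
    """Split a list of data servers into groups of at most `cell_limit`.

    Each group would sit behind one sub-redirector; the sub-redirectors
    would in turn subscribe to a single meta-redirector. (Hypothetical
    helper for illustration; it only does the partitioning.)
    """
    if cell_limit < 1:
        raise ValueError("cell_limit must be at least 1")
    return [servers[i:i + cell_limit]
            for i in range(0, len(servers), cell_limit)]


# Jerome's case: 190 nodes (host names are made up for the example).
nodes = [f"node{i:03d}" for i in range(190)]
groups = plan_hierarchy(nodes)

# Number of sub-redirectors needed, and how many servers each one gets.
print(len(groups), [len(g) for g in groups])
```

With 190 nodes this gives three groups (64, 64 and 62 servers), so three sub-redirectors under one meta-redirector would stay well within the limit at both levels.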