XROOTD-L@LISTSERV.SLAC.STANFORD.EDU


Subject: Re: xrootd meeting - Tuesday 22 March, 2005
From: Andrew Hanushevsky
Date: Mon, 21 Mar 2005 23:32:32 -0800 (PST)


Hi all,

In preparation for the meeting, I'd like to mention that we have had a
successful clustering of 175 servers. We will be trying for 280 servers
today as well as setting up some stressful scenarios. Below is the first
draft of the documentation that will be added to the olb configuration
manual.

Andy

Configuring more than 64 data servers

1) Choose the port numbers you wish to use for xrootd and olbd.
Typically, for xrootd port 1094 is used. For olbd port 1213 is used.

2) Choose the number of manager nodes you wish to run.
You must configure at least one manager node. The manager is the first
point of contact for a client and is also the cluster leader. A manager
should run on a dedicated machine of modest power (e.g., 512MB RAM, 800MHz
clock speed, 100Mb ethernet).

A manager node consists of
a) an xrootd started with the -r option or configured with the
"ofs.redirect remote" directive.
b) an olbd started with the -m option.
You may configure more than one manager and either run them in fail-over
mode (the default) or in load balancing mode where each manager shares
part of the client load (see the ??? directive). Each manager xrootd-olbd
pair must run on a separate machine.

A typical manager configuration file useful for an xrootd and olbd would
look like:


3) Choose the number of supervisor nodes you need.
A supervisor node acts as a local manager for a group of 64 nodes. These
nodes may be data servers or supervisors. A supervisor node consists of
a) an xrootd started with the -r and -t options or configured with the
"ofs.redirect remote" and "ofs.redirect target" directives. Additionally,
the xrootd should be started with the -a option.
b) an olbd started with the -m and -s options.

You only need to configure supervisor nodes if you will be running more
than 64 data servers. The number of supervisor nodes is based on the
number of available manager plus supervisor slots. A recursive formula
must be used to calculate the minimum number. Since there is nothing wrong
in starting a few more supervisors than actually are needed, a simplified
formula can be used.

Conservatively, you will need one supervisor node for each group of 64
data servers. For instance, if you plan to run 500 data servers you will
need 500/64 supervisors, rounded up (i.e., 8). The computed number
must then be multiplied by the number of managers you will run (e.g., 16
supervisors if you are running two managers). The reason is that
supervisors establish the same number of different paths to data servers
as there are possible client paths to managers.
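
The simplified rule above can be written out as a short calculation. This is
only a sketch in Python; the function name and its parameters are
illustrative and are not part of xrootd or olbd. It computes the
conservative count, not the minimum given by the recursive formula.

```python
import math

GROUP_SIZE = 64  # nodes handled by one manager or supervisor, per the text above


def conservative_supervisor_count(data_servers: int, managers: int = 1) -> int:
    """Return the conservative (not minimal) number of supervisors to start."""
    if data_servers < 0 or managers < 1:
        raise ValueError("need data_servers >= 0 and managers >= 1")
    if data_servers <= GROUP_SIZE:
        # Supervisors are only needed for more than 64 data servers.
        return 0
    # One supervisor per group of 64 data servers, rounded up ...
    per_manager = math.ceil(data_servers / GROUP_SIZE)
    # ... replicated once for every manager.
    return per_manager * managers


print(conservative_supervisor_count(500))              # 8
print(conservative_supervisor_count(500, managers=2))  # 16
```

The two printed values match the worked example in the text: 8 supervisors
for 500 data servers with one manager, 16 with two managers.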

Each supervisor node can run on a data server node. If you wish to share
resources in this way, choose data server nodes that will be as lightly
loaded as possible. The performance requirements for a supervisor node are
the same as for a manager node.

4) Configure the data server nodes.
A data server node delivers actual data to clients. It consists of
a) an xrootd started with the -t option or configured with the
"ofs.redirect target" directive. Additionally, the xrootd should be
started with the -a option.
b) an olbd started with the -s option.

Configure as many data server nodes as you need. Keep in mind that at
least one additional supervisor node is needed for each group of 64 data
servers.

The performance requirements are determined by the performance needs of
clients. The server should have enough disk space, adequate network
bandwidth (e.g., Gb ethernet), and significant CPU and I/O resources. If
you wish to use memory mapped files, then the node should have a
commensurate amount of real memory.
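
The option sets described above for managers, supervisors, and data servers
can be collected in one small lookup table. The sketch below is in Python
and only restates the options named in this text; the table and helper
names are made up for illustration, and the strings are option lists
rather than complete command lines.

```python
# Role-specific options, exactly as listed in the text above.
ROLE_OPTIONS = {
    "manager":     {"xrootd": ["-r"],             "olbd": ["-m"]},
    "supervisor":  {"xrootd": ["-r", "-t", "-a"], "olbd": ["-m", "-s"]},
    "data server": {"xrootd": ["-t", "-a"],       "olbd": ["-s"]},
}


def option_string(role: str, daemon: str) -> str:
    """Return the role-specific options for one daemon as a single string."""
    return " ".join(ROLE_OPTIONS[role][daemon])


for role in ROLE_OPTIONS:
    xrootd_opts = option_string(role, "xrootd")
    olbd_opts = option_string(role, "olbd")
    print(f"{role:12} xrootd: {xrootd_opts:9} olbd: {olbd_opts}")
```

Read this way, a supervisor is simply the union of the manager and data
server options, which matches its position in the middle of the hierarchy.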

Frequently asked questions:

Does start-up order matter?

Generally, it does not matter in which order nodes are started. For the
efficiency minded, starting supervisor nodes ahead of data server nodes
allows the system to converge on a stable configuration faster.

How long will it take for the system to converge?

This depends on how many servers are in the configuration. Generally, it
takes approximately 1 to 13 seconds for a server to find its place in the
hierarchy. However, the process is parallel across all of the servers. So,
the system should converge in less than 30 seconds for a configuration of
about 1,000 servers.

What happens if I have too few supervisors?

If there are not enough supervisors relative to the number of data
servers, one or more data servers will be orphaned. If you suspect this,
check the manager’s log. It will contain warnings about orphaned data
servers.

What happens if I have more supervisor nodes than I need?

Since the system tries to evenly distribute data servers across all
available supervisors, excess supervisors are used to further reduce the
load on supervisor nodes.

Why do I need to multiply the number of needed supervisors by the number of
managers?

Replicating managers indicates that you wish to have redundant paths to
server nodes. This can only be accomplished if each manager has as many
paths to a data server node as clients have to a manager node. This
necessitates replicating supervisor nodes by the number of accessible
manager nodes. While the system will work if you violate this principle, it
will be in a perpetual path discovery state and perform poorly.

Can I run all the supervisors on a single node?

Yes, but you will need to manually manage the olb administration path and
log files. Each supervisor olbd must have a unique administration path
defined by the olb.adminpath directive. If you are not sharing the
configuration file with its xrootd counterpart, the path must be
declared to the corresponding xrootd using the odc.olbpath directive.
These requirements mean that each server pair will likely need its own
configuration file. Additionally, each server (olbd and xrootd) must also
be given a separate log file using the -l command line option. Finally,
each xrootd must be started with the -a option to allow for arbitrary port
selection and the port must not be specified using the xrd.port directive.
You should realize that running all of the supervisors on a single node
creates a large single point of failure.
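
The bookkeeping described above can be sketched as a small generator of
per-pair settings. This is an assumption-laden illustration in Python: the
base directory, file names, and function name are hypothetical, and the
directive lines assume that olb.adminpath and odc.olbpath each take a
plain path as their value.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("/var/olb")  # hypothetical base directory


def supervisor_instance(index: int, shared_config: bool = True) -> dict:
    """Build unique paths for one supervisor xrootd/olbd pair on a shared node."""
    root = BASE / f"super{index}"
    admin = root / "admin"
    # Each olbd needs its own administration path (assumed directive syntax).
    olbd_lines = [f"olb.adminpath {admin}"]
    # An xrootd with a separate configuration file must be told the same path.
    xrootd_lines = [] if shared_config else [f"odc.olbpath {admin}"]
    return {
        "admin_path": str(admin),
        "olbd_config_lines": olbd_lines,
        "xrootd_config_lines": xrootd_lines,
        "olbd_log": str(root / "olbd.log"),      # passed to the olbd with -l
        "xrootd_log": str(root / "xrootd.log"),  # passed to the xrootd with -l
    }


instances = [supervisor_instance(i, shared_config=False) for i in range(3)]
# No two pairs may share an administration path or a log file.
assert len({inst["admin_path"] for inst in instances}) == len(instances)
for inst in instances:
    print(inst["olbd_config_lines"], inst["xrootd_config_lines"])
```

Deriving every path from the instance index keeps the administration
paths and log files distinct without tracking them by hand.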

How do I run a data server and a supervisor on the same node?

Use the provided StartOLB and StartXRD scripts. For a supervisor olbd,
specify the -m and -s options. For the supervisor’s xrootd, specify the -r
and -t options. For a data server olbd just specify the -s option. For its
xrootd counterpart specify the -t option. The scripts will make sure that
unique log files are established for each server pair and that no port
conflicts will occur by automatically adding the -a command line option
for the xrootd.

What does the -a xrootd command line option actually do?

The -a option enables anarchist mode. In this mode, the xrootd is free to
choose any port that is available if you did not specify a port number on
the command line or in the configuration file. The arbitrary port number
is then forwarded to the olbd. This allows the olbd to redirect clients to
the proper port even though it’s not known ahead of time. This only works
if the olbd is not started with the -i option (the default) and the xrootd
is started with the -t option (required for data servers and supervisors).
This does not eliminate the need for starting the manager olbd and its
xrootd counterpart with well-known ports. We are working on automating
this process.
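
The general idea of picking an arbitrary port and then reporting it can be
illustrated with ordinary sockets. The Python sketch below is not how
xrootd implements the -a option; it only shows the common technique of
binding to port 0 so that the operating system chooses a free port, which
a server could then forward to its load balancer. The function name is
made up for this example.

```python
import socket


def pick_arbitrary_port(host: str = "127.0.0.1") -> tuple[socket.socket, int]:
    """Bind a listening TCP socket to an OS-chosen free port and return both."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any available port
    sock.listen()
    port = sock.getsockname()[1]  # the port the OS actually assigned
    return sock, port


listener, port = pick_arbitrary_port()
# A server in this position would now report `port` to its local olbd.
print(f"listening on arbitrary port {port}")
listener.close()
```

Because the port is only known after the bind succeeds, something has to
pass it on to the redirector, which is the role the text assigns to the
olbd.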

Does that mean I can use -a to run multiple data servers on a single node?

Yes. Again, you will need to manually manage the olbd administration path
and log files for each server. We are working on automating this process.

Can I use the -a option to prevent clients from bypassing the olbd?

Yes. This is actually recommended. Since arbitrary port numbers are
chosen, a client cannot directly connect to a data server without using
the manager xrootd. However, while significant programming effort is
required to capture port numbers at run-time, any "management by
obscurity" method can be defeated.







