LISTSERV 16.5 - XROOTD-L Archives

Hi all,

In preparation for the meeting, I'd like to mention that we have had a
successful clustering of 175 servers. We will be trying for 280 servers
today as well as setting up some stressful scenarios. Below is the first
draft of the documentation that will be added to the olb configuration
manual.

Andy

Configuring more than 64 data servers

1) Choose the port numbers you wish to use for xrootd and olbd.
Typically, for xrootd port 1094 is used. For olbd port 1213 is used.

2) Choose the number of manager nodes you wish to run.
You must configure at least one manager node. The manager is the first
point of contact for a client and is also the cluster leader. A manager
should run on a dedicated machine of modest power (e.g., 512MB RAM, 800MHz
clock speed, 100Mb Ethernet).

A manager node consists of
a) an xrootd started with the -r option or configured with the
"ofs.redirect remote" directive.
b) an olbd started with the -m option.
You may configure more than one manager and either run them in fail-over
mode (the default) or in load balancing mode where each manager shares
part of the client load (see the ??? directive). Each manager xrootd-olbd
pair must run on a separate machine.

A typical manager configuration file useful for an xrootd and olbd would
look like:


3) Choose the number of supervisor nodes you need.
A supervisor node acts as a local manager for a group of up to 64 nodes.
These nodes may be data servers or supervisors. A supervisor node consists of
a) an xrootd started with the -r and -t options or configured with the
"ofs.redirect remote" and "ofs.redirect target" directives. Additionally,
the xrootd should be started with the -a option.
b) an olbd started with the -m and -s options.

You only need to configure supervisor nodes if you will be running more
than 64 data servers. The number of supervisor nodes is based on the
number of available manager plus supervisor slots. A recursive formula
must be used to calculate the minimum number. Since there is nothing wrong
in starting a few more supervisors than actually are needed, a simplified
formula can be used.

Conservatively, you will need one supervisor node for each group of 64
data servers. For instance, if you plan to run 500 data servers you will
need 500/64 supervisors, rounded up (i.e., 8). The computed number
must then be multiplied by the number of managers you will run (e.g., 16
supervisors if you are running two managers). The reason is that
supervisors establish the same number of different paths to data servers
as there are possible client paths to managers.
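
The simplified rule above can be sketched in a few lines of Python. This is
only an illustration of the arithmetic in the text; the function and constant
names are made up for the example and are not part of xrootd or olbd.

```python
import math

GROUP_SIZE = 64  # nodes handled by one manager or supervisor, per the text


def conservative_supervisors(data_servers: int, managers: int = 1) -> int:
    """Conservative supervisor count using the simplified formula."""
    if data_servers < 0 or managers < 1:
        raise ValueError("need data_servers >= 0 and managers >= 1")
    if data_servers <= GROUP_SIZE:
        # Supervisors are only needed beyond 64 data servers.
        return 0
    # One supervisor per group of 64 data servers, rounded up,
    # then replicated once per manager.
    return math.ceil(data_servers / GROUP_SIZE) * managers


print(conservative_supervisors(500))              # 8
print(conservative_supervisors(500, managers=2))  # 16
```

The result is deliberately an over-estimate; the exact minimum would come from
the recursive formula, which is not reproduced here.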

Each supervisor node can run on a data server node. If you wish to share
resources in this way, choose data server nodes that will be as lightly
loaded as possible. The performance requirements for a supervisor node are
the same as a manager node.

4) Configure the data server nodes.
A data server node delivers actual data to clients. It consists of
a) an xrootd started with the -t option or configured with the
"ofs.redirect target" directive. Additionally, the xrootd should be
started with the -a option.
b) an olbd started with the -s option.

Configure as many data server nodes as you need. Keep in mind that at
least one additional supervisor node is needed for each group of 64 data
servers.

The performance requirements are determined by the performance needs of
clients. The server should have enough disk space, adequate network
bandwidth (e.g., Gb ethernet), and significant cpu and i/o resources. If
you wish to use memory mapped files, then the node should have a
commensurate amount of real memory.
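
As a quick reference, the option combinations for the three node roles
described above can be collected in one small table. The Python below is just
a sketch that prints them; the dictionary and function names are invented for
the example, and a real start-up would also need a configuration file and
whatever other options your site uses.

```python
# Command line options per role, taken from the text above.
ROLE_OPTIONS = {
    "manager":     {"xrootd": ["-r"],             "olbd": ["-m"]},
    "supervisor":  {"xrootd": ["-r", "-t", "-a"], "olbd": ["-m", "-s"]},
    "data server": {"xrootd": ["-t", "-a"],       "olbd": ["-s"]},
}


def command_lines(role: str) -> list[str]:
    """Return the bare xrootd and olbd command lines for a role."""
    options = ROLE_OPTIONS[role]
    return [" ".join([program] + options[program])
            for program in ("xrootd", "olbd")]


for role in ROLE_OPTIONS:
    for line in command_lines(role):
        print(f"{role:12s} {line}")
```

Note that the manager's xrootd is listed without -a, since the manager is
expected to run on well-known ports.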

Frequently asked questions:

Does start-up order matter?

Generally, it does not matter in which order nodes are started. For the
efficiency minded, starting supervisor nodes ahead of data server nodes
allows the system to converge on a stable configuration faster.

How long will it take for the system to converge?

This depends on how many servers are in the configuration. Generally, it
takes approximately 1 to 13 seconds for a server to find its place in the
hierarchy. However, the process is parallel across all of the servers. So,
the system should converge in less than 30 seconds for a configuration of
about 1,000 servers.

What happens if I have too few supervisors?

If there are not enough supervisors relative to the number of data
servers, one or more data servers will be orphaned. If you suspect this,
check the manager’s log. It will contain warnings about orphaned data
servers.

What happens if I have more supervisor nodes than I need?

Since the system tries to evenly distribute data servers across all
available supervisors, excess supervisors are used to further reduce the
load on supervisor nodes.

Why do I need to multiply the number of needed supervisors by the number of
managers?

Replicating managers indicates that you wish to have redundant paths to
server nodes. This can only be accomplished if each manager has as many
paths to a data server node as clients have to a manager node. This
necessitates replicating supervisor nodes by the number of accessible
manager nodes. While the system will work if you violate this principle, it
will be in a perpetual path discovery state and perform poorly.

Can I run all the supervisors on a single node?

Yes, but you will need to manually manage the olb administration path and
log files. Each supervisor olbd must have a unique administration path
defined by the olb.adminpath directive. If you are not sharing the
configuration file with its xrootd counterpart, the path must be
declared to the corresponding xrootd using the odc.olbpath directive.
These requirements mean that each server pair will likely need its own
configuration file. Additionally, each server (olbd and xrootd) must also
be given a separate log file using the -l command line option. Finally,
each xrootd must be started with the -a option to allow for arbitrary port
selection and the port must not be specified using the xrd.port directive.
You should realize that running all of the supervisors on a single node
creates a large single point of failure.
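
To make the bookkeeping concrete, here is a rough Python sketch that lays out
a unique administration path and unique log files for each supervisor pair on
one node. It assumes each pair shares one configuration file, so only
olb.adminpath is needed. The base directory, the file names, the function name
and the one-line "olb.adminpath <path>" form are all assumptions made for the
example; check the configuration manual for the exact directive syntax.

```python
from pathlib import PurePosixPath


def plan_supervisor_pairs(count: int, base_dir: str = "/var/tmp/olb-demo"):
    """Lay out per-pair admin paths and log files (hypothetical layout)."""
    base = PurePosixPath(base_dir)
    plans = []
    for i in range(1, count + 1):
        inst = base / f"sup{i}"  # one directory per xrootd-olbd pair
        plans.append({
            # Goes into this pair's own configuration file.
            "config_line": f"olb.adminpath {inst / 'admin'}",
            # -a lets the xrootd pick its port; no xrd.port directive.
            "xrootd_args": ["-r", "-t", "-a", "-l", str(inst / "xrootd.log")],
            "olbd_args": ["-m", "-s", "-l", str(inst / "olbd.log")],
        })
    return plans


for plan in plan_supervisor_pairs(3):
    print(plan["config_line"])
    print("  xrootd", " ".join(plan["xrootd_args"]))
    print("  olbd  ", " ".join(plan["olbd_args"]))
```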

How do I run a data server and a supervisor on the same node?

Use the provided StartOLB and StartXRD scripts. For a supervisor olbd,
specify the -m and -s options. For the supervisor’s xrootd, specify the -r
and -t options. For a data server olbd just specify the -s option. For its
xrootd counterpart specify the -t option. The scripts will make sure that
unique log files are established for each server pair and that no port
conflicts will occur by automatically adding the -a command line option
for the xrootd.

What does the -a xrootd command line option actually do?

The -a option enables anarchist mode. In this mode, the xrootd is free to
choose any port that is available if you did not specify a port number on
the command line or in the configuration file. The arbitrary port number
is then forwarded to the olbd. This allows the olbd to redirect clients to
the proper port even though it’s not known ahead of time. This only works
if the olbd is not started with the -i option (the default) and the xrootd
is started with the -t option (required for data servers and supervisors).
This does not eliminate the need for starting the manager olbd and its
xrootd counterpart with well-known ports. We are working on automating
this process.
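
For readers unfamiliar with the idea, the general technique of letting the
operating system choose a free port can be shown with a few lines of Python.
This is not how xrootd implements -a; it is only a generic illustration, and
the function name is made up for the example.

```python
import socket


def listen_on_any_port(host: str = "127.0.0.1"):
    """Bind to port 0 so the OS assigns any free port, then listen."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 means "pick any available port"
    sock.listen()
    port = sock.getsockname()[1]  # the port that was actually assigned
    return sock, port


server, port = listen_on_any_port()
# A server in this position must report the chosen port to whoever
# redirects clients to it, since nobody knows the number ahead of time.
print(f"listening on port {port}")
server.close()
```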

Does that mean I can use -a to run multiple data servers on a single node?

Yes. Again, you will need to manually manage the olbd administration path
and log files for each server. We are working on automating this process.

Can I use the -a option to prevent clients from bypassing the olbd?

Yes. This is actually recommended. Since arbitrary port numbers are
chosen, a client cannot directly connect to a data server without using
the manager xrootd. However, although significant programming effort is
required to capture port numbers at run-time, any "management by
obscurity" method can be defeated.