Since increasing the number of servers at RAL from 8 to 21 we seem to be
seeing a new failure mode.
All the processes seem to be running fine, and you can read a file by
going directly to the server that holds it, but the server does not seem
to respond via the olbd network, so if you try to access a file via the
load balancer the request fails.
Restarting the load balancer on the data server fixes the problem.
As far as I can tell, there is nothing unusual in the logs at either
end, and nothing missing from them either.
This is on data servers running RH73 and xrootd-20040907-0403 or
Has anyone else seen this? Is there a fix?
Chris Brew +44 1235 446326
Particle Physics Department
Rutherford Appleton Laboratory
Chilton, Didcot. Oxfordshire.
OX11 0QX. United Kingdom.