Those two particular releases seem to have had some problems. I assume
you are not mixing releases here (i.e., running either on all servers
causes you to see the problem).
I do know that 20040830 is a stable release. We run that everywhere at
SLAC for analysis. I'd suggest going with that one until we have tested
the latest release, which should have fixed a separate problem related
to writing files.
On Thu, 17 Feb 2005, Brew, CAJ (Chris) wrote:
> Since increasing the number of servers at RAL from 8 to 21 we seem to be
> seeing a new failure mode.
> All the processes seem to be running fine, and you can read a file by
> going directly to the server that holds it, but the server does not seem
> to respond via the olbd network, so if you try to access a file via the
> load balancer, the request fails.
> Restarting the load balancer on the data server fixes the problem.
> There is nothing unusual in the logs at either end, nor anything
> missing, as far as I can tell.
> This is on data servers running RH73 and xrootd-20040907-0403 or
> Has anyone else seen this? Is there a fix?
> Chris Brew +44 1235 446326
> Particle Physics Department
> Rutherford Appleton Laboratory
> Chilton, Didcot. Oxfordshire.
> OX11 0QX. United Kingdom.