I still don't really understand why this is a client issue rather than a
server issue, but whatever it is, it still seems to be present in the
latest SP.
A quick recap of the problem as it presented in this case.
The disk containing one of the BkgTrigger files was taken offline for
checks. This was a run 5 collection that had just been imported and so
hadn't yet been written to tape.
When a client job requested the file, the olbd redirected it to one of
the stage servers to get it from tape, which obviously failed as did the
However, when the disk was brought back online and the file was once
again available, 50% of the jobs continued to fail.
Looking at the olbd logs I see it redirecting these jobs to the stage
server irrespective of the fact that it FAILED to import the file.
As far as I can tell, the only way to fix this is to stop and start the
olbds on both redirectors.