  Hi Remi,

On Fri, Nov 19, 2004 at 04:29:28PM -0800, Remi Mommsen wrote:
> I have many erratic problems with the bbrprod0X servers inhibiting the  
> use of xrootd for the skim production. I cannot reliable reproduce the  
> errors, but about 30% of the transfers fail. The tracebacks are similar  
> to the one posted by Alvise and myself to xrootd-l.
> 
> Questions:
> - Are you (or somebody else) actively looking into these issues? We  
> need to get this solved by early next week.
> - Which version(s) of xrootd are running on bbrprod0X? Can you please  
> start the latest version on all of them?

  Andy shouldn't be doing this. We arranged things such that this decision
should be _entirely_ in the hands of Wilko, Artem, etc. (i.e. Andy
shouldn't even need to be in the loop to distribute the software via
"taylor" at SLAC). Wilko, is that not true?

> - I can get a checksum only from bbrprod05. Do you know what the  
> problem is?

  The versions are clearly a mess. I see:

  bbrprod01  20041022-0258
  bbrprod02  20040830-0105
  bbrprod03  20040830-0105
  bbrprod04  20040830-0105
  bbrprod05  20041022-0258

and of course I've no idea if they have all been started with the new
version of the config file which includes the external checksum script.

  Actually, you can always check the versions in Ganglia:

  http://www-gmon.slac.stanford.edu:8080/ganglia/?m=xrootd_version&r=hour&s=by%2520hostname&c=xrootd-prod&h=&sh=1&hc=4 

  Wilko, could you please sort this out?

> There is a test perl script at
> /afs/slac.stanford.edu/u/br/bbrskim/releases/test-16.0.1a/workdir/ 
> testPAdmin.pl
> which exercises the functionality which we need.
> 
> BTW: we gave up to get it to work using olb on the time scale of next  
> week. We will be happy if the functionality required by testPAdmin.pl  
> works for all 5 bbrprod0X machines.

  I'll take a look at it once they start the latest version of the
server (20041118-0948) on all 5 machines with the config file containing
the directive with the external checksum script.
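
  As a rough illustration only (not an existing tool), the check I have in
mind can be sketched in a few lines of Python. The host/version table is
copied by hand from the list above; the function names are made up for
this example, and nothing here queries the servers or Ganglia.

```python
# Versions as listed above (hostname -> xrootd build timestamp).
# Hand-copied snapshot; a real check would have to query the servers.
RUNNING = {
    "bbrprod01": "20041022-0258",
    "bbrprod02": "20040830-0105",
    "bbrprod03": "20040830-0105",
    "bbrprod04": "20040830-0105",
    "bbrprod05": "20041022-0258",
}

# The latest server version mentioned in this mail.
LATEST = "20041118-0948"


def group_by_version(running):
    """Map each version string to the sorted list of hosts running it."""
    groups = {}
    for host, version in sorted(running.items()):
        groups.setdefault(version, []).append(host)
    return groups


def hosts_needing_restart(running, latest):
    """Return the sorted hosts whose version differs from `latest`."""
    return sorted(host for host, version in running.items() if version != latest)


if __name__ == "__main__":
    for version, hosts in sorted(group_by_version(RUNNING).items()):
        print(version, " ".join(hosts))
    stale = hosts_needing_restart(RUNNING, LATEST)
    print("not on %s: %s" % (LATEST, " ".join(stale) or "none"))
```

  With the table above, every one of the 5 machines shows up as needing a
restart, which is the point: none of them is on the latest version yet.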

  BTW, the fact that you are using your own compiled version of the client
(built from HEAD) instead of the version installed in afs also confuses
things a bit. I'll try to sort out the debug version for Linux to help this
along.

                                 thanks,
                                   Pete

-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------