  Hi Artem,

On Sat, Nov 20, 2004 at 09:44:40PM -0800, Artem Trunov wrote:
> > > Artem and I do the restart and also do some of the configuration of the
> > > xrootds but setting up a release so that it gets distributed by taylor
> > > is done by Andy.
> >
> >   Ok, this is different from what Andy, Chuck and I agreed. What we
> > wanted was that (a) someone (me) makes the releases and (b) you/Artem
> > have complete control over distributing them with taylor and starting them
> > on machines at SLAC. This takes Andy completely out of the operational
> > loop. We need to fix that.
> 
> It's not good practice, IMHO. One of our problems is that the chances that
> a new release doesn't work are high, and Andy needs to be around during
> restart to understand what's going on.

  We've been through this N times before. You are mixing up two things:
testing or understanding problems with a new release, and deploying it.
You >should< be able to deploy and start servers from a new version
(technically) at SLAC without Andy. If there are problems, you roll back to
the previous version and tell us what the problem was. He may be around
when you start new versions (it is >your< call when you start new versions),
but he shouldn't have to type any commands (or know all the machines being
used) to make it happen.

  [This part of the discussion should probably be in the BaBar HN and
not the xrootd list as it is SLAC-specific.]

                                   Pete

> > > > > - I can get a checksum only from bbrprod05. Do you know what the
> > > > > problem is?
> > > >
> > > >   There is clearly a big mess for the versions. I see:
> > > >
> > > >   bbrprod01  20041022-0258
> > > >   bbrprod02  20040830-0105
> > > >   bbrprod03  20040830-0105
> > > >   bbrprod04  20040830-0105
> > > >   bbrprod05  20041022-0258
> > > >
> > > > and of course I've no idea if they have all been started with the new
> > > > version of the config file which includes the external checksum script.
> > > >
> > > >   Actually, you can always check the versions in Ganglia:
> > > >
> > > >   http://www-gmon.slac.stanford.edu:8080/ganglia/?m=xrootd_version&r=hour&s=by%2520hostname&c=xrootd-prod&h=&sh=1&hc=4
> > > >
> > > >   Wilko, could you please sort this out?
> > > >
> > > > > There is a test perl script at
> > > > > /afs/slac.stanford.edu/u/br/bbrskim/releases/test-16.0.1a/workdir/
> > > > > testPAdmin.pl
> > > > > which exercises the functionality which we need.
> > > > >
> > > > > > BTW: we gave up on getting it to work using olb on the time scale of
> > > > > > next week. We will be happy if the functionality required by
> > > > > > testPAdmin.pl works for all 5 bbrprod0X machines.
> > > >
> > > >   I'll take a look at it once they start the latest version of the
> > > > server (20041118-0948) on all 5 machines with the config file containing
> > > > the directive with the external checksum script.
> > > >
> > > >   BTW, the fact that you are using your own compiled version of (HEAD of) the
> > > > client instead of the version installed in afs is also a bit confusing. I'll
> > > > try to sort out the debug version for linux to help this along.
> > > >
> > > >                                  thanks,
> > > >                                    Pete
> > > >
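
  For the version mess quoted above, here is a minimal sketch of how one
might flag which servers are not yet on the target release. The host and
version strings are copied from the listing above; the function names
(hosts_not_on, group_by_version) are hypothetical, and in practice the
versions would have to be collected from the machines or from Ganglia,
which this sketch does not attempt.

```python
# Versions as reported in the listing above (host -> xrootd build stamp).
reported = {
    "bbrprod01": "20041022-0258",
    "bbrprod02": "20040830-0105",
    "bbrprod03": "20040830-0105",
    "bbrprod04": "20040830-0105",
    "bbrprod05": "20041022-0258",
}

# Latest server version mentioned in the thread.
TARGET = "20041118-0948"


def hosts_not_on(target, versions):
    """Return the sorted hosts whose reported version differs from target."""
    return sorted(host for host, ver in versions.items() if ver != target)


def group_by_version(versions):
    """Map each version string to the sorted list of hosts running it."""
    groups = {}
    for host, ver in versions.items():
        groups.setdefault(ver, []).append(host)
    return {ver: sorted(hosts) for ver, hosts in groups.items()}


if __name__ == "__main__":
    # One line per version, oldest build stamp first.
    for ver, hosts in sorted(group_by_version(reported).items()):
        print(ver, " ".join(hosts))
    print("not on", TARGET + ":", " ".join(hosts_not_on(TARGET, reported)))
```

  Grouping by version makes a split like the one above easy to see at a
glance, and the same check can be rerun after each restart.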



-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------