ATLAS-SCCS-PLANNING-L Archives


ATLAS-SCCS-PLANNING-L@LISTSERV.SLAC.STANFORD.EDU


ATLAS-SCCS-PLANNING-L  March 2009

Subject: FW: re-processing campaign has started
From: Wei Yang <[log in to unmask]>
Date: Thu, 05 Mar 2009 15:48:06 -0800
Content-Type: text/plain

------ Forwarded Message
From: Michael Ernst <[log in to unmask]>
Date: Thu, 05 Mar 2009 18:09:30 -0500
To: Rob Gardner <[log in to unmask]>, "McKee, Shawn" <[log in to unmask]>,
Saul Youssef <[log in to unmask]>, Wei Yang <[log in to unmask]>
Subject: FW: re-processing campaign has started

All,
 this is to let you know that the reprocessing exercise has started. This
morning validation jobs were submitted to the US cloud and finished
successfully at AGLT2.

Please find more information from Kors below.

--
    Michael
------ Forwarded Message
From: Kors Bos <[log in to unmask]>
Date: Thu, 5 Mar 2009 17:47:31 -0500
To: <[log in to unmask]>
Conversation: re-processing campaign has started
Subject: re-processing campaign has started

All,

the validation tasks for the re-processing have been submitted or are
about to be submitted. These are very much like the tasks that were
run during the Christmas re-processing campaign. Last time these tests
revealed a problem in FZK but that is now understood and fixed. So we
don't expect any site to fail this time. However Taipei is down
because of the fire and won't be back up in time to participate. These
data have to be re-processed at CERN now.

Tests have been performed at PIC and RAL to re-process data from tape.
This still uses the old release of the reconstruction software but
that is not important for this test. At PIC this went very well except
that the task didn't finish because we seem to have a broken tape. It
is good that this happens now because it gives us the opportunity to
test how to fix this. We can be sure that this will happen again. Also
at RAL this test is going well although a bit slower (on purpose).

All cosmics data has been cleaned from the buffers so the same tests
can start in the other T1s. Lyon will bring the files on-line
manually because there is still one component missing before
pre-staging can be done by the site services. These tests tell us if
pre-staging works and if the buffer turnover is more or less optimal
for the jobs.

The plan still is to start the real re-processing of all the cosmics
data next week. We know that there is a shutdown of a few days at FZK
so they will probably start a little bit later. We don't have to remove
the RAW data from the disks because Panda can now distinguish between
the copy on disk and the copy on tape and be made to choose the tape
copy. This gives us a fall-back in case we do have problems with
reading from tape. We should hope not to need this fall-back solution
because with real data we won't have an extra copy on disk.
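
To illustrate the disk/tape choice described above, here is a minimal sketch. It is not Panda's actual interface: the `Replica` type and the `choose_replica` function are hypothetical names, and the fall-back rule (use any other copy if no tape copy exists) is an assumption based on the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    """One copy of a RAW file (hypothetical model, not a Panda type)."""
    site: str
    medium: str  # "disk" or "tape"


def choose_replica(replicas: Sequence[Replica],
                   prefer: str = "tape") -> Optional[Replica]:
    """Return a replica on the preferred medium if one exists.

    Falls back to the first replica on any other medium, and returns
    None when there are no replicas at all.
    """
    for replica in replicas:
        if replica.medium == prefer:
            return replica
    return replicas[0] if replicas else None
```

With both copies present the tape copy is chosen, so the test exercises tape reading. The disk copy is only used when no tape copy is listed.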

We will have two measures against the "hot file" problem. First, there
will be a conditions data tarball per run and not one for all 100 runs
together as we had over Christmas. So there will be fewer jobs at the
same time trying to access these data. Secondly, at a few sites we will
test the "pcache solution" where the conditions data will be left on
the worker node after the job has finished. If the next job on that
node needs the same data it will just use it and not bring a fresh
copy in.
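
The caching idea behind the "pcache solution" can be sketched roughly as follows. This is not the pcache tool itself: the function name, the file naming scheme and the `download` callback are all hypothetical, and the sketch only shows the behaviour described above, where a per-run tarball left on the worker node is reused by the next job.

```python
import os
from typing import Callable, Tuple


def fetch_conditions(run: int, cache_dir: str,
                     download: Callable[[int, str], None]) -> Tuple[str, bool]:
    """Return (local_path, cache_hit) for the conditions tarball of `run`.

    `download(run, dest)` is a caller-supplied function (an assumption of
    this sketch) that writes the tarball to `dest`. It is only called
    when the file is not already present in `cache_dir`.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, f"conditions_run{run}.tar")
    if os.path.exists(path):
        return path, True  # left behind by an earlier job on this node
    tmp = path + ".part"
    download(run, tmp)
    # Rename only after the download has finished, so a later job never
    # picks up a half-written file.
    os.replace(tmp, path)
    return path, False
```

The first job for a given run on a node pays for the transfer. Later jobs for the same run get a cache hit and put no load on the storage holding the "hot file".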

During Christmas and New Year few people were available at the sites.
This time we hope you will monitor this effort closely and report any
irregularity to us. We need to measure how efficiently we can do
re-processing, how many CPUs we use and how long it takes. We need to
know if the stage buffer matches the number of tape drives to fill it
and the number of CPUs to use it. And then there are the exceptions
like broken tapes, files that seem to be missing for other reasons,
job crashes and so on. This may be one of the last chances to test
before we need it all working for real.
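
The buffer question above is a rate balance, which can be sketched as simple arithmetic. The function names are hypothetical and any numbers passed in are placeholders, not measurements from any site. The sketch assumes constant rates per tape drive and per job.

```python
from typing import Tuple


def staging_balance(n_drives: int, drive_mb_s: float,
                    n_cpus: int, job_mb_s: float) -> Tuple[float, float, float]:
    """Return (fill_rate, drain_rate, net_rate) in MB/s.

    fill_rate:  what the tape drives can stage into the buffer
    drain_rate: what the running jobs read out of the buffer
    net_rate:   positive means the buffer fills up, negative means
                the CPUs wait for tape
    """
    fill = n_drives * drive_mb_s
    drain = n_cpus * job_mb_s
    return fill, drain, fill - drain


def buffer_growth_gb(fill: float, drain: float, hours: float) -> float:
    """Buffer growth in GB (1 GB = 1000 MB) over `hours`, never negative."""
    surplus = max(fill - drain, 0.0)
    return surplus * hours * 3600 / 1000.0
```

A net rate close to zero is the "matched" case: the drives keep the buffer supplied without overfilling it, and the CPUs are not left idle.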

Kors


------ End of Forwarded Message


------ End of Forwarded Message


