
ATLAS-SCCS-PLANNING-L Archives


ATLAS-SCCS-PLANNING-L@LISTSERV.SLAC.STANFORD.EDU



ATLAS-SCCS-PLANNING-L  June 2010


Subject:

Re: [Fwd: Re: Proof cluster ready for testing]

From:

"Yang, Wei" <[log in to unmask]>

Date:

Wed, 9 Jun 2010 17:18:21 -0700


Hi Bart,

I made another attempt. Here is what I used to start ROOT on the client side (assuming bash) on a rhel5-64 machine:

. /afs/slac/g/atlas/packages/gcc432/setup.sh
export ROOTSYS=/afs/slac.stanford.edu/g/atlas/packages/root/root5.26.00b-slc5_amd64-gcc43
export PATH=${PATH}:$ROOTSYS/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ROOTSYS/lib
$ROOTSYS/bin/root
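
One caveat about the export lines above: if LD_LIBRARY_PATH is unset when they run, the "${LD_LIBRARY_PATH}:" prefix leaves an empty leading entry, which the dynamic loader treats as the current directory. A minimal sketch of a guard follows; append_dir is a hypothetical helper name, not something provided by the ATLAS or ROOT setup scripts.

```shell
# append_dir is a hypothetical helper, not part of the ATLAS or ROOT setup scripts.
# Usage: append_dir CURRENT_VALUE NEW_DIR
# Prints NEW_DIR appended to CURRENT_VALUE, with no stray ':' when CURRENT_VALUE is empty.
append_dir() {
    if [ -n "$1" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$2"
    fi
}

# Possible use with the ROOT install named in this message
# (only meaningful on a machine where ROOTSYS is set):
# export LD_LIBRARY_PATH="$(append_dir "${LD_LIBRARY_PATH:-}" "$ROOTSYS/lib")"
```

The same effect can be had inline with the shell expansion ${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$ROOTSYS/lib.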

It seems I was able to load a .par file.  Can you give it a try? Also, remember that on atlint01, if you copy a file to /xrootd/proof/bcbutler, you should use TDSet::Add("root://boer0123//atlas/proof/bcbutler/..."). However, I found that reading from T2 storage seems to be faster than reading from the disks in the proof cluster (without localizer).
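
The path-to-URL rule described here can be sketched as a small shell function. This is only an illustration under the assumptions stated in this thread (the cluster space is mounted at /xrootd/proof on atlint01 and served as root://boer0123//atlas/proof); mount_to_xroot_url and the file name ntuple.root are hypothetical.

```shell
# mount_to_xroot_url is a hypothetical helper for illustration only.
# It maps a path under the atlint01 mount point to the xroot URL form
# mentioned in this thread; both prefixes are taken from the messages.
mount_to_xroot_url() {
    mtx_path="$1"
    mtx_mount="/xrootd/proof"
    mtx_url="root://boer0123//atlas/proof"
    case "$mtx_path" in
        "$mtx_mount"/*)
            # Strip the mount prefix and prepend the xroot URL prefix.
            printf '%s\n' "${mtx_url}${mtx_path#"$mtx_mount"}"
            ;;
        *)
            echo "not under $mtx_mount: $mtx_path" >&2
            return 1
            ;;
    esac
}

# Example (ntuple.root is a made-up file name):
# mount_to_xroot_url /xrootd/proof/bcbutler/ntuple.root
```

The printed string is what would be passed to TDSet::Add in place of the local mount path.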

regards,
Wei Yang  |  [log in to unmask]  |  650-926-3338(O)


On Apr 29, 2010, at 3:25 PM, Bart Butler wrote:

> First things first: I think I killed your cluster. The xrootd mount is no longer readable from atlint01 and I can't submit PROOF jobs to it anymore. This happened after I manually killed my client root session following a massively screwed-up job.
> 
> Secondly, I am having a hell of a time compiling my shared library correctly. Which version of ROOT is the cluster running? If I'm not running the exact same ROOT version and gcc version as every worker node, I can't make binaries (which is what Booker did with his test package, it seems; I do it too when I run PROOF-Lite). And if I can't make binaries, I have to submit source packages. This should be fine but it's never worked well for me. My first theory was that because the packages are kept in a common place on xrootd in my user space, the compilation errors I was getting from some workers were because all 32 (I was never able to connect to 4 of the 36 workers) tried to compile the package at the same time in the same place. Running on a single worker worked fine (but of course was slow). I don't think this compilation issue was the whole story though, because if the single-worker run worked, the next time all workers should have been able to load the compiled version without problems, assuming they are all running the same version of ROOT, and they crashed and burned just as badly that time. That's when the cluster itself crashed.
> 
> Another thing was that making TDSets from the Tier 2 xrootd storage worked fine, but when I tried using the same files I had copied to the cluster xrootd storage it couldn't find them for some reason.
> 
> My log files should be in /xrootd/proof/bcbutler if you guys get the cluster working again.
> 
> -Bart
> 
> 
> Yang, Wei wrote:
>> Hi Bart, David,
>> 
>> any news on this?
>> 
>> regards,
>> Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>> 
>> 
>> On Apr 21, 2010, at 12:03 PM, Bart Butler wrote:
>> 
>>> I'll try to run a few jobs tonight and see what happens.
>>> 
>>> -Bart
>>> 
>>> Yang, Wei wrote:
>>> 
>>>> [add Andy Hass ...]
>>>> 
>>>> Hi David, Booker,
>>>> 
>>>> I mounted the xrootd space of the proof cluster at /xrootd/proof on atlint01.  It looks like we have ~1.8TB total on the cluster. So something ~ 1TB should work.
>>>> 
>>>> The cluster should be able to access T2 storage if you provide the URLs of the root files to process. But the whole idea of using proof is to avoid network traffic as much as possible. As we are still validating the functionality, it would be good to try both. Or you could put half of the data on the proof cluster and leave the other half on T2 storage (no NFS please).
>>>> 
>>>> The proof master node is boer0123. If you copy files to the cluster, the xroot URL is root://boer0123//atlas/proof (I suggest you create a fizisist sub-dir).
>>>> 
>>>> Booker, it looks like proof also leaves some files on the cluster. How would you suggest managing the space: by user, by group, or something else?
>>>> 
>>>> regards,
>>>> Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>>>> 
>>>> 
>>>> On Apr 21, 2010, at 8:40 AM, David W. Miller wrote:
>>>> 
>>>>> Hi Booker and Wei,
>>>>> 
>>>>> I have a few questions: from what machine do we launch the jobs? Any machine at SLAC, but specifying the URI correctly? Also, if the data are on atlasuserdisk or usr in /xrootd/atlas/ is that sufficient?
>>>>> 
>>>>> Thanks,
>>>>> David
>>>>> 
>>>>> On Apr 21, 2010, at 17:36 PM, Ariel Schwartzman wrote:
>>>>> 
>>>>>> From: Booker Bense <[log in to unmask]>
>>>>>> Date: April 21, 2010 16:09:51 PM GMT+02:00
>>>>>> To: "Schwartzman, Ariel G." <[log in to unmask]>
>>>>>> Cc: "Yang, Wei" <[log in to unmask]>
>>>>>> Subject: Re: Proof cluster ready for testing
>>>>>> 
>>>>>> On Wed, 21 Apr 2010, Ariel Schwartzman wrote:
>>>>>> 
>>>>>>> Hi Booker,
>>>>>>> 
>>>>>>> I cannot access this machine remotely:
>>>>>>> 
>>>>>>>> ssh -Y boer0123.slac.stanford.edu
>>>>>>> ssh: connect to host boer0123.slac.stanford.edu port 22: Operation timed out
>>>>>>> 
>>>>>> It's on the SLAC internal network, so you'll need to log in to a SLAC
>>>>>> machine and run root programs from there. You shouldn't need
>>>>>> login access to the master node.
>>>>>> 
>>>>>> _ Booker C. Bense
>>>>>> 
>>>>> ==========================================
>>>>> David W. Miller
>>>>> ------------------------------------------
>>>>> SLAC
>>>>> Stanford University
>>>>> Department of Physics
>>>>> 
>>>>> SLAC Info: Building 84, B-156. Tel: +1.650.926.3730
>>>>> CERN Info: Building 01, 1-041. Tel: +41.76.487.2484
>>>>> 
>>>>> EMAIL:    [log in to unmask]
>>>>> HOMEPAGE: http://cern.ch/David.W.Miller
>>>>> ========================================== 



