LISTSERV 16.5 - ATLAS-SCCS-PLANNING-L Archives



> -----Original Message-----
> From: Wei Yang via RT [mailto:[log in to unmask]] 
> Sent: Thursday, March 12, 2009 11:30 AM
> To: Young, Charles C.
> Cc: atlas-sccs-planning-l; Moss, Leonard J.
> Subject: Re: [SLAC #163429] Request to use few memfs machines 
> for ATLAS testing
> 
> 
> Hi Neal,
> 
> Can you add user jchapman to LSF user group atlpetagrp? Also, 
> when can we enable LSF queue atlpetaq?
> 
> Hi Charlie,
> 
> For usage with direct login to SLAC machines, atlint01 is 
> available (8 cores and 8GB). Memfs machines can also be used 

Atlint01 does not say linux64. 

[young@yakut08 ~]$ lshosts | grep atlint01
atlint01      LINUX INTEL_30  12.0     -      -      -     No (linux linux32 rhel40)
[young@yakut08 ~]$ lshosts | grep memfs01
memfs01       LINUX AMD_1800   5.5     2 16000M  8189M    Yes (bs linux linux64 rhel40 memfs)
[young@yakut08 ~]$ 

Does it matter? 
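
If we need to check more hosts than these two, a small script can pull the resource list out of each lshosts line. This is only a sketch: the helper names parse_resources and hosts_with are made up for illustration, and it assumes the resource list is always the final parenthesised field, as in the two lines above.

```python
import re

# Sample lines copied from the lshosts output above.
LSHOSTS_LINES = [
    "atlint01      LINUX INTEL_30  12.0     -      -      -     No (linux linux32 rhel40)",
    "memfs01       LINUX AMD_1800   5.5     2 16000M  8189M    Yes (bs linux linux64 rhel40 memfs)",
]


def parse_resources(line):
    """Return (host_name, set_of_resources) for one lshosts line.

    Assumption: the resources are the last field, wrapped in parentheses.
    """
    host = line.split()[0]
    match = re.search(r"\(([^)]*)\)\s*$", line)
    resources = set(match.group(1).split()) if match else set()
    return host, resources


def hosts_with(lines, resource):
    """List the hosts whose resource set contains the given resource name."""
    return [host for host, res in map(parse_resources, lines) if resource in res]


if __name__ == "__main__":
    # Which of the sampled hosts advertise the linux64 resource?
    print(hosts_with(LSHOSTS_LINES, "linux64"))
```

With the two sample lines, only memfs01 is reported as advertising linux64.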

> via batch. I think Randy and Neal want to know how long John 
> will use them.

Them being memfs? It's hard to be precise but I would guess a few months. On and off rather than consistent usage all the time. I am referring to the test/debug part, and not the production part. 

> 
> For "production" use, I don't know if you mean the normal 
> atlas production channel via Panda. If so, someone else in 
> ATLAS production will have to decide how to handle this large 
> memory requirement. A pilot-based system doesn't really tell 
> a site the job requirements because pilots don't know what 
> jobs they will run. The only thing SLAC can do is to set up 
> different Panda "sites" and impose job requirements at the site 
> level. However, in this case setting up another Panda site 
> will not help because normal production will not use it.
> 
> If the "production" will be done by John himself, I think we 
> can set up another site ANALY_SITE_TEST (despite the prefix 
> ANALY_) and have the site use the dedicated atlaspetaq.

This is very useful information. We should not look at it as "what production wants" but as "how do we get these jobs done". For example, if not using pilots is best, that is what we should do. If setting up another "site" is best, that is what we should do. Who needs to be in the discussion so that we can make an informed decision with all affected parties? My first guess:

	SCCS: Wei + anyone? 
	John 
	Production: Borat K.  
	Panda: ?

> 
> Regards,
> Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
> 
> > From: "Young, Charles C." <[log in to unmask]>
> > Date: Thu, 12 Mar 2009 08:26:27 -0700
> > To: unix-admin <[log in to unmask]>
> > Cc: atlas-sccs-planning-l <[log in to unmask]>, 
> > "Moss, Leonard J." <[log in to unmask]>
> > Subject: RE: [SLAC #163429] Request to use few memfs machines for 
> > ATLAS testing
> > 
> > Hi Wei,
> > 
> > I know John has run the jobs (that require ~4 GB) at CERN already. 
> > There are no fundamental problems. However, it would be useful to 
> > make a quick test here, either interactively or in normal batch. 
> > There is enough memory on the typical interactive node, but it may 
> > impact other users on that node. Hence the suggestion of doing it 
> > in a fenced-off area. Once production starts, there may be 
> > unforeseen problems and it would once again be useful to be able 
> > to go in and debug.
> > 
> > When it comes to production, we will want to go through the normal 
> > channels as much as possible. The only special thing that I am 
> > aware of is the large memory requirement. If that means defining 
> > another "site", I guess we have to do it. Does this mean that each 
> > "site" is expected to be homogeneous with no variations in the 
> > properties of its CPUs? Memory, swap space, etc. I would naively 
> > expect a bit more flexibility. Cheers.