ATLAS SCCS Planning 30Aug2006
-----------------------------
9am, SCCS Conf Rm A; to call in: +1 510 665 5437, press 1, 3935#

Present: Charlie, Wei, Steffen, Stephen, Richard, Gary, Chuck, Su Dong

Agenda:

1. DQ2 Status/Web Proxy

There is going to be a test of data transfers Tier-0 <-> Tier-1 <-> Tier-2.
The part that will go to SLAC will only be 70GB. There will be a transfer
from BNL to SLAC on the 11th/12th and back on the 13th/14th. We will not
have the full network available because power will not be available; that
shouldn't be an issue for this test. Prior to the test we need to make sure
that there is enough disk space and that transfers are currently working.
Wei should be around the week of the tests. It might be useful if the
transition to the production account for DQ2 could happen before the test.

2. ATLAS Oracle Server

Over the time we've been discussing this, the importance of Oracle for the
SLAC tests has gone down. This is partly because the product that was
developed to allow scaling is based on MySQL. We may need Oracle in the
future, but it doesn't look likely at the moment.

3. Slots for ATLAS production jobs and other batch-related stuff

Raised the number of jobs to 30; have had 47 jobs running at the same time.
No real problems beyond the load on the NFS server being higher than before:
about 25% load for 62MB/s. Once we can get the jobs running on machines with
more local scratch space they will not need to use the NFS server.

The fairshare group is set up (atlas, with one member) but it doesn't have
any allocation yet. Of 1000 shares, 10% goes to the unwashed masses, 10% to
LCD, 30% to one GLAST function and 50% to another GLAST function. The GLAST
allocation should reduce soon, so we should allocate 10% to ATLAS. We want
separate controls over the allocation to local vs. grid use.

Could we direct ATLAS grid jobs to the general queues and other grid jobs to
the osgq? Yes. Some of the ATLAS jobs take too long for most of the general
queues.
They do specify a RUNTIME but not a CPUTIME. We would like to move the ATLAS
OSG jobs to a larger number of machines but with a lower priority. They
should also be on machines with large local disk space, and we will need to
have the grid jobs specify how much space they need. Can we tell user
analysis jobs from real production jobs, since both use the PANDA system?
Not sure; this might need to be monitored in the future.

4. Validation of ATLAS jobs on RHEL

Currently ATLAS only runs a validation on each kit. This is done by
production before running at each site, so SLAC is officially validated.
There is someone working with DavidQ to try to do an event-by-event
validation, currently looking at event generators. We should see if SLAC can
help out here; it may save us some pain in the future.

5. Further report on Boston meeting

It was very noticeable when people presented at the meeting that no one had
any significant storage, even though we were directed towards spending 2/3
of the money on storage, as were many past successful proposals. The current
requirement for a Tier-2 is to have 1TB of disk. We need to look at this
carefully in the future. Expect that in practice spending will be less than
this 2/3 until real data starts to arrive; to reach the 2/3 ratio when that
happens we may need to start the build-up somewhat earlier. Due to the
specific SLAC setup we may not deviate too much from the plan, as SLAC
currently has a significant amount of CPU but very little ATLAS disk. There
is a recognition that the Tier-2s should be managed much like our Western
Tier-2 will be managed, with strong local representation, but there is no
model for how to achieve it.

6. AOB

- SLAC ATLAS web page

Consensus that each Tier-2 will have a similar-looking Tier-2 web page. They
should follow the same template (which hasn't been developed yet). Trying to
get the content out just now for comments; we can later fit it into the
correct framework. We should let others know that we want to follow the
"standard" methods.
- CERN Users

May need to be a special case, as we do need some people from CERN to log on
to SLAC.

Action Items:
-------------
060830 Stephen  Talk more to DavidQ about validation.

060830 Richard  Talk to Gregory about "getting" disk.

060823 Stephen  Find out what current validation processes exist.
       060830   Done.

060823 Wei      Talk to Neal about raising the osgq limit.
       060830   Done.

060816 Wei      Set up ATLAS/SLAC web page.
       060823   Wei circulated a note to try to bring back comments for
                next week.
       060830   First draft up.

060816 Charlie  Talk to SLUO about adding institutions.
       060830   Will take time to converge but will be done.

060816 Neal     Set up atlas priority group for LSF.
       060823   Not done yet.
       060830   Group set up, no priority given yet.

060816 Chuck    Check with Bob about web server approval need.
       060823   To be done.
       060830   To be done.

060809 Stephen  Ask what dq2user needs to do in MySQL.
       060816   No good answer. dq2user from offsite can only SELECT from
                localreplicas; from onsite it can SELECT, UPDATE, DELETE
                and INSERT to either localreplicas or
                queued_transfers_SLAC. We'll see if that works or not.
                Without onsite privileges production stopped.
       060823   Sounds like things are working again, but no concrete info.
       060830   Mark this as done.

060412 Systems  Provide Oracle service for ATLAS Trigger testing (RT 46089).
       060419   No ticket yet, so nothing done.
       060426   Now have ticket 46089.
       060503   No news.
       060524   Steffen has provided configuration information. Now in
                Chuck's hands.
       060628   Randy will ask Chuck about status.
       060726   First on list for V240 but not sure when it will happen.
                Will put a T3a on it.
       060802   John checking for rack space.
       060809   Still needs allocated rack space.
       060816   Has rack and power, waiting for network.
       060823   Waiting for switch.
       060830   No longer required. "Done."
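For reference, the dq2user privilege split recorded under the 060809 action
item could be expressed as MySQL GRANT statements along the following lines.
This is only a sketch: the database name ("dq2") and the host patterns are
assumptions, not the actual SLAC configuration.

```sql
-- Sketch only: database name "dq2" and host patterns are assumptions.
-- Offsite clients: read-only access to localreplicas.
GRANT SELECT ON dq2.localreplicas TO 'dq2user'@'%';

-- Onsite clients: full row-level access to both tables.
GRANT SELECT, INSERT, UPDATE, DELETE ON dq2.localreplicas
    TO 'dq2user'@'%.slac.stanford.edu';
GRANT SELECT, INSERT, UPDATE, DELETE ON dq2.queued_transfers_SLAC
    TO 'dq2user'@'%.slac.stanford.edu';
```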
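The fairshare arithmetic from agenda item 3 can be sketched as below. The
group names are placeholders, and the reallocation shown (the larger GLAST
function giving up 10% to ATLAS) is the suggestion from the meeting, not a
decision.

```python
# Share arithmetic for the LSF fairshare discussion in agenda item 3.
# Group names are placeholders; only the percentages come from the meeting.
TOTAL_SHARES = 1000

current = {
    "general": 100,    # 10% to the unwashed masses
    "lcd": 100,        # 10% to LCD
    "glast_fn1": 300,  # 30% to one GLAST function
    "glast_fn2": 500,  # 50% to another GLAST function
    "atlas": 0,        # group exists but has no allocation yet
}
assert sum(current.values()) == TOTAL_SHARES

# Suggested change: GLAST gives up 10% (100 shares) to ATLAS.
proposed = dict(current)
proposed["glast_fn2"] -= 100
proposed["atlas"] += 100
assert sum(proposed.values()) == TOTAL_SHARES

print(proposed["atlas"])  # -> 100, i.e. 10% of the shares for ATLAS
```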