

ATLAS SCCS Planning 09Aug2006

  SCCS Conf Rm A, to call in +1 510 665 5437, press 1, 3935#

Present: Stephen, Su Dong, Randy, Shirley, Chuck, Ariel, Booker,
 	 Richard, Wei


1. DQ2 Status/Web Proxy

    We now have a user account to run under (atldq2). A request has been
    sent for it to have access to atl-dq2; it will be added to the
    netgroup that gives access to both atl-prod machines.

    Last week, either just before or just after the meeting, access to
    the web server was limited to onsite only; no problems have been
    seen since.

    Chuck opened a ticket about the broad access to that machine for
    MySQL. BobC forwarded it to CERN and then to the ATLAS folk. The
    response was to try to get the information out of the Google
    cache. The mechanism to determine which machines have MySQL
    installed will not work with this machine as it is installed by
    pacman, not via RPM.

    The MySQL user dq2user should only have access to the data tables
    it needs. It should not need to add users or change permissions. We
    should ask the ATLAS folk whether it really needs access to
    configure new databases etc.: update and create records, create or
    drop entire tables. If someone gets hold of the password they could
    create tables which contain malware or other things you don't
    want. Expect
    that all that is needed is update, add and alter records in the
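
    As a rough illustration of the least-privilege idea discussed above,
    the sketch below flags any privileges beyond record-level access and
    builds a restricted GRANT statement. The privilege set (including
    SELECT and DELETE), the host and the database name are assumptions
    for illustration only; what dq2user actually needs is still the open
    action item.

```python
# Minimal sketch, not the actual DQ2 configuration.
# Assumption: record-level work needs only these four MySQL privileges.
RECORD_LEVEL = frozenset({"SELECT", "INSERT", "UPDATE", "DELETE"})


def excess_privileges(granted):
    """Return the granted privileges that go beyond record-level access."""
    normalized = {p.strip().upper() for p in granted}
    return sorted(normalized - RECORD_LEVEL)


def grant_statement(user, host, database):
    """Build a GRANT statement limited to record-level privileges.

    The user, host and database values are supplied by the caller; the
    names used in the example below are hypothetical.
    """
    privs = ", ".join(sorted(RECORD_LEVEL))
    return f"GRANT {privs} ON `{database}`.* TO '{user}'@'{host}';"


if __name__ == "__main__":
    # Hypothetical current grants for the account.
    print(excess_privileges(["select", "CREATE", "drop", "Grant option"]))
    # "dq2_example" is a placeholder database name.
    print(grant_statement("dq2user", "localhost", "dq2_example"))
```

    Anything reported by excess_privileges (for example CREATE, DROP or
    GRANT OPTION) would be a candidate to question with the ATLAS folk.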

    The 250GB disk space was full, so Wei and Randy talked. They
    thought there was another server on order... so they doubled the
    size. There is a script that cleans up space, which Stephen
    currently runs manually. Will run it shortly to see how much space
    it frees up (121GB).

    US ATLAS says you need about 50GB per CPU. Recently signed an MoU
    to start the process to get the money for the Tier-2 site. The
    current plan says to spend twice as much on storage as on CPU. Will
    not get the ATLAS money until 15th September. There is a storage
    order going out shortly for BaBar (one of two each year). As BaBar
    will not need this immediately, it could be borrowed by ATLAS until
    BaBar needs it and ATLAS' order has been placed.

    Would be useful to know how much space needs to be available for
    users. It would be useful to see the site opened up to other US
    ATLAS users. Need to be doing both user and production jobs. There
    will be an advisory board to help settle any issues between these
    two needs. Hope to be able to deliver some resources prior to the
    October Physics Jamboree. Need to determine soon exactly what is
    needed to be useful by then.
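
    The 50GB-per-CPU rule of thumb and the 2:1 storage-to-CPU spending
    plan mentioned above can be turned into quick sizing arithmetic. The
    sketch below is only that: the CPU counts and budget figures passed
    in are hypothetical, since the notes do not give them.

```python
# Back-of-envelope sizing sketch; input numbers are illustrative only.
GB_PER_CPU = 50  # rule of thumb quoted from US ATLAS in the notes


def storage_needed_gb(cpus, gb_per_cpu=GB_PER_CPU):
    """Disk space (GB) implied by the per-CPU rule of thumb."""
    return cpus * gb_per_cpu


def cpus_supported(storage_gb, gb_per_cpu=GB_PER_CPU):
    """Whole number of CPUs a given amount of disk (GB) can serve."""
    return storage_gb // gb_per_cpu


def split_budget(total, storage_to_cpu_ratio=2):
    """Split a budget so storage gets `ratio` times the CPU share.

    Returns (storage_share, cpu_share).
    """
    cpu_share = total / (storage_to_cpu_ratio + 1)
    return total - cpu_share, cpu_share


if __name__ == "__main__":
    print(storage_needed_gb(100))  # hypothetical 100 CPUs
    print(cpus_supported(500))     # the doubled 250GB area
    print(split_budget(300))       # hypothetical budget of 300 units
```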

    Need to set up a web page to describe how outside ATLAS users can
    register at SLAC. There is an ATLAS group set up for accounts. Need
    a way to verify that new user requests are valid. There is a
    database of ATLAS users at CERN but it doesn't tell us which users
    should have SLAC Tier-2 access. Perhaps have a list of US groups
    which should have access and determine which institution a user
    comes from. Should have a single web point of entry for ATLAS
    users, including who to complain to. Whose responsibility is that?
    Richard's. SCCS will then need to hand off certain problems to
    others. Ariel can help provide feedback and actual help.

    Wei will be "project" lead for getting required infrastructure created
    as needed for the October Workshop.

    Recent lesson learned: real ATLAS people don't use the grid. It is
    used for ATLAS simulation production. Folk doing work don't need
    that level of resources or complication.

2. Trigger Farm Status

    Steffen is happy that all the physical preparations are being made
    on the correct timescale. Richard thought this wasn't going to be
    possible but now John and Boris have pulled off some miracles to
    keep everyone happy.

    Some discussion about bringing in an HP ProCurve switch. This is
    what ATLAS will actually be using. Gary and Charley would be
    interested in knowing more about this.

3. ATLAS Oracle Server

    Last week we were looking for rack space. Since then some things
    have become higher priority; hopefully will get back to it soon.

4. Slots for ATLAS Production jobs and other batch related stuff

    For the priority LSF group should be no problem, Neal back next

    Wei tried sending some of the failing jobs (the pre-12 series
    releases with old Job Transforms) to a couple of slow machines that
    do have Internet access. They didn't work well though as memory was

    For database updates there is a web page that describes what
    version needs to be used.

    There will be a discussion with US ATLAS, which can perhaps help us
    understand some of the issues with batch access to the Internet.

    Need to understand if the SLAC Grid resources should be used by
    anyone or just ATLAS. Also need to understand if we should offer
    Tier-2 service to folk from the East Coast or even abroad.

5. AOB

    DQ2 Subscriptions:

    How does DQ2 subscribe to data? There is a command that can be run to
    do this. Production jobs can actually pull in data also if they
    need it.

    Memory Use:

    Also still have the question about how much memory is needed. Very
    important to know if we need 2GB per job or just one. The person
    Stephen would talk to to get a definitive answer is on holiday just
    now and back next week. Can this wait till next week? There is an
    order going out soon (as mentioned above) so can lend ATLAS
    10TB. All other Tier-2s are getting 1GB per core of memory. Will
    ask at the Workshop next week also.
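
    To show why the 1GB-versus-2GB question matters, the sketch below
    counts how many jobs a node can run at once when both cores and
    memory limit it. The node size used in the example (4 cores, 4GB,
    i.e. 1GB per core) is a hypothetical configuration, not a SLAC
    specification.

```python
# Illustrative slot-count sketch; node figures are assumptions.
def job_slots(cores, node_memory_gb, memory_per_job_gb):
    """Number of concurrent jobs, limited by cores and by memory."""
    by_memory = int(node_memory_gb // memory_per_job_gb)
    return min(cores, by_memory)


if __name__ == "__main__":
    # Hypothetical node provisioned at 1GB per core.
    print(job_slots(cores=4, node_memory_gb=4, memory_per_job_gb=1))
    print(job_slots(cores=4, node_memory_gb=4, memory_per_job_gb=2))
```

    On such a node, jobs needing 2GB each would leave half the cores
    idle, which is why the answer affects what gets ordered.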

    Replica Database:

    The current one should work with the latest jobs, it was installed

    Meeting Room Time:

    Will try to extend to 1.5 hours.

    ATLAS Environment:

    Will talk about this tomorrow at the SLAC ATLAS TDAQ meeting.

Action Items:

060809 Richard	Come back with person responsible for web site

060809 Stephen	Ask what dq2user needs to do in MySQL

060802 Stephen	Email Su Dong about Tier2 Workshop
        060809 Done.

060802 Stephen	Find out about failing ATLAS jobs at SLAC
        060809 Wei asked whether there was anything useful; there were 9
               completely successful jobs (but those were tests). We
               should find out if the intermediate files are kept and
               are useful.

060726 Randy	Talk to Richard about DQ2 Workshop
        060802 Wei is well along in planning to go, but BNL has some
 	      trouble giving access. Normal process takes 90 days but
 	      hope to be able to do something faster by the 10th 
        060809 Has site access approval. Done.

060726 Stephen	Find out about maximum memory and local storage per job
        060802 No news yet.
        060809 No news yet.

060412 Systems  Provide Oracle service for ATLAS Trigger testing (RT 
        060419 No ticket yet, so nothing done.
        060426 Now have ticket 46089.
        060503 No news.
        060524 Steffen has provided configuration information. Now in 
        060628 Randy will ask Chuck about status.
        060726 First on list for V240 but not sure when it will
               happen. Will put a T3a on it.
        060802 John checking for rack space.
        060809 Still needs allocated rack space.

060224 Chuck	Will check on web server request for DQ2 machine
        060301 Waiting for web server request information from Stephen.
        060308 Haven't checked yet; haven't received Stephen's request yet.
        060315 Still not sent Chuck information.
        060405 No update.
        060412 No update.
        060419 No update.
        060426 Resubmitted request as it was lost before.
        060503 Not heard anything for about two weeks.
        060524 Stephen needs to update ticket, moving from yakut to DQ2 
        060628 Request updated now running on atl-dq2, waiting for scan.
        060726 Meeting on Monday about this.
        060802 Requesting new user to run services, blocked offsite web
               access, need to explore MySQL security.
        060809 New user available, more ideas for MySQL security.

060224 Richard	Discuss ATLAS trigger machines with others in SCCS
        060301 Only limited response; John W's was resigned
               acceptance... need to work on an actual deployment plan
               as there are real issues to be solved.
        060308 John aware and in plans as much as anything is. New
 	      engineer will take over.
        060315 No update.
        060405 No update.
        060412 No update.
        060419 No update.
        060426 No update.
        060503 No update.
        060524 RT 45823. Engineer looking at power availability. On track 
        060628 Understand schedule, Randy will make sure John is aware.
        060726 Need to nail down when power will be available. Steffen
               thinks he can make it happen with existing equipment.
        060802 Looks like this will fit in SCCS. Can reuse rack,
               switches and fibres.
        060809 Everything looking good for this now.