ATLAS SCCS Planning 12Dec2006
-----------------------------

  9am, SCCS Conf Rm A, to call in +1 510 665 5437, press 1, 3935#

Present: Richard, Charlie, Wei, JohnB, Randy, BillL, Booker, Gordon

Apologies: Len

Agenda:

1. DQ2 Status/Web Proxy

    There was supposed to be an upgrade (to 0.2.12) but we have not heard
    anything.

    Talked to BobC. Currently the web interface can only be seen from
    inside SLAC. Outside access would be needed to use dq2_get. BobC
    would like to get an iptables rule onto the machine. Patrick (of US
    ATLAS) said it is possible to upgrade apache, as requested by BobC,
    if needed, but it is very tricky. Wei will use iptables in the
    meantime.

    Have switched to using Wei's DN. It has only been running for a few
    hours, but Stephen probably shouldn't need to renew his proxy anymore.

2. Job Priorities

    New guideline from ATLAS, if we want to implement it, of a 50:50
    split between production and users. This is actually already the
    case as 100 shares are for production and 100 for other users.
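
    As a quick check of the arithmetic above, here is a minimal sketch
    in Python, assuming the scheduler divides capacity in proportion to
    share counts. The group names are illustrative labels and not the
    actual queue or group configuration; only the two counts of 100 come
    from the discussion.

```python
def share_fractions(shares):
    """Return each group's fraction of the total number of shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {group: count / total for group, count in shares.items()}


# Share counts as minuted; the group names are hypothetical labels.
current_shares = {"production": 100, "other_users": 100}
fractions = share_fractions(current_shares)

for group, fraction in fractions.items():
    print(f"{group}: {fraction:.0%}")
```

    With equal share counts each group gets half, which matches the
    50:50 guideline.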

    Only one DN is mapped to the production group. Perhaps need to
    figure out how other users can be mapped to the ATLAS priority
    group.

    BobC thought that if a user was in multiple VOs their jobs should
    be mapped to different UNIX accounts depending on which VO each job
    was for. However, we don't get enough information to determine
    which VO a job is for.
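
    The mapping problem can be illustrated with a small sketch in
    Python. Everything in it is hypothetical: the DN, the VO names, the
    UNIX account names and the lookup table are placeholders, and this
    is not how the real mapping service works. It only shows why a
    per-VO mapping cannot be resolved when the job arrives without VO
    information.

```python
# Hypothetical table: (certificate DN, VO) -> UNIX account.
ACCOUNT_MAP = {
    ("/DC=org/DC=example/CN=Some User", "atlas"): "atlasusr",
    ("/DC=org/DC=example/CN=Some User", "othervo"): "othervousr",
}


def map_account(dn, vo=None):
    """Return the UNIX account for a job, or None if it cannot be decided.

    If the VO is known, look up (dn, vo) directly. If it is not, the
    mapping is only decidable when the DN maps to a single account.
    """
    if vo is not None:
        return ACCOUNT_MAP.get((dn, vo))
    candidates = {acct for (d, _), acct in ACCOUNT_MAP.items() if d == dn}
    if len(candidates) == 1:
        return candidates.pop()
    return None  # unknown DN, or ambiguous because the user is in several VOs


dn = "/DC=org/DC=example/CN=Some User"
print(map_account(dn, "atlas"))  # VO known: a single account
print(map_account(dn))           # VO missing: cannot decide
```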

    For the moment leave it like it is (that user grid jobs end up in
    AllUsers and therefore low priority) and revisit it if (when?) it
    becomes an issue.

3. Power Outage

    The week-long power outage only covers a small subset of services
    so doesn't affect the whole lab. The outage was announced by Chuck
    on the 28th November.

    It is affecting slightly more than expected NFS-wise; it is not
    completely clear why.

4. Tier-2 Hardware

    Very close to getting the orders out but there are so many
    things going on that it probably won't happen. One example is the
    need to rebid the water cooled racks (which has lost 2-4 months of
    time). No point buying things if they can't be cooled.

    To get the power we need to run the ATLAS hardware we need to
    replace the nomas and toris. Those replacements need to be Rackable
    as those machines will come out of those racks. Trying to finish up
    the details on the Rackable order and also put together the order
    to Sun. This will probably end up being the first week in
    January. Don't believe this puts the ATLAS Tier-2 money in jeopardy
    but the continuing resolution might put all unspent money at
    risk. It seems that House and Senate leaders have decided to have a
    Continuing Resolution for the rest of the Fiscal Year.

5. AOB

    - UTA Tier-2 Meeting

    Most important deliverable requested by Jim Shank was an evaluation
    of what the Tier-2s can deliver through 2012. It should be based
    on a one year slippage from everything they planned. Wei will be
    responding to this based on a spreadsheet from Richard. Need to
    define what counts as a machine.

    Every site gave a site report on operations issues and their
    purchasing plans. Andy also gave a talk on xrootd. The sites seemed
    quite interested in this as dCache doesn't seem to function
    anywhere except for BNL. Need to figure out what the next steps are
    in making xrootd work with the ATLAS distribution system. Not clear
    that ATLAS will use xrootd. Some people are beginning to see a need
    for such a beast but it is taking a while to penetrate fully. If we
    don't get xrootd working we will need to use something like dCache.

    It would be very difficult to start with dCache and then move to
    xrootd. Would need to move the data to two different places. It is
    a completely different storage system. They both use file servers
    that can run Linux but apart from that they are very different. If
    there was an SRM interface to xrootd that would make it look just
    like any other Storage Element to ATLAS. Believe SRM does the
    transfer between SEs. FTS is used to manage the transfers. There is
    some preference to use gsiftp as that seems more stable than
    SRM. Using gsiftp would mean putting a server on each xrootd
    server. Could a public machine redirect a connection to another
    machine? Only need that part of SRM just now. xrootd clearly does
    that.

    Need to have a good plan for a scalable Storage Element at SLAC for
    the next Tier-2 meeting.

Action Items:
-------------

061213 Booker	Talk to AndyH (and others) about moving SRM interface forward

061129 Stephen	Email Andy about "ingest" rates
        061213 Done.

061129 Wei	Email other Tier-2s about xrootd
        061213 Done.

061129 Stephen	Request more release area AFS space
        061213 Done.

061122 Wei	Attempt to implement security recommendations
        061213 Putting in iptables.

061115 Wei	Add monitoring disk space for DQ2 to Ranger.
        061122 Not done yet. Also need to monitor the GUMS server.
        061213 Done. Also GUMS server and other ATLAS space.

061108 Richard	Discuss with SLAC Security longterm approach to ATLAS VO
        061115 No information.
        061213 Nothing happened yet.

061101 Richard/Bill Convene advisory group regarding CPU/disk split.
        061108 Have emailed Gordon. To be done.
        061115 Gordon and Bill will meet today to discuss it.
        061213 Sent note on the 6th. Some feedback, general agreement. Done.

061025 Stephen	Check web server approval status
        061101 Have opened up that hornet's nest.
        061108 No news for the last week. Need to keep the discussion going.
        061115 Teresa trying to get web team together for a meeting
               next week.
        061122 Everyone except Stephen went to the meeting. They want
 	      to know if we are up to date with security patches. BobC
 	      would like us to use an IP range for protection in MySQL
 	      instead of the domain name. Also not sure if the web
 	      server interface needs read/write or just read
 	      access. Could set the privilege of the web server to
 	      make security tighter, also by putting it on a different
 	      machine. Believe the Site Services python scripts update
 	      the TiersOfAtlasCache.py file. There was a general
 	      unhappiness about how security was handled. Folk are going
 	      to take this message to the Tier-2 meeting next
 	      month. The verdict was allowed under protest. Recognised
 	      this is an important commitment but need to work with authors
 	      to improve system.
        061213 Consider this done.

061018 Wei	Test gridftp with xrootd federation
        061025 Probably not very soon, but should be on the agenda.
        061101 No change.
        061108 Will remove from Agenda and leave as action item.
        061115 Nothing done yet.
        061122 Nothing happened yet. Discussion with Wilko about
 	      setting up an xrootd machine for testing with ATLAS data
 	      transfers. With SRM need to have all xrootd machines
 	      exposed to the Internet, so not a great solution.
        061213 Not going to work. So this isn't going to happen.

061004 Randy	Find out about xrootd for ATLAS plans
        061018 no news
        061025 No information yet. Andy probably knows something.
        061101 No info.
        061108 Need to get Andy involved in this discussion. Might be
 	      useful for Andy to go to the December meeting in
 	      Arlington.
        061115 Will attempt to get Andy on the phone next week.
        061122 Didn't actually happen. Next time?
        061213 Had meeting with Andy and he presented at Tier-2, this
 	      item is Done.