ATLAS SCCS Planning 12Dec2006
-----------------------------
9am, SCCS Conf Rm A, to call in +1 510 665 5437, press 1, 3935#
Present: Richard, Charlie, Wei, JohnB, Randy, BillL, Booker, Gordon
Apologies: Len
Agenda:
1. DQ2 Status/Web Proxy
This was supposed to be upgraded (to 0.2.12) but we have not heard
anything.
Talked to BobC. Currently only inside SLAC can see the web
interface. Outside access would be needed to use dq2_get. BobC
would like to get an iptables rule onto the machine. Patrick (of US
ATLAS) said it is possible to upgrade apache, as BobC requested, if
needed, but it is very tricky. Wei will use iptables in the
meantime.
Have switched to using Wei's DN. It has only run for a few hours so
far, but Stephen probably shouldn't need to renew his proxy anymore.
2. Job Priorities
New guideline from ATLAS, if we want to implement it, of a 50:50
split between production and users. This is actually already the
case, as 100 shares are for production and 100 for other users.
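As a quick check of the arithmetic, here is a minimal sketch of how share
counts turn into a split. The group names and the helper function are
illustrative only and are not taken from the batch system configuration.

```python
def share_fractions(shares):
    """Convert per-group share counts into fractions of the total."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total shares must be positive")
    return {group: count / total for group, count in shares.items()}


# Share counts as quoted in the minutes; the group labels are hypothetical.
shares = {"production": 100, "other_users": 100}
fractions = share_fractions(shares)

for group, fraction in fractions.items():
    print(f"{group}: {fraction:.0%}")
```

With equal share counts each group gets half, which matches the 50:50
guideline.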
Only one DN is mapped to the production group. Perhaps need to
figure out how other users can be mapped to the ATLAS priority
group.
BobC thought that if a user was in multiple VOs, their jobs should be
mapped to different UNIX accounts depending on which VO each job was
for. However, we don't get enough information to determine which VO
a job is for.
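The difficulty can be sketched as a lookup keyed on (DN, VO). This is a
hypothetical illustration, not how GUMS or the site mapping is actually
implemented; the DN, VO names and account names below are invented.

```python
def choose_account(dn, vo, table):
    """Pick a UNIX account for a job.

    table maps (dn, vo) -> account. vo is None when the job does not
    tell us which VO it was submitted for.
    """
    candidates = {v: acct for (d, v), acct in table.items() if d == dn}
    if vo is not None:
        return candidates.get(vo)
    if len(candidates) == 1:
        # Only one VO for this DN, so there is nothing to disambiguate.
        return next(iter(candidates.values()))
    return None  # ambiguous: the DN alone cannot select an account


# Hypothetical mapping for a user who belongs to two VOs.
dn = "/DC=example/CN=Some User"
table = {
    (dn, "atlas"): "atlasusr",
    (dn, "othervo"): "otherusr",
}

print(choose_account(dn, "atlas", table))
print(choose_account(dn, None, table))
```

When the VO is known the lookup is unambiguous; without it, a multi-VO
user cannot be assigned an account from the DN alone.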
For the moment leave it as it is (i.e. user grid jobs end up in
AllUsers and are therefore low priority) and revisit it if (when?) it
becomes an issue.
3. Power Outage
The week-long power outage only covers a small subset of services so
doesn't affect the whole lab. Outage was announced by Chuck on the
28th November.
It is affecting slightly more than expected NFS-wise; not completely
clear why.
4. Tier-2 Hardware
Very close to getting the orders out but there are so many
things going on that it probably won't happen. One example is the
need to rebid the water-cooled racks (which has lost 2-4 months of
time). No point buying things if they can't be cooled.
To get the power we need to run the ATLAS hardware we need to
replace the nomas and toris. Those replacements need to be Rackable
as those machines will come out of those racks. Trying to finish up
the details on the Rackable order and also put together the order
to Sun. This will probably end up being the first week in
January. Don't believe this puts the ATLAS Tier-2 money in jeopardy
but the continuing resolution might put all unspent money at
risk. It seems that House and Senate leaders have decided to have a
Continuing Resolution for the rest of the Financial Year.
5. AOB
- UTA Tier-2 Meeting
Most important deliverable requested by Jim Shank was an evaluation
of what the Tier-2s can deliver through 2012. It should be based
on a one year slippage from everything they planned. Wei will be
responding to this based on a spreadsheet from Richard. Need to
define what counts as a machine.
Every site gave a site report on operations issues and their
purchasing plans. Andy also gave a talk on xrootd. The sites seemed
quite interested in this as dCache doesn't seem to function
anywhere except for BNL. Need to figure out the next steps in
making xrootd work with the ATLAS distribution system. Not clear
that ATLAS will use xrootd. Some people are beginning to see a need
for such a beast but it is taking a while to penetrate fully. If we
don't get xrootd working we will need to use something like dCache.
It would be very difficult to start with dCache and then move to
xrootd. Would need to move the data to two different places. It is
a completely different storage system. They both use file servers
that can run Linux but apart from that they are very different. If
there was an SRM interface to xrootd that would make it look just
like any other Storage Element to ATLAS. Believe SRM does the
transfer between SEs. FTS is used to manage the transfers. There is
some preference to use gsiftp as that seems more stable than
SRM. Using gsiftp would mean putting a server on each xrootd
server. Could a public machine redirect a connection to another
machine? Only need that part of SRM just now. xrootd clearly does
that.
Need to have a good plan for a scalable Storage Element at SLAC for
the next Tier-2 meeting.
Action Items:
-------------
061213 Booker Talk to AndyH (and others) about moving SRM interface forward
061129 Stephen Email Andy about "ingest" rates
061213 Done.
061129 Wei Email other Tier-2s about xrootd
061213 Done.
061129 Stephen Request more release area AFS space
061213 Done.
061122 Wei Attempt to implement security recommendations
061213 Putting in iptables.
061115 Wei Add monitoring disk space for DQ2 to Ranger.
061122 Not done yet. Also need to monitor the GUMS server.
061213 Done. Also GUMS server and other ATLAS space.
061108 Richard Discuss with SLAC Security longterm approach to ATLAS VO
061115 No information.
061213 Nothing happened yet.
061101 Richard/Bill Convene advisory group regarding CPU/disk split.
061108 Have emailed Gordon. To be done.
061115 Gordon and Bill will meet today to discuss it.
061213 Sent note on the 6th. Some feedback, general agreement. Done.
061025 Stephen Check web server approval status
061101 Have opened up that hornet's nest.
061108 No news for the last week. Need to keep the discussion going.
061115 Teresa trying to get web team together for a meeting
next week.
061122 Everyone at the meeting except Stephen went. They want
to know if we are up to date with security patches. BobC
would like us to use an IP range for protection in MySQL
instead of the domain name. Also not sure if the web
server interface needs read/write or just read
access. Could set the privilege of the web server to
make security tighter, also by putting it on a different
machine. Believe the Site Services python scripts update
the TiersOfAtlasCache.py file. There was general
unhappiness with how security was handled. Folk are going
to take this message to the Tier-2 meeting next
month. The verdict was "allowed under protest". Recognised
this is an important commitment but need to work with authors
to improve system.
061213 Consider this done.
061018 Wei Test gridftp with xrootd federation
061025 Probably not very soon, but should be on the agenda.
061101 No change.
061108 Will remove from Agenda and leave as action item.
061115 Nothing done yet.
061122 Nothing happened yet. Discussion with Wilko about
setting up an xrootd machine for testing with ATLAS data
transfers. With SRM need to have all xrootd machines
exposed to the Internet, so not a great solution.
061213 Not going to work. So this isn't going to happen.
061004 Randy Find out about xrootd for ATLAS plans
061018 No news.
061025 No information yet. Andy probably knows something.
061101 No info.
061108 Need to get Andy involved in this discussion. Might be
useful for Andy to go to the December meeting in
Arlington.
061115 Will attempt to get Andy on the phone next week.
061122 Didn't actually happen. Next time?
061213 Had meeting with Andy and he presented at Tier-2, this
item is Done.