Almost forgot to forward this. Thumper is what we bought.

--
Wei Yang  |  [log in to unmask]  |  650-926-3338(O)


From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Ernst, Michael
Sent: Wednesday, May 02, 2007 2:17 PM
To: [log in to unmask]
Subject: [Usatlas-grid-l] Sun Thumper / ZFS features vs. other disk storage solutions

All,

Since we had some discussion about disk storage solutions during today's meeting, I'd like to add some key points regarding features, fault tolerance, and data integrity on Thumper running Solaris 10 (11/06) with ZFS that one should consider when evaluating products:

 

1. ZFS is a 128-bit filesystem that incorporates high-performance volume-manager functionality, up to and including double-parity RAID-Z2 (ZFS's equivalent of RAID-6); a pool built from several RAID-Z2 sets dynamically stripes across them (effectively RAID-60) for maximum throughput.

2. ZFS offers copy-on-write (COW) and end-to-end checksumming. COW means that live data is never overwritten. A ZFS storage pool is a tree of blocks (a Merkle tree): 256-bit checksums are stored in the parent block pointer rather than with the data block itself, and every block in the tree carries the checksums of all its children, so the entire pool is self-validating.

The uberblock (the root of the tree) is kept in multiple copies, each with its own checksum. ZFS uses this mechanism to detect and correct silent data corruption.

In practice this means that every block read is verified against its checksum, and only correct data is delivered to the application.
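
To illustrate the idea (a minimal Python sketch, not ZFS code): the checksum of every block lives in its parent's block pointer, so a read can be validated before the data reaches the application. The names (BlockPointer, read_verified, the dict standing in for the disk) are purely illustrative.

    import hashlib

    def checksum(data):
        # ZFS uses 256-bit checksums (e.g. SHA-256); SHA-256 stands in here.
        return hashlib.sha256(data).digest()

    class BlockPointer:
        """Parent-held pointer: the child's address plus its expected checksum."""
        def __init__(self, address, expected):
            self.address = address
            self.expected = expected

    def read_verified(disk, ptr):
        """Read a block and verify it against the checksum stored in the parent pointer."""
        data = disk[ptr.address]
        if checksum(data) != ptr.expected:
            # silent corruption detected; with redundancy ZFS would fetch a good copy and repair
            raise IOError("checksum mismatch at block %r" % ptr.address)
        return data   # only verified data is handed to the application

    # toy usage: the "disk" is just a dict of address -> bytes
    disk = {0: b"payload"}
    root = BlockPointer(0, checksum(disk[0]))
    assert read_verified(disk, root) == b"payload"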

3. An object-based, transactional I/O stack allows grouped or "batched" I/O (I/O is scheduled and aggregated at platter speed). Commits are all-or-nothing, so the on-disk data is always consistent and no journal is needed.
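
As a similarly rough sketch (again not ZFS internals): copy-on-write plus one atomic update of the root pointer is what makes a commit all-or-nothing, so a crash can only ever leave one of two consistent trees on disk and there is nothing for a journal to replay. The names and the dict-based "disk" are assumptions for illustration.

    def commit(disk, uberblock, new_blocks, new_root_address):
        """Copy-on-write commit: write new block copies to free space, then flip the root."""
        for address, data in new_blocks.items():
            assert address not in disk        # live data is never overwritten
            disk[address] = data              # step 1: write the new copies
        uberblock["root"] = new_root_address  # step 2: single atomic pointer switch

    # toy usage
    disk = {0: b"old tree"}
    uberblock = {"root": 0}
    commit(disk, uberblock, {1: b"new tree"}, 1)
    assert uberblock["root"] == 1 and 0 in disk   # old tree untouched until space is reclaimed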

4. Intelligent prefetch streaming detects linear access patterns, forward or backward, and attempts to sequentialize the I/O.
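
A toy illustration of what detecting a linear access pattern amounts to (not the actual ZFS prefetch code; the function names are made up): if recent reads advance by a constant stride, read ahead along that stride so the I/O becomes sequential.

    def detect_stride(recent_offsets):
        """Return the constant stride of the recent accesses, or None if they are not linear."""
        if len(recent_offsets) < 3:
            return None
        strides = [b - a for a, b in zip(recent_offsets, recent_offsets[1:])]
        # one repeated non-zero stride (positive = forward, negative = backward) means linear access
        return strides[0] if len(set(strides)) == 1 and strides[0] != 0 else None

    def prefetch_candidates(recent_offsets, depth=4):
        """Offsets worth reading ahead of the application along the detected stride."""
        stride = detect_stride(recent_offsets)
        if stride is None:
            return []
        last = recent_offsets[-1]
        return [last + stride * i for i in range(1, depth + 1)]

    print(prefetch_candidates([100, 108, 116, 124]))   # forward:  [132, 140, 148, 156]
    print(prefetch_candidates([64, 56, 48]))           # backward: [40, 32, 24, 16]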

 

--

   Michael

 


From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Popescu, Razvan
Sent: Wednesday, May 02, 2007 2:24 PM
To: [log in to unmask]
Subject: [Usatlas-grid-l] notes F&O 5/2

 

Data Management:

 

- # of transfers to/from BNL is increasing (includes requests for streaming tests, validation, etc.)

- DQ2 v0.2 will remain in production for another month. The testing of v0.3 started slowly and might not reach production-level testing before ~5/20.

- ToA automatic updates are done on all sites outside the US. Because of our need to rotate storage elements we have implemented manual updates, which carry the risk of de-synchronization between sites. What should we do?

- Could we run the automatic updates of the ToA, followed (automatically) by a "patch application" that superimposes the required local configuration? It would be a matter of writing a smart text-processing step that applies modifications from a (local) configuration file to the updated ToA (see the sketch after this list). When the ToA is updated externally, the new version is automatically downloaded and re-patched to include local needs. When we need to rotate storage, we modify the local configuration source and kick off the "patch application" phase, plus all the restarts and cleanups that we'd do anyway.

- We'll discuss options again next week.

- Question from last meeting: do different storage elements require separate FTS channels? Yes; in the current implementation, a new FTS channel is required if the hostname is different. However, BNL's FTS service can be provisioned appropriately to cope with multiple channels.
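
To make the "patch application" idea above concrete, here is a minimal sketch of the kind of text processing meant: take the freshly downloaded ToA and superimpose local overrides kept in a separate file. The file names and the simple "key = value" override format are assumptions for illustration, not the real ToA layout.

    # hypothetical inputs: "ToA.new" is the freshly downloaded ToA,
    # "local_overrides.conf" holds "key = value" lines to superimpose on it.
    def load_overrides(path):
        overrides = {}
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    overrides[key.strip()] = value.strip()
        return overrides

    def apply_overrides(toa_path, overrides, out_path):
        """Rewrite any 'key = ...' line that has a local override; copy everything else as-is."""
        with open(toa_path) as src, open(out_path, "w") as dst:
            for line in src:
                key = line.split("=", 1)[0].strip() if "=" in line else None
                dst.write("%s = %s\n" % (key, overrides[key]) if key in overrides else line)

    # run automatically after every external ToA update, and re-run whenever the local
    # configuration changes (e.g. a storage-element rotation), followed by the usual restarts.
    apply_overrides("ToA.new", load_overrides("local_overrides.conf"), "ToA")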

 

Production:

 

- All fine, with a few issues being worked on (see email traffic).

- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!WE NEED MORE CPU!!!!!!!!!!!!!!!!!!!!!!!!!!!

- All possible help, from all sites, is very welcome to ramp production up to the current increased level of demand. Please make every effort to have every system up and running.

 

Training session:

 

- Will have one in Indiana (6/22?).

- People are swamped with work right now and would need at least 3 weeks for preparation. More input will be necessary (from Michael, Fred, Kaushik) to determine whether another session before IU's is feasible. To be followed up.

 

 

Site updates:

 

AGL:  All systems go. DQ2 server rebooted after kernel updates.

 

MW: Work in progress at IU; troubleshooting PANDA/CONDOR problems. UC: work on building Pool on SLC4; several alternatives are being considered.

 

NE: All systems go. Ramping up prod.

 

UTA: Issues with the network switch; resolution expected by this afternoon. DPCC is picking up the production demand.

 

OU: All systems go.

 

SLAC: All good. Work on new storage deployment. Will use RAID6 due to concerns w/ SATA reliability and reconstruction duration.

 

 

R