Sorry I was not able to attend. Have some questions about this parity business.

> 2. Tier-2 Hardware
> 
>     One storage machine has the OS. Should get two of the machines up
>     and running, allow Len to do some testing with the third
>     machine. There is some question about how many parity disks we
>     want, one or two. There is some discussion at HEPiX about how long

When one disk fails, how many TB of data are affected?

>     it takes to reconstruct an array after losing a disk, we would be
>     at risk during the reconstruction. Len will try to 

You mean a second disk failing before everything on the array has been reconstructed? Hope the components are not that unreliable. What is the advertised MTBF for a drive? 
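To put a rough number on that risk, here is a back-of-envelope sketch in Python. It assumes independent drives with a constant failure rate (an exponential model). The MTBF, the number of surviving drives, and the rebuild time are placeholder values, not vendor figures or measurements from our machines.

```python
import math

def p_second_failure(n_remaining: int, mtbf_hours: float, rebuild_hours: float) -> float:
    """Probability that at least one surviving drive fails during the rebuild.

    Assumes independent drives, each with a constant failure rate of 1/MTBF.
    """
    return 1.0 - math.exp(-n_remaining * rebuild_hours / mtbf_hours)

# Placeholder inputs (assumptions for illustration only).
p = p_second_failure(n_remaining=11, mtbf_hours=1_000_000, rebuild_hours=24)
print(f"P(second failure during rebuild) ~ {p:.2e}")
```

Drives from the same batch or enclosure tend to fail in a correlated way, so an independence-based estimate like this is probably optimistic. Len's measured rebuild time would replace the placeholder.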

>     measure how long
>     it takes and try to get information from other labs. With double
>     parity it is more reliable but there is a write-time and space
>     penalty. An element of the discussion would be the type of data on
>     it, if it was really just a cache of data stored elsewhere or if it
>     was the primary storage. Even if we believe it is just a cache an

Tier 2 is not supposed to be primary storage for anything. True, it may be the only storage for production before it is transferred to Tier 1 for archival. However, that is recoverable by regenerating, so we actually have an infinite number of "backups". On the scale of things, regenerating is cheap.

>     worry would be how long it would take to reimport all the data
>     again from BNL (probably around a week). Could setup different

I assume we would not do this if reconstructing takes less than a week. So this is an upper bound, right? 
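The comparison I have in mind can be sketched as follows. The data volume, link speed, and efficiency factor are hypothetical placeholders, not the actual figures for our link to BNL, and the rebuild time is whatever Len ends up measuring.

```python
def reimport_days(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Days needed to re-transfer data_tb terabytes over a link_gbps link.

    efficiency is an assumed fraction of the nominal bandwidth actually achieved.
    """
    bits = data_tb * 8e12                       # 1 TB = 8e12 bits (decimal TB)
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86400.0

def rebuild_is_faster(rebuild_days: float, data_tb: float, link_gbps: float) -> bool:
    """True if rebuilding the array beats re-importing everything."""
    return rebuild_days < reimport_days(data_tb, link_gbps)

# Placeholder inputs (assumptions for illustration only).
print(f"re-import: {reimport_days(50, 1.0):.1f} days")
print("rebuild faster:", rebuild_is_faster(rebuild_days=1.0, data_tb=50, link_gbps=1.0))
```

If the measured rebuild time comes out well under the re-import estimate, the week really is just the worst case.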

>     areas for production with double parity.

Production could be more tolerant. One, data is swept to Tier 1 for archival, so we should not lose more than a day or two's worth. Two, production data is, almost by definition, something with a fairly long lead time, and regenerating it would be acceptable. Active analysis data disappearing is likely to have a greater impact on users. Can we use our tape system to back these up -- assuming it is much faster getting things back from our tapes than from across the country?

>