Hi Patrick,
In general, a good LVM will provide better dispersal of data (and
theoretically better performance) than using a partitioning mechanism. When we
wrote the partitioning code, such LVMs were hard to find and the ones that
existed were rather expensive (think almost 10 years ago). That said,
partitioning gives you (for most LVMs) better control of recovery
granularity. If a partition dies, you need only recover the files from that
partition. In many LVMs, when a portion goes you may need to recover the
whole filesystem -- a rather daunting task for TB-scale filesystems.
Some filesystems (e.g., ZFS) do better than others and include integrated LVM
support, so they are more resilient. Others are not, and some even wind up
hurting you when configured on a huge partition, as internal limitations
either introduce huge allocation units (wasting a lot of space) or
significantly reduce the number of files you can allocate (i.e., they run
short on inodes). So, this adds yet another level of complexity.
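The inode shortfall is easy to check up front. A minimal sketch (run against
'/' here only so it works anywhere; in practice you would point it at each
candidate cache mount, e.g. /xrd1 /xrd2):

```shell
# Report inode usage per filesystem (Linux).  An "IUse%" near 100% means
# new files can no longer be created even if free blocks remain.
# '/' is a stand-in path so the example runs anywhere; substitute your
# cache mount points.
df -i /
```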
You are right that multi-TB cache partitions pose problems in Linux and are
challenging in terms of good file dispersal to achieve high performance
(though, in all fairness, Linux is not alone in this). We have toyed with
the idea of creating multiple subdirectories in the cache partition to
alleviate those problems, but have always put that on the back burner because
LVMs were coming out that exhibited rather good performance and resiliency.
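For what it's worth, one shape that extra directory level could take -- purely
a sketch with made-up paths and a made-up bucket count, not anything xrootd
implements today -- is hashing each mangled cache name into a small, fixed set
of subdirectories:

```shell
#!/bin/sh
# Hypothetical sketch: fan cache files out over 16 subdirectories so no
# single directory holds every file.  Paths and the bucket count are
# illustrative only; xrootd does not do this.
cache=/tmp/xrd2
name='%xrd%test%d1%file1'    # a mangled cache file name

# Cheap, portable checksum of the name -> bucket 00..15.
sum=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
bucket=$(printf '%02d' $((sum % 16)))

mkdir -p "$cache/$bucket"
: > "$cache/$bucket/$name"
echo "$cache/$bucket/$name"
```

Lookup would stay cheap: the bucket is recomputed from the name rather than
found by scanning directories.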
Generally, I prefer cache partitions of no more than a couple of terabytes
(hard to get these days). That way I don't lose too much data when one of
them goes, and it's not so big as to pose directory problems given a
reasonably large file size (500MB or more).
I would be interested in knowing how useful people think it would be to
provide further support for cache partitions by introducing another directory
level within a partition, or whether people feel comfortable with today's
LVMs and would simply go with that.
Andy
----- Original Message -----
From: "Patrick McGuigan" <[log in to unmask]>
To: "Wilko Kroeger" <[log in to unmask]>
Cc: <[log in to unmask]>
Sent: Wednesday, November 07, 2007 2:50 PM
Subject: Re: Question about oss.cache directive
> Hi Wilko,
>
> Your answers were very helpful. I better understand the cache directive,
> but I am curious if anyone has used largish partitions to create a cache?
>
> Our partitions will be 6.5TB (XFS) and I am a little dubious about using
> such a large partition to support a cache. In the scenario that you
> outline, all of the files would reside in the base directory of the cache
> directory (/xrd/cache01 or /xrd/cache02). I am concerned that the
> directory would have a large number of files, which might result in slower
> access to files because of the way that Linux deals with large directories.
>
>
> Another alternative is to use LVM to create one large partition, but I
> will need to look at the load balancing issues when some servers have
> twice as much storage as others.
>
> Any and all advice or experience is appreciated,
>
> Patrick
>
>
>
>
>
> Wilko Kroeger wrote:
>>
>> Hello Patrick
>>
>> Fabrizio already answered most of the questions. I just have a few
>> comments.
>>
>> If you have more than one partition that an xrootd server should serve,
>> you should use the cache directive.
>> The cache works by placing a file in a cache directory and creating
>> a link between that file and the proper file name. For example, if the
>> file name is /xrd/test/d1/file1 and you use the cache directive
>> oss.cache /xrd*
>> the file would be put (let's pick cache xrd2) into
>> /xrd2/%xrd%test%d1%file1
>> and a link is created:
>>> ls -l /xrd/test/d1/file1 -> /xrd2/%xrd%test%d1%file1
>>
>> As you can see there are no directories in the cache. The file name in
>> the cache is the proper file name with all '/' replaced by '%'.
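>> (For illustration, that mangling is just a character swap, easy to
>> reproduce in shell -- this mimics what xrootd does internally:)

```shell
# Reproduce the cache-name mangling described above: every '/' in the
# logical file name becomes '%'.  Illustration only; xrootd performs this
# translation itself.
lfn=/xrd/test/d1/file1
pfn=$(printf '%s' "$lfn" | tr / %)
echo "$pfn"    # %xrd%test%d1%file1
```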
>>
>>
>> As xrootd will export /xrd, you have to create a /xrd directory. I guess
>> this will not be in the '/' root partition but in one of your data
>> partitions (/xrd1 /xrd2), and therefore you will need a link:
>> /xrd -> /xrd1
>>
>> However, in this case, doing an 'ls /xrd' would list all files in /xrd1,
>> which could be quite large depending on how many files you have. Therefore,
>> you might want to have a link like
>> /xrd -> /xrd1/xrd
>> In this case 'ls /xrd' would not list the files in the /xrd1 cache.
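>> (A sandboxed sketch of the two link layouts, with a temporary directory
>> standing in for the real partitions, showing why the second keeps cache
>> files out of 'ls /xrd':)

```shell
# Demonstrate the two export-link layouts.  $base stands in for '/';
# names mirror the thread, but nothing here is xrootd code.
base=$(mktemp -d)
mkdir -p "$base/xrd1"
: > "$base/xrd1/%xrd%test%d1%file1"   # one mangled cache file

# Layout 1: /xrd -> /xrd1 -- listing the export shows every cache file.
ln -s "$base/xrd1" "$base/xrd"
ls "$base/xrd"                         # shows %xrd%test%d1%file1

# Layout 2: /xrd -> /xrd1/xrd -- cache files stay out of the listing.
rm "$base/xrd"
mkdir -p "$base/xrd1/xrd"
ln -s "$base/xrd1/xrd" "$base/xrd"
ls "$base/xrd"                         # empty
```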
>>
>> Another possibility would be to make the cache directories a little bit
>> more explicit. Mount your two partitions as:
>> /xrd
>> /xrd/cache1
>> and create the directory
>> /xrd/cache0
>> and then use
>> oss.cache /xrd/cache*
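>> (Sketched below with mkdir standing in for the actual mounts, so the
>> layout can be tried anywhere:)

```shell
# The more explicit layout, with plain mkdir in a temp directory standing
# in for real mounts.  In production, /xrd and /xrd/cache1 would each be
# their own partition.
root=$(mktemp -d)
mkdir -p "$root/xrd"           # first partition, mounted at /xrd
mkdir -p "$root/xrd/cache1"    # second partition, mounted at /xrd/cache1
mkdir -p "$root/xrd/cache0"    # ordinary directory on the /xrd partition
ls "$root/xrd"                 # cache0  cache1
# the config then selects both:  oss.cache /xrd/cache*
```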
>>
>> I hope these comments helped a little bit.
>>
>> Cheers,
>> Wilko
>>
>>
>>
>> On Wed, 7 Nov 2007, Patrick McGuigan wrote:
>>
>>> Hi,
>>>
>>> I am setting up an xrootd cluster for the first time and I have a
>>> question about the oss.cache directive.
>>>
>>> Some of my data servers have two partitions (and some have one) that I
>>> want to use for storage. Is it true that the oss.cache directive MUST
>>> be used to put two partitions into service? How is load balancing
>>> (based on space) managed on caches versus partitions? Are there any
>>> performance penalties to using the cache directive?
>>>
>>> Finally, when a directory is created within a cache, does the directory
>>> get created on both partitions?
>>>
>>>
>>>
>>> If the partition on a single-mount server is /xrd1 and the partitions on a
>>> dual-mount server are /xrd1 and /xrd2, would the following snippet from
>>> the config file be appropriate:
>>>
>>>
>>> #
>>> #
>>> olb.path rw /xrd
>>> #
>>> oss.cache public /xrd*
>>> #
>>> xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
>>> xrootd.export /xrd
>>>
>>>
>>>
>>> I am expecting this to create a global namespace rooted at /xrd that is
>>> writable and would use both partitions of the dual-mount data servers.
>>>
>>>
>>>
>>> Thanks for any information,
>>>
>>> Patrick
>>>
>