Hi Patrick,

In general, a good LVM will provide better dispersal of data (and 
theoretically better performance) than using a partitioning mechanism. When we 
wrote the partitioning code, such LVM's were hard to find and the ones that 
existed were rather expensive (think almost 10 years ago). That said, 
partitioning gives you (for most LVM's) better control of recovery 
granularity. If a partition dies, you only need to recover the files from that 
partition. In many LVM's, when a portion goes you may need to recover a 
whole filesystem -- a rather daunting task for TB filesystems.

Some filesystems (e.g., ZFS) do better than others and include integrated LVM 
support, so they are more resilient. Others are not, and some even wind up 
hurting you when configured on a huge partition, as internal limitations 
either introduce huge allocation units (wasting a lot of space) or 
significantly reduce the number of files you can allocate (i.e., they run 
short on inodes). So, this adds yet another level of complexity.

You are right that multi-TB cache partitions pose problems in Linux, and are 
challenging in terms of good file dispersal to achieve high performance 
(though, in all fairness, Linux is not alone in this). We have toyed with 
the idea of creating multiple subdirectories in the cache partition to 
alleviate those problems but have always put that on the back burner because 
LVM's were coming out that exhibited rather good performance and resiliency.

Generally, I prefer cache partitions of no more than a couple of terabytes 
(hard to get these days). That way I don't lose too much data when one of 
those goes, and a partition that size is not so big as to pose directory 
problems given a reasonably large file size (500MB or more).

I would be interested in knowing how useful people think it would be to 
provide further support for cache partitions by introducing another directory 
level within a partition, or whether people feel comfortable enough with 
today's LVM's to simply go with those.

Andy


----- Original Message ----- 
From: "Patrick McGuigan" <[log in to unmask]>
To: "Wilko Kroeger" <[log in to unmask]>
Cc: <[log in to unmask]>
Sent: Wednesday, November 07, 2007 2:50 PM
Subject: Re: Question about oss.cache directive


> Hi Wilko,
>
> Your answers were very helpful.  I better understand the cache directive, 
> but I am curious if anyone has used largish partitions to create a cache?
>
> Our partitions will be 6.5TB (XFS) and I am a little dubious about using 
> such a large partition to support a cache.  In the scenario that you 
> outline, all of the files would reside in the base directory of the cache 
> (/xrd/cache01 or /xrd/cache02).  I am concerned that the directory would 
> have a large number of files, which might result in slower access to files 
> because of the way that Linux deals with large directories.
>
>
> Another alternative is to use LVM to create one large partition, but I 
> will need to look at the load balancing issues when some servers have 
> twice as much storage as others.
>
> Any and all advice or experience is appreciated,
>
> Patrick
>
>
>
>
>
> Wilko Kroeger wrote:
>>
>> Hello Patrick
>>
>> Fabrizio already answered most of the questions. I just have a few 
>> comments.
>>
>> If you have more than one partition that an xrootd server should serve, 
>> you should use the cache directive.
>> The cache works by placing a file in a cache directory and creating a 
>> link between this file and the proper file name. For example:
>> if the file name is /xrd/test/d1/file1 and you use the cache directive
>> oss.cache /xrd*
>> the file would be put (let's pick cache /xrd2) into
>>    /xrd2/%xrd%test%d1%file1
>> and a link is created:
>>    ls -l /xrd/test/d1/file1 ->  /xrd2/%xrd%test%d1%file1
>>
>> As you can see there are no directories in the cache. The file name in 
>> the cache is the proper file name with all '/' replaced by '%'.
>>
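>> For illustration, once several files have been cached this way, the cache 
>> itself is just one flat directory of '%'-encoded names (the file names 
>> below are hypothetical, assuming the oss.cache /xrd* directive above):
>>    ls /xrd2
>>    %xrd%test%d1%file1
>>    %xrd%test%d1%file2
>>    %xrd%test%d2%file1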
>>
>> As xrootd will export /xrd, you have to create a /xrd directory. I guess 
>> this will not be in the '/' root partition but in one of your data 
>> partitions (/xrd1, /xrd2), and therefore you will need a link:
>>  /xrd -> /xrd1
>>
>> However, in this case, doing an 'ls /xrd' would list all files in /xrd1, 
>> which could be quite a long listing depending on how many files you have. 
>> Therefore, you might want to have a link like
>>  /xrd -> /xrd1/xrd
>> In this case 'ls /xrd' would not list the files in the /xrd1 cache.
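>>
>> As a minimal sketch (assuming /xrd1 is the data partition that should 
>> back the exported path), that second layout could be set up with:
>>    mkdir /xrd1/xrd
>>    ln -s /xrd1/xrd /xrd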
>>
>> Another possibility would be to make the cache directories a little bit 
>> more explicit. Mount your two partitions as:
>> /xrd
>> /xrd/cache1
>> and create the directory
>> /xrd/cache0
>> and then use
>> oss.cache /xrd/cache*
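>>
>> Just as a sketch (the device names here are made up), that setup could 
>> look like:
>>    mount /dev/sdb1 /xrd
>>    mkdir /xrd/cache0 /xrd/cache1
>>    mount /dev/sdc1 /xrd/cache1
>> with 'oss.cache /xrd/cache*' in the configuration file.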
>>
>> I hope these comments helped a little bit.
>>
>>   Cheers,
>>       Wilko
>>
>>
>>
>> On Wed, 7 Nov 2007, Patrick McGuigan wrote:
>>
>>> Hi,
>>>
>>> I am setting up an xrootd cluster for the first time and I have a 
>>> question about the oss.cache directive.
>>>
>>> Some of my data servers have two partitions (and some have one) that I 
>>> want to use for storage.  Is it true that the oss.cache directive MUST 
>>> be used to put two partitions into service?  How is load balancing 
>>> (based on space) managed on caches versus partitions?  Are there any 
>>> performance penalties to using the cache directive?
>>>
>>> Finally, when a directory is created within a cache, does the directory 
>>> get created on both partitions?
>>>
>>>
>>>
>>> If the partition on a one-mount server is /xrd1 and the partitions on a 
>>> dual-mount server are /xrd1 and /xrd2, would the following snippet from 
>>> the config file be appropriate:
>>>
>>>
>>> #
>>> #
>>> olb.path rw /xrd
>>> #
>>> oss.cache public /xrd*
>>> #
>>> xrootd.fslib /opt/xrootd/lib/libXrdOfs.so
>>> xrootd.export /xrd
>>>
>>>
>>>
>>> I am expecting this to create a global namespace rooted at /xrd that is 
>>> writable and would use both partitions of a dual-mount data server.
>>>
>>>
>>>
>>> Thanks for any information,
>>>
>>> Patrick
>>>
>