OK, I spoke too soon; this problem seems to be back.

I've added data server nodes, now 13 of them, each with 3 drives, and 
only the first drive is getting populated with any data, except on 1 
of them.  After loading the cluster with lots of files, the first drive 
on every node is at 100% usage, with the rest of the drives at 3-5% 
usage.  The one outlier's 2nd drive is at 20% usage.  All these drives 
are 600-700 GB in total size.
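
For anyone wanting to reproduce the per-drive numbers above, here is a 
minimal sketch that reports filesystem usage for a list of mount points. 
The mount point names in the usage note are placeholders, not the exact 
layout of these nodes, and the helper names are hypothetical.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    total, used, _free = shutil.disk_usage(path)
    if total == 0:
        # Some pseudo-filesystems report a zero size; avoid dividing by zero.
        return 0.0
    return 100.0 * used / total


def report(mount_points):
    """Map each mount point to its usage percentage, rounded to one decimal."""
    return {mp: round(usage_percent(mp), 1) for mp in mount_points}
```

Calling report(["/export/data", "/export/data2"]) on a data server would 
return one percentage per drive, which makes the "first drive full, 
others nearly empty" pattern easy to compare across nodes.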

Digging deeper on each data server node: on all of them (except for that 
outlier), oss.localroot (/export/data/xrd/ns/) is not getting populated 
with symlinks into the 'oss.space public' locations 
(/export/data*/xrd/data/); instead, the data files sit directly under 
oss.localroot.  Further, there *are* lots of files under the oss.space 
public locations 
(e.g. /export/data2/xrd/data/public/1B/271B3B4FDB000000176%) but they 
are all empty!

On the one outlier there are symlinks to the 2nd drive, but that only 
accounts for ~200 of the ~1000 entries under oss.localroot there.
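
The check described above can be scripted. This is a minimal sketch, 
not an xrootd tool: it walks a namespace directory, counts symlinks, 
non-empty regular files and empty regular files, and tallies which 
space directory each symlink resolves into. The function name and the 
category labels are made up for illustration.

```python
import os
from collections import Counter


def summarize(localroot, space_roots):
    """Classify entries under `localroot` and count symlink targets per space root."""
    kinds = Counter()
    per_space = Counter()
    # Resolve each space root once; the trailing separator prevents
    # /export/data from matching /export/data2.
    resolved = [(root, os.path.join(os.path.realpath(root), "")) for root in space_roots]
    for dirpath, _dirnames, filenames in os.walk(localroot):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                kinds["symlink"] += 1
                target = os.path.realpath(path)
                for root, prefix in resolved:
                    if target.startswith(prefix):
                        per_space[root] += 1
                        break
            elif os.path.getsize(path) == 0:
                kinds["empty_file"] += 1
            else:
                kinds["data_file"] += 1
    return kinds, per_space


if __name__ == "__main__":
    # Paths taken from the message above; adjust the list of space roots per node.
    kinds, per_space = summarize("/export/data/xrd/ns", ["/export/data2/xrd/data"])
    print(dict(kinds))
    print(dict(per_space))
```

On a healthy node one would expect mostly symlinks under oss.localroot, 
spread across the space roots; a high data_file count matches the 
symptom described here. Pointing the same function at an oss.space 
location instead of oss.localroot gives the count of empty files there.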

The config files for each of the data server nodes are identical.

I tried duplicating the oss.space public lines in the configs, as was 
previously suggested, but that didn't seem to change anything.

I'll try some other tests, like using xrootd 3.0.5 or different configs, 
but these tests take a while to complete, so I'm hoping this rings some 
bells here as to what I might be missing.

Thanks,
ksb

On 01/30/12 12:47, Keith Beattie wrote:
> to follow up here...
>
> I've tried it again, with the same configs, and now it works. To the
> best of my knowledge nothing is different, except the absence of this
> problem - both drives on all the clients are receiving data.
>
> I'd rather know what fixed it, but I'll take it and move on.
>
> Thanks for your help,
> ksb
>
> On 12/20/11 5:08 PM, Keith Beattie wrote:
>> Here you go, attached. Both from xrootd and cmsd.
>>
>> Thanks,
>> ksb
>>
>> On 12/16/11 11:55 PM, Wilko Kroeger wrote:
>>>
>>> Hello Keith
>>>
>>> I am using the release 3.1.0 at slac with cache systems (using oss.space
>>> public ...) and files are placed in all cache systems.
>>> Could you maybe post the startup message from the xrdlog?
>>>
>>> Cheers,
>>> Wilko
>>>
>>>
>>>
>>> On Fri, 16 Dec 2011, Keith Beattie wrote:
>>>
>>>> Hi Andy,
>>>>
>>>> I'm using 3.1.0, built from source. I tried doubling the oss.cache
>>>> lines and the servers failed to start. When using oss.space rather
>>>> than oss.cache, doubling those lines isn't fatal, but regardless I get
>>>> the same behavior - data only to the first entry.
>>>>
>>>> Thanks,
>>>> ksb
>>>>
>>>> On 12/16/11 6:06 PM, Andrew Hanushevsky wrote:
>>>>> Hi Keith,
>>>>>
>>>>> You may not be missing anything, depending on what version you are
>>>>> using. There was a bug that exhibited exactly this behaviour. The
>>>>> bypass (other than upgrading to at least 3.0.2) is to list the
>>>>> oss.cache entries twice for each path.
>>>>>
>>>>> Andy
>>>>>
>>>>> On Fri, 16 Dec 2011, Keith Beattie wrote:
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> This seems like I'm missing something simple but can't seem to find
>>>>>> it.
>>>>>>
>>>>>> On my data server nodes, only the first 'oss.cache' entry is getting
>>>>>> data; the following disks don't seem to ever get written to. Below is
>>>>>> an example of what the config looks like on the data nodes (not santa,
>>>>>> where the manager is running). What am I missing?
>>>>>>
>>>>>> ----
>>>>>> all.role server
>>>>>> all.manager santa 3121
>>>>>>
>>>>>> all.export /
>>>>>> oss.localroot /data/ns
>>>>>> oss.cache public /data/xrddata
>>>>>> oss.cache public /data1/xrddata
>>>>>> ----
>>>>>>
>>>>>> i.e. /data/xrddata gets filled, /data1/xrddata does not.
>>>>>>
>>>>>> Thanks,
>>>>>> ksb

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1