On 06/16/2017 01:34 PM, Andrew Hanushevsky wrote:
> Hi Yvan,
Hi everyone!
We have the same "full space" problem even though there is enough free space.

The load reports sent to the redirector show this:

aliprod@rd: manager $ grep do_Load cmslog
170617 00:06:45 16327 server.2586:23@[::XXX.22]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=98% 188141MB load=0 mass=78
170617 00:06:45 16345 server.32460:20@[::XXX.27]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=85% 5846385MB load=0 mass=68
170617 00:06:45 25209 server.6851:18@[::XXX.24]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=98% 314084MB load=0 mass=78
170617 00:06:45 16564 server.21467:19@[::XXX.25]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=98% 291060MB load=0 mass=78
170617 00:06:45 29507 server.15118:22@[::XXX.23]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=96% 598157MB load=0 mass=76
170617 00:06:45 16323 server.25599:21@[::XXX.26]:1094 do_Load: cpu=0 net=0 xeq=0 mem=0 pag=0 dsk=98% 404141MB load=0 mass=78
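
For anyone who wants to tabulate these reports, here is a minimal Python sketch that pulls the disk fields out of do_Load records. The field layout is inferred only from the sample lines above, reading the MB figure as free space is an assumption, and parse_do_load is a hypothetical helper, not part of XRootD.

```python
import re

# Pattern for the do_Load records quoted above. The layout is inferred from
# those sample lines only, not from the cmsd source.
DO_LOAD = re.compile(
    r"server\.\S+@\[(?P<host>[^\]]+)\]:\d+\s+do_Load:.*?"
    r"dsk=(?P<dsk>\d+)%\s+(?P<mb>\d+)MB"
)


def parse_do_load(log_text):
    """Return (host, dsk_percent, gib) tuples found in cmsd log text.

    The MB value is assumed to be free space and is converted to GiB.
    """
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    rows = []
    for match in DO_LOAD.finditer(flat):
        rows.append(
            (match["host"], int(match["dsk"]), int(match["mb"]) / 1024)
        )
    return rows
```

Feeding the output of the grep command above into parse_do_load() gives one tuple per server, which makes it easier to compare the reported figures with what the storage nodes show locally.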

and the xrootd space situation is like this:
[Saturday 17.06.17 00:17] adrian@sev : ~  $
xrd_status rd.MYDOMAIN
Xrootd cluster name is : ALICE::ISS::FILE and is running xrootd version v4.5.0
Total space in xrootd cluster : rd.MYDOMAIN
Total space (GiB) :     471919.25
Free space (GiB) :      45949.40

xrootd storage server : storage04.MYDOMAIN:1094
Total space (GiB) :     40334.79
Free space (GiB) :      919.06

xrootd storage server : storage02.MYDOMAIN:1094
Total space (GiB) :     18333.92
Free space (GiB) :      367.12

xrootd storage server : storage05.MYDOMAIN:1094
Total space (GiB) :     40334.79
Free space (GiB) :      851.97

xrootd storage server : storage07.MYDOMAIN:1094
Total space (GiB) :     249087.66
Free space (GiB) :      39693.96

xrootd storage server : storage06.MYDOMAIN:1094
Total space (GiB) :     83485.25
Free space (GiB) :      2365.25

xrootd storage server : storage03.MYDOMAIN:1094
Total space (GiB) :     40342.84
Free space (GiB) :      1752.05
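
The xrd_status figures above can be reduced to a percentage free per server with a few lines of Python. This is only a sketch: the numbers are copied from the output above, and the names SERVERS, free_percent and summarize are made up for illustration.

```python
# (total GiB, free GiB) per server, copied from the xrd_status output above.
SERVERS = {
    "storage04": (40334.79, 919.06),
    "storage02": (18333.92, 367.12),
    "storage05": (40334.79, 851.97),
    "storage07": (249087.66, 39693.96),
    "storage06": (83485.25, 2365.25),
    "storage03": (40342.84, 1752.05),
}


def free_percent(total_gib, free_gib):
    """Free space as a percentage of total space."""
    return 100.0 * free_gib / total_gib


def summarize(servers):
    """Return ({server: percent free}, cluster total GiB, cluster free GiB)."""
    per_server = {
        name: free_percent(total, free)
        for name, (total, free) in servers.items()
    }
    total = sum(t for t, _ in servers.values())
    free = sum(f for _, f in servers.values())
    return per_server, total, free


if __name__ == "__main__":
    per_server, total, free = summarize(SERVERS)
    for name, pct in sorted(per_server.items()):
        print(f"{name}: {pct:.2f}% free")
    print(f"cluster: {free:.2f} of {total:.2f} GiB free")
```

On these numbers storage07 holds roughly 86% of the cluster's free space, while every other server is below 5% free. The totals here are per server; if a server spans several partitions, the free space on any single partition may be much smaller.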

I would like to use all available space until only 32 GB remains free on a partition.

I tried this on the redirector:
# http://xrootd.org/doc/dev45/cms_config.htm#_Toc454223038
cms.space min 64g 32g

and on the servers (even though it is the default):
oss.alloc 0 0 0

but I still got "disk full" messages ...

Are there any other knobs for tweaking space usage?
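
To make the intent of a min / high-watermark pair concrete, here is a toy Python model. It rests on one reading of the cms.space documentation: a server stops being selected for new files once free space drops below min, and becomes selectable again only when free space climbs back to the high watermark. That reading is an assumption, and SpaceGate and parse_size are hypothetical names; this is not the cmsd implementation.

```python
UNITS = {"k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}


def parse_size(text):
    """Convert a size such as '64g' to bytes (binary multiples assumed)."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


class SpaceGate:
    """Toy model of a min / high-watermark pair; NOT the cmsd code."""

    def __init__(self, min_free, hwm_free):
        self.min_free = parse_size(min_free)
        # Modeling choice: a watermark below min would make the gate flap,
        # so it is clamped to min here. How cmsd treats that case is not
        # established by this sketch.
        self.hwm_free = max(parse_size(hwm_free), self.min_free)
        self.eligible = True

    def update(self, free_bytes):
        """Feed one free-space report; return True if writes may go here."""
        if self.eligible and free_bytes < self.min_free:
            self.eligible = False
        elif not self.eligible and free_bytes >= self.hwm_free:
            self.eligible = True
        return self.eligible


if __name__ == "__main__":
    gib = 2**30
    gate = SpaceGate("64g", "32g")  # values from the cms.space line above
    for free in (200, 60, 40, 70):
        print(free, "GiB free ->", gate.update(free * gib))
```

Note that in "min 64g 32g" the second value is lower than the first, so under the reading above the watermark adds nothing; a pair such as 32g followed by a larger watermark would match the stated goal more closely, if that reading is right.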

Thank you!!
Adrian


> 
> If that is the case, then cms.space is set too low and the redirector is 
> selecting servers that have "enough" space, but the "enough" is not that 
> much. You will see in the log a periodic reporting of free space 
> statistics from all your servers. The lines look something like:
> 
> 170616 01:17:28 26648 Node: xxxxxx.slac.stanford.edu load=0; cpu=0 net=0 
> inq=0 mem=0 pag=0 dsk=0 utl=0 shr=[100 73 0]
> 
> Could you collect all of them for your servers and send them to me (you can 
> hide any sensitive info)?
> 
> Andy
> 
> On Fri, 16 Jun 2017, Yvan Calas wrote:
> 
>>> On 15 Jun 2017, at 18:53, Yvan Calas wrote:
>>>
>>> I would like to understand how to correctly set up the parameter 
>>> cms.space in XRootD. We currently observe error messages like this 
>>> one on our servers:
>>>
>>> 170615 18:42:06 285690 XrootdXeq: alisgm76.15597:26@[::xxx.xxx.xx.x] 
>>> pub IPv4 login as alisgm76
>>> 170615 18:42:06 285690 ofs_open: alisgm76.15597:26@[::xxx.xxx.xx.x] 
>>> Unable to create /03/01818/8cb7a7f4-51e9-11e7-a169-ef6caac907fd; no 
>>> space left on device
>>> 170615 18:42:06 285690 XrootdXeq: alisgm76.15597:26@[::xxx.xxx.xx.x] 
>>> disc 0:00:00
>>>
>>> In order to solve this issue, I reduced the recalc parameter to 1 
>>> minute and tried different thresholds for the "min" and "hwm" 
>>> parameters as described in [1]. Currently, we have:
>>>
>>> cms.space linger 0 recalc 1 min 100g 150g
>>>
>>> and the xrootd partition is as follows:
>>>
>>> Filesystem            Size  Used Avail Use% Mounted on
>>> /dev/mapper/datavg-data
>>>                      110T  110T  109G 100% /xrootd
>>>
>>> I tried to change the value of the hwm parameter (below and above 
>>> 100g) but without success, the error messages remain.
>>>
>>> Could you please tell me if there is something wrong with my setting?
>>
>> I would like to add that some of our servers have a lot of free 
>> (available) space, so they should be chosen by the redirector IMHO. Do 
>> you think this is a bug (we are currently running XRootD v4.2.3-3)?
>>
>> ########################################################################
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>
> 


-- 
----------------------------------------------
Adrian Sevcenco, Ph.D.                       |
Institute of Space Science - ISS, Romania    |
adrian.sevcenco at {cern.ch,spacescience.ro} |
----------------------------------------------

