URL: <http://savannah.cern.ch/bugs/?80880>
Summary: Default delay servers value is incorrect for supervisors
Project: XROOTD
Submitted by: None
Submitted on: 2011-04-12 16:09
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Email: [log in to unmask]
Open/Closed: Open
Discussion Lock: Any
Fixed by commit(s):
_______________________________________________________
Details:
In our STAR deployment, we found an oddity where clients would eventually
be delayed ad aeternum by Scalla. After some tracing (with Andy's help), we
found that the supervisors would drop dataservers unexpectedly.
Supervisor A, expecting 48 nodes (B64 / 75%, i.e. 75% of 64), would find
fewer dataservers because some had rolled over to another supervisor B,
and would conclude that something was wrong. Periodically restarting a few
dataservers exacerbated the issue, and since we naturally run more
supervisors than strictly needed, we were bound to reach this situation:
restarted dataservers would roll over to the additional supervisor B
(present for resilience and redundancy), and supervisor A would "miss" its
quorum.
The workaround was to set
if named xrdtestsuper
cms.delay servers 1
else
cms.delay servers 75% service 15 startup 65 suspend 15
fi
which forces the default delay for the supervisor.
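
To illustrate the arithmetic behind the failure, here is a minimal Python
sketch. It is not XRootD code: the function names are hypothetical, and it
assumes (based on the "48 nodes in B64 / 75%" observation above) that a
percentage threshold is taken against a cell size of 64 and rounded down.

```python
# Hypothetical model of the quorum check described in this report.
# Assumption: "servers 75%" is evaluated against a cell size of 64,
# which yields the 48 nodes supervisor A was expecting.

def required_servers(percent: int, cell_size: int = 64) -> int:
    """Number of dataservers a supervisor waits for (rounding down is assumed)."""
    return percent * cell_size // 100


def has_quorum(subscribed: int, percent: int, cell_size: int = 64) -> bool:
    """True if enough dataservers are subscribed to stop delaying clients."""
    return subscribed >= required_servers(percent, cell_size)


print(required_servers(75))   # 48
print(has_quorum(48, 75))     # True: supervisor A sees all expected nodes
print(has_quorum(45, 75))     # False: 3 nodes rolled over to supervisor B
```

Under this model, a supervisor with only a handful of dataservers rolled
over to another supervisor falls below the threshold, while an absolute
threshold of 1 (as in the workaround above) is met as long as any single
dataserver is subscribed.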
Andy requested that a bug report be filed so this misfeature can be fixed in
a later revision of Xrootd.
Done.
Jerome
_______________________________________________________
Reply to this item at:
<http://savannah.cern.ch/bugs/?80880>
_______________________________________________
Message sent via/by LCG Savannah
http://savannah.cern.ch/