Hi Matevz,

On Sat, 25 Jun 2011, Matevz Tadel wrote:

> OK, understood. That's also what I was hit by ... I had the default 'cms.delay
> servers' value (which turns out to be 80%) and four servers ... one went down
> ... and so the whole thing stopped.

Yes, when you have fewer than, say, 16 or so servers, the better choice is to
specify an absolute number of servers rather than a percentage.

> I don't have 'cms.sched maxload' set ... the default is 100, right? And another
> thing, runq percentage -- this pertains to system load average, the first number
> reported by the executable started via cms.perf (being 100 * LoadAvg15 / N_cores
> in XrdOlbMonPerf, it seems)?

Yes, the default is 100. I haven't looked at the MonPerf program in a while.
That was developed by a coalition of partners. Most people just key off the cpu
and io numbers, as these are more relevant.

Also, you can avoid the bad side-effects of servers with highly variable load by
nicing down the cmsd (say -15 or -20, same as xntpd). The idea is that you want
the cmsd to be responsive regardless of load. It hardly uses any resources, so a
low nice value won't impact anything.

Andy
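
For reference, the arithmetic discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the exact comparison the cmsd performs internally is an assumption, not taken from its source. It shows why an 80% threshold with four servers tolerates no failures (3 of 4 is 75%), and how the runq percentage quoted above would be computed.

```python
def enough_servers(up: int, total: int, required_pct: float) -> bool:
    """Assumed check: are at least required_pct percent of the servers up?"""
    return up * 100 >= required_pct * total


def runq_percent(load_avg_15: float, n_cores: int) -> float:
    """Run-queue percentage as quoted in the thread: 100 * LoadAvg15 / N_cores."""
    return 100.0 * load_avg_15 / n_cores


if __name__ == "__main__":
    # Four servers, 80% required: losing one leaves 75%, below the threshold.
    print(enough_servers(3, 4, 80))   # False
    print(enough_servers(4, 4, 80))   # True
    # A 15-minute load average of 2.0 on an 8-core machine.
    print(runq_percent(2.0, 8))       # 25.0
```

With a small cluster, a fixed count (for example, requiring 3 of 4 servers) avoids this all-or-nothing behaviour, which is the point of the advice above.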