@osschar the "good old" worker nodes do have fewer job slots, but only 20% fewer (32 vs 40) than the generation that sees significantly more failures. I do suspect you're on the right track: these WN disks were never bought with XCaches in mind, so we probably have a range of performance across the various generations. Right away I can see that the "bad" nodes have 2.5 inch hard drives vs 3.5 inch on the others. I'm putting some internal monitoring together to try and understand how hard each generation's disks are working; hopefully I can find something obvious that correlates with these failures.
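For the internal monitoring, a minimal sketch of what I have in mind, assuming a Linux host where `/proc/diskstats` is readable (the device name and sampling interval here are just placeholders):

```python
# Hypothetical per-disk utilization sampler, roughly equivalent to
# iostat's %util column. Assumes Linux /proc/diskstats; field 13
# (index 12 after splitting) is cumulative milliseconds spent doing I/O.
import time


def read_io_ms(device):
    """Return cumulative milliseconds of I/O time for `device`."""
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return int(fields[12])  # ms the device was busy
    raise ValueError(f"device {device!r} not found in /proc/diskstats")


def utilization(device, interval=1.0):
    """Approximate busy percentage of `device` over `interval` seconds."""
    start = read_io_ms(device)
    time.sleep(interval)
    end = read_io_ms(device)
    return 100.0 * (end - start) / (interval * 1000.0)
```

Sampling this per drive on each WN generation and shipping it to our usual dashboards should show whether the 2.5 inch drives are saturating while the 3.5 inch ones are not.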

I've also added your monitoring to our test WNs; it's easier for me to apply it to all 6 WNs reserved for the LHCb test jobs, but if you would prefer a single host I can do that too.

