Hi Stephan,
Random failures like this, especially when repeating exactly the same
operation over and over, look like an overload somewhere.
You mention that you are using DPM on the storage side. That means that
xrootd does not access the files directly when they are requested;
instead, all requests should go through the DPM headnode/DPM service and
involve a database lookup.
Have there been any changes to the DPM/MySQL configuration? Do you see any
errors in the DPM/MySQL log files for the same requests that failed
during your stress test? (Debug verbosity was set for the stress
test, wasn't it?)
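
In case it helps with matching failures against the log files, here is a
minimal sketch (not something from your setup) that repeats one client
command and records a UTC timestamp, the return code and stderr for every
failed attempt. The function name run_stress, the host name and the file
path are placeholders I made up for illustration; only the "xrdfs <host>
stat <path>" form is the real client command, and any other command can be
substituted.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_stress(cmd, repeats):
    """Run cmd `repeats` times and return one
    (UTC start time, return code, stderr) tuple per failed run."""
    failures = []
    for _ in range(repeats):
        # Record the start time so the failure can be looked up in server logs.
        started = datetime.now(timezone.utc).isoformat(timespec="seconds")
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((started, result.returncode, result.stderr.strip()))
    return failures


if __name__ == "__main__":
    # Hypothetical redirector and path: replace with your own.
    cmd = ["xrdfs", "redirector.example.org:1094",
           "stat", "/dpm/example.org/home/myvo/file.root"]
    repeats = 200
    failures = run_stress(cmd, repeats)
    for started, code, err in failures:
        print(f"{started} rc={code} {err}")
    print(f"{len(failures)} of {repeats} attempts failed", file=sys.stderr)
```

The timestamps are in UTC, so they may need shifting to the local time
zone used in the DPM and MySQL logs before comparing.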
Cheers,
Marcus
On Wed, 4 Jan 2017, Stephan Zimmer wrote:
> Dear XrootD experts,
>
> I'm writing to you in the hope you may be able to help me understand or at
> least identify the issue we've been seeing in our XrootD configuration. We
> store data on an ATLAS DPM storage element which we access through an xrootd
> redirector authenticated through x509 personal proxies.
>
> We experimented with the system for a while before encouraging our colleagues
> to use remote access through the redirector as preferred mode of access, but
> since doing so, we have been encountering the following error without any
> further specifics more and more frequently.
>
> [ERROR] Server responded with an error: [3005] Server database error:
> Communication error on send
>
> Digging through the Xrootd mail archives I've come across this post:
> https://listserv.slac.stanford.edu/cgi-bin/wa?A2=ind1003&L=XROOTD-L&P=R227&1=XROOTD-L&9=A&I=-3&J=on&d=No+Match;Match;Matches&z=4
>
> which seems to indicate that 3005 is thrown if an operation is not supported.
> Actually, I couldn't find any listing where the different
> error codes that may be raised by XrootD are provided. Does such
> documentation exist and if so, where could I find it?
>
> The biggest challenge in fully debugging this problem is that it occurs
> rather randomly. For instance, I've been running a stress test, where I
> read the same 5 files (which are clearly accessible on xrootd) 200 times
> (specifically I'm using ROOT to add 5 files that are chained together and
> retrieve the total number of entries in the tree). During this stress test,
> I get the above error message in up to 10% of the cases. In this specific
> instance, I call the xrootd client through ROOT (5.34.37) TNetXNGFile::Open
> constructor, but the same problem occurs occasionally when using xrdcp
> (version 4.2.3) and even when using xrdfs mkdir.
>
> Thank you very much for your help in advance,
> Regards,
> -Stephan
>
>
> --
> Dr. Stephan Zimmer
> DPNC, University of Geneva
> 24 quai Ernest-Ansermet
> CH-1211 Genève 4
> mobile: +41766133052
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the XROOTD-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.