Ok, from the output, here's the error:
```
230301 06:28:12 12393 1fa32e3f.565:[log in to unmask] ofs_open: 0-600 fn=/user/ligo/test_access/access_ligo
230301 06:28:12 12393 acc_Audit: 1fa32e3f.565:[log in to unmask] grant https 1fa32e3f.0@[::ffff:129.93.244.204] read /user/ligo/test_access/access_ligo
[2023-03-01 06:28:12.623677 +0000][Error  ][AsyncSock         ] [[log in to unmask]:1094.0] Unable to initiate the connection: [ERROR] Socket error: cannot assign requested address
[2023-03-01 06:28:13.148513 +0000][Info   ][AsyncSock         ] [[log in to unmask]:1094.0] TLS hand-shake done.
230301 06:28:13 12393  ofs_Stall: Stall 3: File access_ligo is being staged; estimated time to completion 3 seconds for /user/ligo/test_access/access_ligo
230301 06:28:13 12393 1fa32e3f.565:[log in to unmask] Xrootd_Protocol: stalling client for 3 sec
230301 06:28:13 12393 1fa32e3f.565:[log in to unmask] ofs_close: use=0 fn=dummy
```
It looks like:
- the HTTP handler code doesn't take kindly to being stalled here: instead of waiting, it tries to close the file. Because the file handle is not open, the close fails with the message "read does not refer to an open file" (obviously an incorrect message but close 'nuff).  @ccaffy - fixing this is probably in your court.  Note this occurs in the GET path; look at the request in Fabio's attached file starting with `230301 06:28:07 12725 http_Protocol`.
- Unclear if XrdCl is doing the right thing here.  I'm guessing it's just bouncing around haphazardly, trying to find a data server without a socket error?  @simonmichal, thoughts?  Should it fail earlier / harder?
- For some reason, the OFS layer is able to handle the XrdCl error well for a while (maybe because the file is cached?), but eventually chokes and starts stalling.  Note it is almost exactly 20 minutes between the first error and when things start to stall.  Perhaps during those 20 minutes the PFC is somehow swallowing the error, then later decides to propagate it to OFS, and only then does OFS start erroring?  @osschar - thoughts?
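
To double-check the ~20 minute gap against the full log, something like the following rough sketch could measure the time between the first socket error and the first stall. It assumes the two timestamp layouts seen in the excerpt above (XrdCl's `[YYYY-MM-DD hh:mm:ss.ffffff +zzzz]` and the server's `YYMMDD hh:mm:ss`), and that both are in the same timezone. The function names are mine, not anything from XRootD.

```python
import re
from datetime import datetime

# XrdCl client lines, e.g. "[2023-03-01 06:28:12.623677 +0000][Error  ]..."
XRDCL_TS = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ [+-]\d{4}\]")
# Server lines, e.g. "230301 06:28:13 12393  ofs_Stall: ..."
XROOTD_TS = re.compile(r"^(\d{6} \d{2}:\d{2}:\d{2}) ")


def line_time(line):
    """Return the timestamp of a log line, or None if it has none.

    Assumption: both formats are in the same timezone, so the zone
    offset on XrdCl lines is ignored.
    """
    m = XRDCL_TS.match(line)
    if m:
        return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    m = XROOTD_TS.match(line)
    if m:
        return datetime.strptime(m.group(1), "%y%m%d %H:%M:%S")
    return None


def first_match_time(lines, needle):
    """Timestamp of the first timestamped line containing `needle`."""
    for line in lines:
        if needle in line:
            t = line_time(line)
            if t is not None:
                return t
    return None


def error_to_stall_gap(lines):
    """Time from the first 'Socket error' to the first 'ofs_Stall'."""
    first_error = first_match_time(lines, "Socket error")
    first_stall = first_match_time(lines, "ofs_Stall")
    if first_error is None or first_stall is None:
        return None
    return first_stall - first_error
```

Feeding it the whole log (e.g. `error_to_stall_gap(open(path))`, with `path` pointing at the attached file) should confirm or refute the 20 minute figure.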

If I had to guess, the socket error is because IPv6 is disabled on the host.
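
One quick way to test that guess on the affected host, independent of XRootD, is to see whether a plain IPv6 socket can be "connected" at all. The sketch below is only a probe under my own assumptions: the helper name, the `::1` target, and the port default are illustrative. A UDP connect sends no packets; it just asks the kernel to pick a route and source address, which is where I'd expect an `EADDRNOTAVAIL` ("cannot assign requested address") to show up if IPv6 is disabled.

```python
import errno
import socket


def ipv6_connect_errno(host="::1", port=1094):
    """Return None if an IPv6 UDP connect succeeds, else the errno.

    `host` and `port` are illustrative defaults; pass the data
    server's address to mirror the failing connection more closely.
    """
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return errno.EAFNOSUPPORT
    try:
        # Socket creation is inside the try block because it can also
        # fail when the kernel has no IPv6 support at all.
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
            s.connect((host, port))
        return None
    except OSError as e:
        return e.errno


if __name__ == "__main__":
    code = ipv6_connect_errno()
    if code is None:
        print("IPv6 connect OK")
    else:
        print("IPv6 connect failed:", errno.errorcode.get(code, code))
```

If this reports `EADDRNOTAVAIL` on the host, that would be consistent with the disabled-IPv6 theory; a clean result would not rule out other causes.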

-- 
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/1940#issuecomment-1454083076