Hi guys,

In order to initiate the TPC, the client needs to issue a sync request against the destination file.
If the sync request times out (request timeout), the whole TPC operation fails. This simply means
the client cannot reach the destination within the specified request timeout. The TPC timeout
doesn't apply as the TPC hasn't been initiated yet.
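
For reference, here is a minimal sketch of the workaround Wilko describes in the original message below (raising XRD_REQUESTTIMEOUT), written as a small Python wrapper. The environment variable names and the xrdcp options are the ones quoted in this thread; the helper names tpc_environment and tpc_command are hypothetical and exist only for this illustration, and the source and destination URLs are left for the caller to supply.

```python
import os


def tpc_environment(request_timeout_s, tpc_timeout_s=None, base_env=None):
    """Return a copy of the environment with the XRootD client timeouts set.

    Hypothetical helper: it only sets the variables named in this thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    # The kXR_sync request in the failing log is bounded by this timeout.
    env["XRD_REQUESTTIMEOUT"] = str(int(request_timeout_s))
    if tpc_timeout_s is not None:
        # Left optional, since changing it had no effect on the reported failure.
        env["XRD_CPTPCTIMEOUT"] = str(int(tpc_timeout_s))
    return env


def tpc_command(source_url, destination_url):
    """Build the xrdcp argument list used in the report (third-party copy only)."""
    return ["xrdcp", "--nopbar", "--tpc", "only", source_url, destination_url]


if __name__ == "__main__":
    import subprocess
    import sys

    # Usage: python tpc_copy.py <source root:// URL> <destination root:// URL>
    src, dst = sys.argv[1], sys.argv[2]
    subprocess.run(tpc_command(src, dst), env=tpc_environment(7200), check=True)
```

The request timeout of 7200 seconds is the value from the report, not a recommendation; it should exceed the longest transfer you expect.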

I think this behaviour is rather correct.

Cheers,
Michal
________________________________________
From: [log in to unmask] [[log in to unmask]] on behalf of Andrew Hanushevsky [[log in to unmask]]
Sent: 14 December 2017 09:59
To: Wei Yang
Cc: Kroeger, Wilko; xrootd-l; Michal Simon
Subject: Re: increasing allowed time for xrdcp third party copy

No, Wilko was actually hitting the request timeout. It really should be
better handled and I'll talk to Michal about that. The TPC timeout
actually refers to how long the TPC rendezvous can take. After that, the
request timeout applies. So, the documentation is misleading.

Andy

On Thu, 14 Dec 2017, Yang, Wei wrote:

> I wonder if TPC should send out a heartbeat signal, like GridFTP does.
>
> --
> Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>
> -----Original Message-----
> From: <[log in to unmask]> on behalf of Wilko Kroeger <[log in to unmask]>
> Date: Thursday, December 14, 2017 at 2:39 AM
> To: xrootd-l <[log in to unmask]>
> Subject: increasing allowed time for xrdcp third party copy
>
>> Hello
>>
>> We are using the third party copy option (--tpc only) for xrdcp to
>> transfer files between two xrootd clusters and noticed that transfers
>> failed if they took longer than 30 min (1800s).
>>
>> We could fix this issue by increasing the request timeout:
>>    XRD_REQUESTTIMEOUT=7200
>> (default 1800) but according to the documentation it seems like
>> XRD_CPTPCTIMEOUT should be used in this case:
>>       XRD_CPTPCTIMEOUT (-DICPTPCTimeout)
>>             Maximum time allowed for a third-party copy operation to finish.
>> However, increasing or decreasing it had no effect on the transfers.
>> Is this expected, or is there an issue with how xrdcp handles these two
>> timeouts?
>>
>>
>> Here is an example of a failed transfer (tested with v4.6.1 and 4.8.0-rc2):
>>
>>   2017-12-13 15:01:51 -0800   xrdcp --debug 1 --nopbar --tpc only root://.....  root://.....
>>
>>  [2017-12-13 15:31:51.940965 -0800][Error  ][XRootD ] [trg-xrootd-srv] Unable to get the response to request kXR_sync (handle: 0x00000000)
>>  [2017-12-13 15:31:51.941069 -0800][Error  ][File    ] ...... Fatal file state error. Message kXR_sync (handle: 0x00000000) returned
>>     with [ERROR] Operation expired
>>  [2017-12-13 15:31:52.376810 -0800][Error  ][Utility           ] Third party copy from .... to ... failed: [ERROR] Operation expired
>>  [2017-12-13 15:31:52.379876 -0800][Error  ][XRootDTransport   ] Message 0xb0000d50, stream [1, 0] is a response that we're no longer interested in
>>    (timed out)
>>  Run: [ERROR] Operation expired
>>
>>
>>  Cheers,
>>     Wilko
>>
>> ########################################################################
>> Use REPLY-ALL to reply to list
>>
>> To unsubscribe from the XROOTD-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1