Hi Patrick,

By default, the redirector asks the client to wait 5 seconds when it 
encounters a file it has never seen before. The reason is to give 
sufficient time for all the data servers to indicate whether they already 
have the file. This is a standard tactic in a large distributed cluster 
where there is no single point of control. Can we do better? Perhaps we 
could add an option that lets a client say "I don't care, give me a 
destination server right away."

Now, the easier way to get around this philosophical problem is to issue a 
"prepare" request for all the files you want to load. These requests are 
handled in parallel, so when you actually load the files, the redirector 
already knows what is there and what is not. In practice, it means you 
suffer the 5-second delay only once.
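
As a rough illustration of that arithmetic, here is a toy model in Python. It 
is only a sketch and not how xrootd is implemented: the ToyRedirector class, 
its methods, and the file names are made up for the example, and it assumes 
the files are opened one after another and that a prepare for a whole batch 
costs about one wait period. Time is simulated, so nothing actually sleeps.

```python
WAIT_SECONDS = 5  # default wait for a never-seen file, as described above


class ToyRedirector:
    """Hypothetical stand-in for a redirector; not the xrootd code."""

    def __init__(self):
        # Paths for which the data servers have already been queried.
        self.known = set()

    def prepare(self, paths):
        """Query all unseen paths at once; the batch costs one wait period."""
        unseen = [p for p in paths if p not in self.known]
        self.known.update(unseen)
        return WAIT_SECONDS if unseen else 0

    def open(self, path):
        """Return the simulated seconds the client waits before redirection."""
        if path in self.known:
            return 0
        self.known.add(path)
        return WAIT_SECONDS


def total_wait_without_prepare(paths):
    redirector = ToyRedirector()
    return sum(redirector.open(p) for p in paths)


def total_wait_with_prepare(paths):
    redirector = ToyRedirector()
    return redirector.prepare(paths) + sum(redirector.open(p) for p in paths)


# Hypothetical file names, for illustration only.
paths = [f"/xrd/test24/file{i}" for i in range(10)]
print(total_wait_without_prepare(paths))  # 50
print(total_wait_with_prepare(paths))     # 5
```

With ten new files opened in sequence, the model pays the wait ten times 
without a prepare and once with it.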

The only thing I can say is that xrootd does not behave as a standard 
filesystem (which most people are used to). On the other hand, because it 
does not, it can give you much better performance in the standard case 
(i.e., sub-millisecond response time for existent file opens).

Andy

On Sat, 8 Dec 2007, Patrick McGuigan wrote:

> I have a basic system working with one client, seven dataservers, and a 
> redirector.  The client can store correctly but I am observing a consistent 
> five second delay when the client is storing data.
>
> I ran xrdcp with -d 3 and the output contains :
>
> 071207 23:03:40 001 Xrd: DoLogin: No prev session info for 10.1.2.255:1094
> 071207 23:03:40 001 Xrd: Open: Access to server granted.
> 071207 23:03:40 001 Xrd: Open: Opening the remote file /xrd/test24/100mb
> 071207 23:03:40 001 Xrd: Open: File open in progress.
> 071207 23:03:40 18805 Xrd: XrdClientMessage::ReadRaw: Reading header (8 
> bytes).
> 071207 23:03:40 18805 Xrd: ReadRaw: Reading from 10.1.2.255:1094
> 071207 23:03:40 18805 Xrd: SendGenCommand: Sending command Open
> 071207 23:03:40 18805 Xrd: WriteRaw: Writing 24 bytes to physical connection
> 071207 23:03:40 18805 Xrd: WriteRaw: Writing to substreamid 0
> 071207 23:03:40 18805 Xrd: WriteRaw: Writing 17 bytes to physical connection
> 071207 23:03:40 18805 Xrd: WriteRaw: Writing to substreamid 0
> 071207 23:03:40 18805 Xrd: ReadPartialAnswer: Reading a XrdClientMessage from 
> the server [10.1.2.255:1094]...
> 071207 23:03:40 18805 Xrd: XrdClientMessage::ReadRaw:  sid: 1, IsAttn: 0, 
> substreamid: 0
> 071207 23:03:40 18805 Xrd: XrdClientMessage::ReadRaw: Reading data (4 bytes) 
> from substream 0
> 071207 23:03:40 18805 Xrd: ReadRaw: Reading from 10.1.2.255:1094
> 071207 23:03:40 18805 Xrd: BuildMessage:  posting id 1
> 071207 23:03:40 18805 Xrd: XrdClientMessage::ReadRaw: Reading header (8 
> bytes).
> 071207 23:03:40 18805 Xrd: ReadRaw: Reading from 10.1.2.255:1094
> 071207 23:03:40 18805 Xrd: ReadPartialAnswer: Server [10.1.2.255:1094] 
> answered [kXR_wait] (4005)
> 071207 23:03:40 18805 Xrd: CheckErrorStatus: Server [10.1.2.255:1094] 
> requested 5 seconds of wait
> 071207 23:03:42 18805 Xrd: DumpPhyConn: Phyconn entry, 
> [log in to unmask]:1094', LogCnt=1 Valid
> 071207 23:03:44 18805 Xrd: DumpPhyConn: Phyconn entry, 
> [log in to unmask]:1094', LogCnt=1 Valid
> 071207 23:03:45 18805 Xrd: SendGenCommand: Sending command Open
>
>
>
> It appears that the client received a 5 second delay from the redirector.  I 
> don't see any useful information in the olbd log of the redirector and the 
> xrootd log only indicates that the client was asked to wait:
>
> 071207 22:54:24 15092 odc_send2Man: root.18805:16@compute-0-0 asked to wait 5 
> by xrdb path=/xrd/test24/100mb
>
>
> How do I determine why the redirector is sending kXR_wait?
>
> Is the result of my configuration defaulting to round robin scheduling?
>
>
> Thanks,
>
> Patrick