Hi,

  Hmm, maybe now I see your point better.

  The dtor of a physical connection is called only by the Connection 
Manager's garbage collection.

  This happens because the protocol states that idle connections must 
be kept alive for a certain amount of time. So maybe your problem is 
that the dtor is not explicitly called at program exit.

Is this your problem, i.e. that you have no place to put your 
"fast disconnection request" on the client side when the app simply exits?



Fabrizio
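
A minimal sketch of the exit-time problem described above, assuming a
connection object that lives on the heap and is deleted only by a manager.
The names PhyConnection, Registry, OpenConnection, DisconnectAll and
InstallExitHook are hypothetical and are not the XrdClient API; the sketch
only shows how a std::atexit handler could give a client a place to send a
"fast disconnection request" when the dtor never runs.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a physical connection (NOT the real
// XrdClientPhyConnection class).
class PhyConnection {
public:
    explicit PhyConnection(std::string name) : name_(std::move(name)) {}

    // Runs only if somebody deletes the object. If the process exits
    // while the object is still on the heap, this is never reached.
    ~PhyConnection() { Disconnect(); }

    // Sends the (simulated) fast disconnection request at most once.
    // Returns true if this call actually sent it.
    bool Disconnect() {
        if (!open_) return false;
        open_ = false;
        ++SentCount();
        return true;
    }

    bool IsOpen() const { return open_; }
    const std::string& Name() const { return name_; }

    // Number of disconnection requests sent so far (illustration only).
    static int& SentCount() {
        static int count = 0;
        return count;
    }

private:
    std::string name_;
    bool open_ = true;
};

// Hypothetical registry playing the role of the connection manager.
// It is leaked on purpose so that it is still valid inside atexit handlers.
std::vector<PhyConnection*>& Registry() {
    static auto* conns = new std::vector<PhyConnection*>();
    return *conns;
}

// Creates a connection and hands ownership to the registry.
PhyConnection* OpenConnection(const std::string& name) {
    auto* c = new PhyConnection(name);
    Registry().push_back(c);
    return c;
}

// Sends the disconnection request on every connection that is still open.
void DisconnectAll() {
    for (PhyConnection* c : Registry()) c->Disconnect();
}

// Registers DisconnectAll to run at normal program exit.
bool InstallExitHook() { return std::atexit(DisconnectAll) == 0; }
```

In this sketch a client would call InstallExitHook() once, early in main().
Handlers registered with std::atexit run on return from main() or on
std::exit(), but not on std::abort() or _exit(), so this only covers the
"app simply exits" case.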




Ulrich Schwickerath wrote:
> Hi, Fabrizio,
> 
> Thanks a lot for the fast answer!
> 
> 
>>  Looking at your log again, I don't understand the problem. xrdcp is
>>supposed to exit after having closed the file, so the connection is
>>closed and the server notices this because there are no more bytes to
>>read from the connection. Am I missing something?
> 
> Well, I'm using native InfiniBand. Basically, what you have is a pair of 
> connected send and receive queues on both sides. If you want to send 
> something, you post a send request on the sender side. The receiver, on the 
> other hand, posts a receive request (containing the address of the buffer 
> where the data should go), and then polls the receive queue. If some data 
> has arrived, it will see a completion event. At that point the data has 
> already arrived. The problem in this case is that if the sender does not send 
> anything, the receiver will poll the receive queue until the timeout is 
> reached, and that is exactly what happens. My idea was to post a send request 
> with no data attached, so that the receiver does not time out but sees a 
> zero-length message, which would emulate the behavior that is expected.
> 
>>  Anyway, the newer xrdcp, using the latest client, should avoid
>>requesting data past the EOF (that was one of the latest commits), but it
>>implements a different scheme for data transfers, which are now done
>>both synchronously and asynchronously (i.e. in parallel, while the
>>application "thinks"). The problem is that this stuff is only in the head.
>>Here it works, so if you are adventurous you can test it and report here
>>if you have trouble.
> 
> I started with 20050413-0433, so I'll update to the new version that just came 
> in :-)
> 
> Thanks a lot!
> Ulrich
> 
> 
>>Ulrich Schwickerath wrote:
>>
>>>Hi,
>>>
>>>I (finally) found a bit of time to work on a
>>>port of Xrootd to (native) InfiniBand, and a
>>>proof-of-concept version is now sort of working, meaning
>>>that I can now transfer files with xrdcp. If possible, I would
>>>like to show first results with this version at the ACAT05
>>>conference in Berlin in about 3 weeks from now.
>>>I have a few questions for you:
>>>
>>>1/ What would be a meaningful benchmark for xrootd?
>>>   For the moment, I'm only using point-to-point transfers
>>>   (remote read to /dev/null, mainly) which is a bit boring ...
>>>
>>>2/ Is there a ready-to-use test suite that can/should be used?
>>>
>>>3/ I still have one problem with xrdcp. For the client I have added
>>>a few lines of code to XrdClientPhyConnection.cc which
>>>initialize my InfiniBand connection and bypass the recv/write
>>>calls. Everything works fine until the end. The server log looks
>>>like this:
>>>050501 21:36:41 5282 schwicke.17140:12@iwrcgop027 XrootdProtocol: 0000
>>>req=3003 dlen=0
>>>050501 21:36:41 5282 schwicke.17140:12@iwrcgop027 XrootdFile: closing
>>>r /tmp/testfile.dat
>>>050501 21:36:41 5282 schwicke.17140:12@iwrcgop027 XrootdProtocol: 0000
>>>close fh=0
>>>050501 21:36:41 5282 schwicke.17140:12@iwrcgop027 XrootdResponse: 0000
>>>sending OK
>>>050501 21:36:44 5282 XrootdXeq: schwicke.17140:12@iwrcgop027 disc 0:04:24
>>>(link read error)
>>>050501 21:36:44 5282 schwicke.17140:12@iwrcgop027 XrdPoll: sending poller
>>>0 detach for link 12
>>>050501 21:36:44 5282 XrdPoll: Poller 0 detached fd 12 entry 1 now at 1
>>>050501 21:36:44 5282 schwicke.17140:12@iwrcgop027 XrdPoll: FD 12 detached
>>>from poller 0; num=0
>>>For TCP/IP I see that the last recv request succeeds in the poll
>>>command but ends with zero bytes of data, resulting in an ENOMSG return
>>>code. For InfiniBand, this call simply times out, giving the above
>>>link read error. A workaround would be to send a zero-size message over
>>>the InfiniBand link just before the connection is closed for good. I
>>>tried to do that inside the XrdClientPhyConnection destructor, but that
>>>one is never called in xrdcp. A side effect of this is that the resources
>>>used by the InfiniBand connection have to be cleaned up and freed by the
>>>driver's resource tracking mechanisms. Where should this be done instead?
>>>
>>>
>>>Thanks a lot in advance,
>>>Ulrich
> 
>
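
The zero-length "closing" message proposed in the thread can be sketched
as follows. This is an in-process simulation under stated assumptions, not
InfiniBand verbs code: CompletionQueue, Post, PostClose, Poll and PollResult
are hypothetical names, and a mutex-protected deque stands in for the
receive queue. It only illustrates the three outcomes the receiver has to
tell apart: data, an empty message (treated like the zero-byte read seen on
TCP), and a timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// What a single poll of the (simulated) receive queue can report.
enum class PollResult { Data, EndOfStream, Timeout };

// Hypothetical stand-in for a receive completion queue.
class CompletionQueue {
public:
    // Sender side: deliver one message to the receiver.
    void Post(std::vector<char> msg) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push_back(std::move(msg));
        }
        cv_.notify_one();
    }

    // Sender side: a message with no data attached marks "closing".
    void PostClose() { Post(std::vector<char>()); }

    // Receiver side: wait up to `timeout` for a completion.
    // An empty message is reported as EndOfStream instead of Data,
    // so the receiver does not have to wait for the timeout.
    PollResult Poll(std::chrono::milliseconds timeout,
                    std::vector<char>& out) {
        std::unique_lock<std::mutex> lock(mu_);
        if (!cv_.wait_for(lock, timeout, [this] { return !q_.empty(); })) {
            return PollResult::Timeout;
        }
        out = std::move(q_.front());
        q_.pop_front();
        return out.empty() ? PollResult::EndOfStream : PollResult::Data;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<std::vector<char>> q_;
};
```

With this convention a receiver loop can stop as soon as Poll returns
EndOfStream and report a clean close, keeping Timeout for real link errors.
A real payload must then never be empty, which is the price of using the
message length as the marker.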