Print

Print


Hi Wilko,

Could you rerun the test with the olbd's started with the -d option and
send me the log; thanks.

Andy

On Thu, 16 Dec 2004, Wilko Kroeger wrote:

>
>
>
> Hello Andy and Fabrizio
>
> I tried to copy a local file into xrootd using the latest xrootd release,
> 20041216-0815. I am using one redirector (bbr-test-xfer1) and two data
> servers (oprser04, oprserv07).
> Running xrdcp against a data server works fine, but running it against
> the redirector doesn't work the way it should.
>
> Running xrdcp the first time fails with the error:
>
>  041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers
>         are available to read the file.
>
> Running the same xrdcp command a second time the client is redirected to a
> data server and the file is transfered.
> Below is the output from the client and the relevant part from the redirector
> or data server (if aplicable) log files. The output is quiet verbose
> because in the xrootd.cf the 'trace: all' option was used for:
> xrootd,xrd,olb,ofs,oss
>
> Any idea ?
>
> Thanks,
>   Wilko
>
>
> #
> #  first attempt to copy a local file into xrootd
> #
>
> > xrdcp -d2 dummy.root  root://bbr-test-xfer1:2094////prod/s5/wk.root
>
> # ------------ xrdcp output
>
> 041216 22:32:20 001 Xrd: Create Access to server granted.
> 041216 22:32:20 001 Xrd: Create Opening the remote file ///prod/s5/wk.root
> 041216 22:32:20 001 Xrd: SendGenCommand Sending command Open
> 041216 22:32:20 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [bbr-test-xfer1.slac.stanford.edu:2094]...
> 041216 22:32:20 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
>        did not answer OK. Resp status is [kXR_wait]
> 041216 22:32:20 001 Xrd: SendGenCommand Server [bbr-test-xfer1.slac.stanford.edu:2094]
>        requested 5 seconds of wait
> 041216 22:32:25 001 Xrd: SendGenCommand Sending command Open
> 041216 22:32:25 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [bbr-test-xfer1.slac.stanford.edu:2094]...
> 041216 22:32:25 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
>        did not answer OK. Resp status is [kXR_error]
> 041216 22:32:25 001 Xrd: Open Server [bbr-test-xfer1.slac.stanford.edu:2094] did not
>        return OK message for last request.
> 041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers are available
>        to read the file.
> 041216 22:32:25 001 Xrd: Create Error opening the file ///prod/s5/wk.root on host
>        bbr-test-xfer1:2094
> 041216 22:32:25 001 Xrd: Close File not opened.
> Caching info: MissRate=0 MissCount=0 ReadsCounter=0
> Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0
> Error accessing path/file for root://bbr-test-xfer1:2094////prod/s5/wk.root
>
> # ----------------------- Redirector log file
>
> 041216 22:32:20 029 XrdInet: Accepted connection from noric06
> 041216 22:32:20 008 XrdSched: running 6@noric06 inq=0
> 041216 22:32:20 008 XrdProtocol: matched protocol xrootd
> 041216 22:32:20 008 ?:16@noric06 XrdPoll: FD 16 attached to poller 0; num=1
> 041216 22:32:20 008 ?:16@noric06 XrootdProtocol: 0000 req=3007 dlen=0
> 041216 22:32:20 008 wilko.25211:16@noric06 XrootdResponse: 0000 sending OK
> 041216 22:32:20 008 XrootdXeq: User logged in as wilko.25211:16@noric06
> 041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> 041216 22:32:20 008 wilko.25211:16@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> 041216 22:32:20 008 odc_Locate: user=wilko.25211:16@noric06 wait :5 by bbr-test-xfer1
>        path=/prod/s5/wk.root
> 041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 stalling client for 5 sec
> 041216 22:32:25 008 wilko.25211:16@noric06 XrootdProtocol: 0000 request timeout;
>        read 0 of 24 bytes
> 041216 22:32:25 023 XrdPoll: Poller 0 enable fd 16
> 041216 22:32:25 009 XrdSched: running wilko.25211:16@noric06 inq=0
> 041216 22:32:25 009 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> 041216 22:32:25 009 wilko.25211:16@noric06 ofs_open: 0-664 fn=/prod/s5/wk.root
> 041216 22:32:25 009 odc_Locate: user=wilko.25211:16@noric06 error: No servers are available to
>        read the file.:0 by bbr-test-xfer1 path=/prod/s5/wk.root
> 041216 22:32:25 009 wilko.25211:16@noric06 XrootdResponse: 0000 sending err 3005:
>        No servers are available to read the file.
> 041216 22:32:25 009 XrdLink: wilko.25211:16@noric06 disconnected after 0:00:05 (link read error)
>
>
> #
> # 2nd attempt
> #
>
> > xrdcp -d2 dummy.root  root://bbr-test-xfer1:2094////prod/s5/wk.root
>
> # ------------ xrdcp output
>
> 041216 22:36:08 001 Xrd: Create Access to server granted.
> 041216 22:36:08 001 Xrd: Create Opening the remote file ///prod/s5/wk.root
> 041216 22:36:08 001 Xrd: SendGenCommand Sending command Open
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [bbr-test-xfer1.slac.stanford.edu:2094]...
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
>        did not answer OK. Resp status is [kXR_redirect]
> 041216 22:36:08 001 Xrd: HandleServerError Redir count=1
> 041216 22:36:08 001 Xrd: HandleServerError Received redirection to
>        [oprserv04.slac.stanford.edu:2094]. Token=[].
> 041216 22:36:08 001 Xrd: GetDomainToMatch GetHostName(oprserv04.slac.stanford.edu) returned
>        name=oprserv04.slac.stanford.edu
> 041216 22:36:08 001 Xrd: GetDomainToMatch GetDomain(oprserv04.slac.stanford.edu) -->
>        slac.stanford.edu
> 041216 22:36:08 001 Xrd: CheckHostDomain Resolved [oprserv04.slac.stanford.edu]'s domain
>        name into [slac.stanford.edu]
> 041216 22:36:08 001 Xrd: CheckHostDomain Access granted to the domain of
>        [oprserv04.slac.stanford.edu].
> 041216 22:36:08 001 Xrd: Connect Creating a logical connection...
> 041216 22:36:08 001 Xrd: Connect Physical connection not found. Creating a new one...
> 041216 22:36:08 001 Xrd: Connect Connecting to [oprserv04.slac.stanford.edu:2094]
> 041216 22:36:08 001 Xrd: ClientSock::TryConnect Trying to connect tooprserv04.slac.stanford.edu(
>        134.79.85.44):2094 Timeout=60
> 041216 22:36:08 001 Xrd: Connect Connected to [oprserv04.slac.stanford.edu:2094]
> 041216 22:36:08 001 Xrd: Connect New physical connection to server
>        oprserv04.slac.stanford.edu:2094 succesfully created.
> 041216 22:36:08 001 Xrd: Connect LogConn: size:2 count: 1PhyConn: size:2 count: 2
> 041216 22:36:08 001 Xrd: Connect Connect(oprserv04.slac.stanford.edu, 2094) returned 1
> 041216 22:36:08 001 Xrd: DoHandShake HandShake step 1: Sending 20 bytes to the server
>        [oprserv04.slac.stanford.edu:2094]
> 041216 22:36:08 001 Xrd: DoHandShake HandShake step 2: Reading 4 bytes from server
>        [oprserv04.slac.stanford.edu:2094].
> 041216 22:36:08 001 Xrd: DoHandShake HandShake step 3: Reading 12 bytes from server
>        [oprserv04.slac.stanford.edu:2094].
> 041216 22:36:08 001 Xrd: StartReader Starting reader thread...
> 041216 22:36:08 000 Xrd: SocketReaderThread Reader Thread starting.
> 041216 22:36:08 001 Xrd: GetAccessToSrv Ok, the server on [oprserv04.slac.stanford.edu:2094]
>        is an  xrootd data server.
> 041216 22:36:08 001 Xrd: DoLogin Logging into the server [oprserv04.slac.stanford.edu:2094].
>        pid=25233 uid=wilko
> 041216 22:36:08 001 Xrd: SendGenCommand Sending command XTNetconn::doLogin
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [oprserv04.slac.stanford.edu:2094]...
> 041216 22:36:08 001 Xrd: SendGenCommand Communication error detected with
>        [oprserv04.slac.stanford.edu:2094
> 041216 22:36:08 001 Xrd: SendGenCommand Sending command Open
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [oprserv04.slac.stanford.edu:2094]...
> 041216 22:36:08 001 Xrd: Create File opened succesfully.
> 041216 22:36:08 001 Xrd: main dummy.root --> root://bbr-test-xfer1:2094////prod/s5/wk.root
> 041216 22:36:08 25233 Xrd: ReaderThread_loc Reader Thread starting.
> 041216 22:36:08 001 Xrd: SendGenCommand Sending command Write
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [oprserv04.slac.stanford.edu:2094]...
> 041216 22:36:08 001 Xrd: SendGenCommand Sending command Close
> 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
>        [oprserv04.slac.stanford.edu:2094]...
> Caching info: MissRate=0 MissCount=0 ReadsCounter=0
> Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0
>
>
> # ------------------ Redirector bbr-test-xfer1 log file
>
>
> 041216 22:36:08 029 XrdInet: Accepted connection from noric06
> 041216 22:36:08 010 XrdSched: running 7@noric06 inq=0
> 041216 22:36:08 010 XrdProtocol: matched protocol xrootd
> 041216 22:36:08 010 ?:17@noric06 XrdPoll: FD 17 attached to poller 0; num=1
> 041216 22:36:08 010 ?:17@noric06 XrootdProtocol: 0000 req=3007 dlen=0
> 041216 22:36:08 010 wilko.25233:17@noric06 XrootdResponse: 0000 sending OK
> 041216 22:36:08 010 XrootdXeq: User logged in as wilko.25233:17@noric06
> 041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> 041216 22:36:08 010 wilko.25233:17@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> 041216 22:36:08 010 odc_Locate: user=wilko.25233:17@noric06 redirected to
>        oprserv04.slac.stanford.edu:2094 by bbr-test-xfer1 path=/prod/s5/wk.root
> 041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 redirecting to
>        oprserv04.slac.stanford.edu:2094
> 041216 22:36:08 010 XrdLink: wilko.25233:17@noric06 disconnected after 0:00:00 (link read error)
>
>
> # ------------------ Data Server oprserv04  log file
>
> 041216 22:36:08 029 XrdInet: Accepted connection from noric06
> 041216 22:36:08 007 XrdSched: running 5@noric06 inq=0
> 041216 22:36:08 007 XrdProtocol: matched protocol xrootd
> 041216 22:36:08 007 ?:15@noric06 XrdPoll: FD 15 attached to poller 0; num=1
> 041216 22:36:08 007 ?:15@noric06 XrootdProtocol: 0100 req=3007 dlen=0
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> 041216 22:36:08 007 XrootdXeq: User logged in as wilko.25233:15@noric06
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3010 dlen=18
> 041216 22:36:08 007 wilko.25233:15@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> 041216 22:36:08 007 oss_Alloc_Cache: public cache as /u1/kanga/cache0/%u1%kanga%prod%s5%wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 oss_Open_ufs: fd=8192 flags=2 mode=1000000664
>        path=/u1/kanga/prod/s5/wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 ofs_stat:  pi=5 fn=/prod/s5/wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 open nmfa /prod/s5/wk.root fh=0
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending 4 data bytes; rc=0
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3019 dlen=29
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 fh=0 write 29@0
> 041216 22:36:08 007 wilko.25233:15@noric06 ofs_write: 29@0 pi=5 fn=/prod/s5/wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3003 dlen=0
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdFile: closing w /prod/s5/wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 ofs_close: lnks=1 pi=5 fn=/prod/s5/wk.root
> 041216 22:36:08 007  ofs_Unopen: numfd=0 pi=5 fn=/prod/s5/wk.root
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 close fh=0
> 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> 041216 22:36:08 007 XrdLink: wilko.25233:15@noric06 disconnected after 0:00:00 (link read error)
>
>