Print

Print


Hi Wilko,

OK, there seems to be a somewhat strange conflict in the client. The first
request is to create a file. That is handled correctly (the client is told
to wait). However, when the client comes back, it tries to (apparently)
open the file in read mode, not create mode. That fails because the file
doesn't exist. That's also a correct response. Why is the client retrying
in read mode? Fabrizio?

The reason it works the second time is that the file lookup has occured.
So, the first request for a create binds correctly to a server that can
create a file. The real question is what is the client trying to do here?

Andy

On Fri, 17 Dec 2004, Wilko Kroeger wrote:

>
> Hello Andy
>
> I restarted the olbds with -d option and run the test again.
>
> > xrdcp -d2  dummy.root  root://bbr-test-xfer1:2094////prod/s13/wk.root
>
> The log files are:
>
> # ----  olbdlog from redirector:  (first time running xrdcp)
> 041217 13:06:25 010 Receive From bbr-test-xfer1.slac.stanford.edu: 0 select  c /prod/s13/wk.root
> 041217 13:06:25 010 do_Select Lookup delay bbr-test-xfer1.slac.stanford.edu 5
> 041217 13:06:30 010 Receive From bbr-test-xfer1.slac.stanford.edu: 1 select  r /prod/s13/wk.root
> 041217 13:06:30 010 do_Select No servers available to r /prod/s13/wk.root
>
> # ----  olbdlog from oprserv04 (data server)
> 041217 13:06:25 001 Receive From bbr-test-xfer1.slac.stanford.edu: 7@0  state /prod/s13/wk.root
>
> # ----  olbdlog from oprserv07 (data server)
> 041217 13:06:25 001 Receive From bbr-test-xfer1.slac.stanford.edu: 7@0  state /prod/s13/wk.root
>
>
> running the command a second time:
>
> > xrdcp -d2 dummy.root root://bbr-test-xfer1:2094////prod/s13/wk.root
>
> # ----  olbdlog from redirector:  (second time running xrdcp)
>
> 041217 13:11:20 010 Receive From bbr-test-xfer1.slac.stanford.edu: 2 select  c /prod/s13/wk.root
> 041217 13:11:20 010 SelServer Server oprserv07.slac.stanford.edu:2094 bound to /prod/s13/wk.root
> 041217 13:11:20 010 do_Select Redirect bbr-test-xfer1.slac.stanford.edu ->
>        oprserv07.slac.stanford.edu:2094 for /prod/s13/wk.root
>
> There are no entries in the olbdlog of the two data servers.
>
>
>
> I also provided the entires in the xrdlog file (redirector) for the first
> attempt to run xrdcp.
>
> # ---- xrdlog redirector (first time running xrdcp)
> 041217 13:06:25 029 XrdInet: Accepted connection from noric13
> 041217 13:06:25 007 XrdSched: running 5@noric13 inq=0
> 041217 13:06:25 007 XrdProtocol: matched protocol xrootd
> 041217 13:06:25 007 ?:15@noric13 XrdPoll: FD 15 attached to poller 0; num=1
> 041217 13:06:25 007 ?:15@noric13 XrootdProtocol: 0000 req=3007 dlen=0
> 041217 13:06:25 007 wilko.18587:15@noric13 XrootdResponse: 0000 sending OK
> 041217 13:06:25 007 XrootdXeq: User logged in as wilko.18587:15@noric13
> 041217 13:06:25 007 wilko.18587:15@noric13 XrootdProtocol: 0000 req=3010 dlen=19
> 041217 13:06:25 007 wilko.18587:15@noric13 ofs_open: 100-1000000664 fn=/prod/s13/wk.root
> 041217 13:06:25 007 odc_Locate: user=wilko.18587:15@noric13 wait :5 by
>   bbr-test-xfer1 path=/prod/s13/wk.root
> 041217 13:06:25 007 wilko.18587:15@noric13 XrootdProtocol: 0000 stalling
>   client for 5 sec
> 041217 13:06:30 007 wilko.18587:15@noric13 XrootdProtocol: 0000 req=3010 dlen=19
> 041217 13:06:30 007 wilko.18587:15@noric13 ofs_open: 0-664 fn=/prod/s13/wk.root
> 041217 13:06:30 007 odc_Locate: user=wilko.18587:15@noric13 error: No
>   servers are available to read the file.:0 by bbr-test-xfer1 path=/prod/s13/wk.root
> 041217 13:06:30 007 wilko.18587:15@noric13 XrootdResponse: 0000 sending
>   err 3005: No servers are available to read the file.
> 041217 13:06:30 007 XrdLink: wilko.18587:15@noric13 disconnected after 0:00:05 (link read error)
> 041217 13:06:30 007 wilko.18587:15@noric13 XrdPoll: FD 15 detached from poller 0; num=0
>
>
> I hope this helps
>
> Cheers,
>    Wilko
>
>
>
>
> On Fri, 17 Dec 2004, Andrew Hanushevsky wrote:
>
> > Hi Wilko,
> >
> > Could you rerun the test with the olbd's started with the -d option and
> > send me the log; thanks.
> >
> > Andy
> >
> > On Thu, 16 Dec 2004, Wilko Kroeger wrote:
> >
> > >
> > >
> > >
> > > Hello Andy and Fabrizio
> > >
> > > I tried to copy a local file into xrootd using the latest xrootd release,
> > > 20041216-0815. I am using one redirector (bbr-test-xfer1) and two data
> > > servers (oprser04, oprserv07).
> > > Running xrdcp against a data server works fine, but running it against
> > > the redirector doesn't work the way it should.
> > >
> > > Running xrdcp the first time fails with the error:
> > >
> > >  041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers
> > >         are available to read the file.
> > >
> > > Running the same xrdcp command a second time the client is redirected to a
> > > data server and the file is transfered.
> > > Below is the output from the client and the relevant part from the redirector
> > > or data server (if aplicable) log files. The output is quiet verbose
> > > because in the xrootd.cf the 'trace: all' option was used for:
> > > xrootd,xrd,olb,ofs,oss
> > >
> > > Any idea ?
> > >
> > > Thanks,
> > >   Wilko
> > >
> > >
> > > #
> > > #  first attempt to copy a local file into xrootd
> > > #
> > >
> > > > xrdcp -d2 dummy.root  root://bbr-test-xfer1:2094////prod/s5/wk.root
> > >
> > > # ------------ xrdcp output
> > >
> > > 041216 22:32:20 001 Xrd: Create Access to server granted.
> > > 041216 22:32:20 001 Xrd: Create Opening the remote file ///prod/s5/wk.root
> > > 041216 22:32:20 001 Xrd: SendGenCommand Sending command Open
> > > 041216 22:32:20 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [bbr-test-xfer1.slac.stanford.edu:2094]...
> > > 041216 22:32:20 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
> > >        did not answer OK. Resp status is [kXR_wait]
> > > 041216 22:32:20 001 Xrd: SendGenCommand Server [bbr-test-xfer1.slac.stanford.edu:2094]
> > >        requested 5 seconds of wait
> > > 041216 22:32:25 001 Xrd: SendGenCommand Sending command Open
> > > 041216 22:32:25 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [bbr-test-xfer1.slac.stanford.edu:2094]...
> > > 041216 22:32:25 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
> > >        did not answer OK. Resp status is [kXR_error]
> > > 041216 22:32:25 001 Xrd: Open Server [bbr-test-xfer1.slac.stanford.edu:2094] did not
> > >        return OK message for last request.
> > > 041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers are available
> > >        to read the file.
> > > 041216 22:32:25 001 Xrd: Create Error opening the file ///prod/s5/wk.root on host
> > >        bbr-test-xfer1:2094
> > > 041216 22:32:25 001 Xrd: Close File not opened.
> > > Caching info: MissRate=0 MissCount=0 ReadsCounter=0
> > > Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0
> > > Error accessing path/file for root://bbr-test-xfer1:2094////prod/s5/wk.root
> > >
> > > # ----------------------- Redirector log file
> > >
> > > 041216 22:32:20 029 XrdInet: Accepted connection from noric06
> > > 041216 22:32:20 008 XrdSched: running 6@noric06 inq=0
> > > 041216 22:32:20 008 XrdProtocol: matched protocol xrootd
> > > 041216 22:32:20 008 ?:16@noric06 XrdPoll: FD 16 attached to poller 0; num=1
> > > 041216 22:32:20 008 ?:16@noric06 XrootdProtocol: 0000 req=3007 dlen=0
> > > 041216 22:32:20 008 wilko.25211:16@noric06 XrootdResponse: 0000 sending OK
> > > 041216 22:32:20 008 XrootdXeq: User logged in as wilko.25211:16@noric06
> > > 041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> > > 041216 22:32:20 008 wilko.25211:16@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> > > 041216 22:32:20 008 odc_Locate: user=wilko.25211:16@noric06 wait :5 by bbr-test-xfer1
> > >        path=/prod/s5/wk.root
> > > 041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 stalling client for 5 sec
> > > 041216 22:32:25 008 wilko.25211:16@noric06 XrootdProtocol: 0000 request timeout;
> > >        read 0 of 24 bytes
> > > 041216 22:32:25 023 XrdPoll: Poller 0 enable fd 16
> > > 041216 22:32:25 009 XrdSched: running wilko.25211:16@noric06 inq=0
> > > 041216 22:32:25 009 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> > > 041216 22:32:25 009 wilko.25211:16@noric06 ofs_open: 0-664 fn=/prod/s5/wk.root
> > > 041216 22:32:25 009 odc_Locate: user=wilko.25211:16@noric06 error: No servers are available to
> > >        read the file.:0 by bbr-test-xfer1 path=/prod/s5/wk.root
> > > 041216 22:32:25 009 wilko.25211:16@noric06 XrootdResponse: 0000 sending err 3005:
> > >        No servers are available to read the file.
> > > 041216 22:32:25 009 XrdLink: wilko.25211:16@noric06 disconnected after 0:00:05 (link read error)
> > >
> > >
> > > #
> > > # 2nd attempt
> > > #
> > >
> > > > xrdcp -d2 dummy.root  root://bbr-test-xfer1:2094////prod/s5/wk.root
> > >
> > > # ------------ xrdcp output
> > >
> > > 041216 22:36:08 001 Xrd: Create Access to server granted.
> > > 041216 22:36:08 001 Xrd: Create Opening the remote file ///prod/s5/wk.root
> > > 041216 22:36:08 001 Xrd: SendGenCommand Sending command Open
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [bbr-test-xfer1.slac.stanford.edu:2094]...
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094]
> > >        did not answer OK. Resp status is [kXR_redirect]
> > > 041216 22:36:08 001 Xrd: HandleServerError Redir count=1
> > > 041216 22:36:08 001 Xrd: HandleServerError Received redirection to
> > >        [oprserv04.slac.stanford.edu:2094]. Token=[].
> > > 041216 22:36:08 001 Xrd: GetDomainToMatch GetHostName(oprserv04.slac.stanford.edu) returned
> > >        name=oprserv04.slac.stanford.edu
> > > 041216 22:36:08 001 Xrd: GetDomainToMatch GetDomain(oprserv04.slac.stanford.edu) -->
> > >        slac.stanford.edu
> > > 041216 22:36:08 001 Xrd: CheckHostDomain Resolved [oprserv04.slac.stanford.edu]'s domain
> > >        name into [slac.stanford.edu]
> > > 041216 22:36:08 001 Xrd: CheckHostDomain Access granted to the domain of
> > >        [oprserv04.slac.stanford.edu].
> > > 041216 22:36:08 001 Xrd: Connect Creating a logical connection...
> > > 041216 22:36:08 001 Xrd: Connect Physical connection not found. Creating a new one...
> > > 041216 22:36:08 001 Xrd: Connect Connecting to [oprserv04.slac.stanford.edu:2094]
> > > 041216 22:36:08 001 Xrd: ClientSock::TryConnect Trying to connect tooprserv04.slac.stanford.edu(
> > >        134.79.85.44):2094 Timeout=60
> > > 041216 22:36:08 001 Xrd: Connect Connected to [oprserv04.slac.stanford.edu:2094]
> > > 041216 22:36:08 001 Xrd: Connect New physical connection to server
> > >        oprserv04.slac.stanford.edu:2094 succesfully created.
> > > 041216 22:36:08 001 Xrd: Connect LogConn: size:2 count: 1PhyConn: size:2 count: 2
> > > 041216 22:36:08 001 Xrd: Connect Connect(oprserv04.slac.stanford.edu, 2094) returned 1
> > > 041216 22:36:08 001 Xrd: DoHandShake HandShake step 1: Sending 20 bytes to the server
> > >        [oprserv04.slac.stanford.edu:2094]
> > > 041216 22:36:08 001 Xrd: DoHandShake HandShake step 2: Reading 4 bytes from server
> > >        [oprserv04.slac.stanford.edu:2094].
> > > 041216 22:36:08 001 Xrd: DoHandShake HandShake step 3: Reading 12 bytes from server
> > >        [oprserv04.slac.stanford.edu:2094].
> > > 041216 22:36:08 001 Xrd: StartReader Starting reader thread...
> > > 041216 22:36:08 000 Xrd: SocketReaderThread Reader Thread starting.
> > > 041216 22:36:08 001 Xrd: GetAccessToSrv Ok, the server on [oprserv04.slac.stanford.edu:2094]
> > >        is an  xrootd data server.
> > > 041216 22:36:08 001 Xrd: DoLogin Logging into the server [oprserv04.slac.stanford.edu:2094].
> > >        pid=25233 uid=wilko
> > > 041216 22:36:08 001 Xrd: SendGenCommand Sending command XTNetconn::doLogin
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [oprserv04.slac.stanford.edu:2094]...
> > > 041216 22:36:08 001 Xrd: SendGenCommand Communication error detected with
> > >        [oprserv04.slac.stanford.edu:2094
> > > 041216 22:36:08 001 Xrd: SendGenCommand Sending command Open
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [oprserv04.slac.stanford.edu:2094]...
> > > 041216 22:36:08 001 Xrd: Create File opened succesfully.
> > > 041216 22:36:08 001 Xrd: main dummy.root --> root://bbr-test-xfer1:2094////prod/s5/wk.root
> > > 041216 22:36:08 25233 Xrd: ReaderThread_loc Reader Thread starting.
> > > 041216 22:36:08 001 Xrd: SendGenCommand Sending command Write
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [oprserv04.slac.stanford.edu:2094]...
> > > 041216 22:36:08 001 Xrd: SendGenCommand Sending command Close
> > > 041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server
> > >        [oprserv04.slac.stanford.edu:2094]...
> > > Caching info: MissRate=0 MissCount=0 ReadsCounter=0
> > > Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0
> > >
> > >
> > > # ------------------ Redirector bbr-test-xfer1 log file
> > >
> > >
> > > 041216 22:36:08 029 XrdInet: Accepted connection from noric06
> > > 041216 22:36:08 010 XrdSched: running 7@noric06 inq=0
> > > 041216 22:36:08 010 XrdProtocol: matched protocol xrootd
> > > 041216 22:36:08 010 ?:17@noric06 XrdPoll: FD 17 attached to poller 0; num=1
> > > 041216 22:36:08 010 ?:17@noric06 XrootdProtocol: 0000 req=3007 dlen=0
> > > 041216 22:36:08 010 wilko.25233:17@noric06 XrootdResponse: 0000 sending OK
> > > 041216 22:36:08 010 XrootdXeq: User logged in as wilko.25233:17@noric06
> > > 041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 req=3010 dlen=18
> > > 041216 22:36:08 010 wilko.25233:17@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> > > 041216 22:36:08 010 odc_Locate: user=wilko.25233:17@noric06 redirected to
> > >        oprserv04.slac.stanford.edu:2094 by bbr-test-xfer1 path=/prod/s5/wk.root
> > > 041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 redirecting to
> > >        oprserv04.slac.stanford.edu:2094
> > > 041216 22:36:08 010 XrdLink: wilko.25233:17@noric06 disconnected after 0:00:00 (link read error)
> > >
> > >
> > > # ------------------ Data Server oprserv04  log file
> > >
> > > 041216 22:36:08 029 XrdInet: Accepted connection from noric06
> > > 041216 22:36:08 007 XrdSched: running 5@noric06 inq=0
> > > 041216 22:36:08 007 XrdProtocol: matched protocol xrootd
> > > 041216 22:36:08 007 ?:15@noric06 XrdPoll: FD 15 attached to poller 0; num=1
> > > 041216 22:36:08 007 ?:15@noric06 XrootdProtocol: 0100 req=3007 dlen=0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> > > 041216 22:36:08 007 XrootdXeq: User logged in as wilko.25233:15@noric06
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3010 dlen=18
> > > 041216 22:36:08 007 wilko.25233:15@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root
> > > 041216 22:36:08 007 oss_Alloc_Cache: public cache as /u1/kanga/cache0/%u1%kanga%prod%s5%wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 oss_Open_ufs: fd=8192 flags=2 mode=1000000664
> > >        path=/u1/kanga/prod/s5/wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 ofs_stat:  pi=5 fn=/prod/s5/wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 open nmfa /prod/s5/wk.root fh=0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending 4 data bytes; rc=0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3019 dlen=29
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 fh=0 write 29@0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 ofs_write: 29@0 pi=5 fn=/prod/s5/wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3003 dlen=0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdFile: closing w /prod/s5/wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 ofs_close: lnks=1 pi=5 fn=/prod/s5/wk.root
> > > 041216 22:36:08 007  ofs_Unopen: numfd=0 pi=5 fn=/prod/s5/wk.root
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 close fh=0
> > > 041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK
> > > 041216 22:36:08 007 XrdLink: wilko.25233:15@noric06 disconnected after 0:00:00 (link read error)
> > >
> > >
> >
>