Hi, You are right, sigh. I found a silly bug in XrdClientConn.cc which caused for sure that bug. I just committed the fix for the typo. Instead of clearing only the kXR_refresh bit after a server requested command repetition, it cleared all the options field. This is consistent with what we see. Fabrizio Andrew Hanushevsky wrote: > Hi Wilko, > > OK, there seems to be a somewhat strange conflict in the client. The first > request is to create a file. That is handled correctly (the client is told > to wait). However, when the client comes back, it tries to (apparently) > open the file in read mode, not create mode. That fails because the file > doesn't exist. That's also a correct response. Why is the client retrying > in read mode? Fabrizio? > > The reason it works the second time is that the file lookup has occured. > So, the first request for a create binds correctly to a server that can > create a file. The real question is what is the client trying to do here? > > Andy > > On Fri, 17 Dec 2004, Wilko Kroeger wrote: > > >>Hello Andy >> >>I restarted the olbds with -d option and run the test again. >> >> >>>xrdcp -d2 dummy.root root://bbr-test-xfer1:2094////prod/s13/wk.root >> >>The log files are: >> >># ---- olbdlog from redirector: (first time running xrdcp) >>041217 13:06:25 010 Receive From bbr-test-xfer1.slac.stanford.edu: 0 select c /prod/s13/wk.root >>041217 13:06:25 010 do_Select Lookup delay bbr-test-xfer1.slac.stanford.edu 5 >>041217 13:06:30 010 Receive From bbr-test-xfer1.slac.stanford.edu: 1 select r /prod/s13/wk.root >>041217 13:06:30 010 do_Select No servers available to r /prod/s13/wk.root >> >># ---- olbdlog from oprserv04 (data server) >>041217 13:06:25 001 Receive From bbr-test-xfer1.slac.stanford.edu: 7@0 state /prod/s13/wk.root >> >># ---- olbdlog from oprserv07 (data server) >>041217 13:06:25 001 Receive From bbr-test-xfer1.slac.stanford.edu: 7@0 state /prod/s13/wk.root >> >> >>running the command a second time: >> >> >>>xrdcp -d2 dummy.root root://bbr-test-xfer1:2094////prod/s13/wk.root >> >># ---- olbdlog from redirector: (second time running xrdcp) >> >>041217 13:11:20 010 Receive From bbr-test-xfer1.slac.stanford.edu: 2 select c /prod/s13/wk.root >>041217 13:11:20 010 SelServer Server oprserv07.slac.stanford.edu:2094 bound to /prod/s13/wk.root >>041217 13:11:20 010 do_Select Redirect bbr-test-xfer1.slac.stanford.edu -> >> oprserv07.slac.stanford.edu:2094 for /prod/s13/wk.root >> >>There are no entries in the olbdlog of the two data servers. >> >> >> >>I also provided the entires in the xrdlog file (redirector) for the first >>attempt to run xrdcp. >> >># ---- xrdlog redirector (first time running xrdcp) >>041217 13:06:25 029 XrdInet: Accepted connection from noric13 >>041217 13:06:25 007 XrdSched: running 5@noric13 inq=0 >>041217 13:06:25 007 XrdProtocol: matched protocol xrootd >>041217 13:06:25 007 ?:15@noric13 XrdPoll: FD 15 attached to poller 0; num=1 >>041217 13:06:25 007 ?:15@noric13 XrootdProtocol: 0000 req=3007 dlen=0 >>041217 13:06:25 007 wilko.18587:15@noric13 XrootdResponse: 0000 sending OK >>041217 13:06:25 007 XrootdXeq: User logged in as wilko.18587:15@noric13 >>041217 13:06:25 007 wilko.18587:15@noric13 XrootdProtocol: 0000 req=3010 dlen=19 >>041217 13:06:25 007 wilko.18587:15@noric13 ofs_open: 100-1000000664 fn=/prod/s13/wk.root >>041217 13:06:25 007 odc_Locate: user=wilko.18587:15@noric13 wait :5 by >> bbr-test-xfer1 path=/prod/s13/wk.root >>041217 13:06:25 007 wilko.18587:15@noric13 XrootdProtocol: 0000 stalling >> client for 5 sec >>041217 13:06:30 007 wilko.18587:15@noric13 XrootdProtocol: 0000 req=3010 dlen=19 >>041217 13:06:30 007 wilko.18587:15@noric13 ofs_open: 0-664 fn=/prod/s13/wk.root >>041217 13:06:30 007 odc_Locate: user=wilko.18587:15@noric13 error: No >> servers are available to read the file.:0 by bbr-test-xfer1 path=/prod/s13/wk.root >>041217 13:06:30 007 wilko.18587:15@noric13 XrootdResponse: 0000 sending >> err 3005: No servers are available to read the file. >>041217 13:06:30 007 XrdLink: wilko.18587:15@noric13 disconnected after 0:00:05 (link read error) >>041217 13:06:30 007 wilko.18587:15@noric13 XrdPoll: FD 15 detached from poller 0; num=0 >> >> >>I hope this helps >> >>Cheers, >> Wilko >> >> >> >> >>On Fri, 17 Dec 2004, Andrew Hanushevsky wrote: >> >> >>>Hi Wilko, >>> >>>Could you rerun the test with the olbd's started with the -d option and >>>send me the log; thanks. >>> >>>Andy >>> >>>On Thu, 16 Dec 2004, Wilko Kroeger wrote: >>> >>> >>>> >>>> >>>>Hello Andy and Fabrizio >>>> >>>>I tried to copy a local file into xrootd using the latest xrootd release, >>>>20041216-0815. I am using one redirector (bbr-test-xfer1) and two data >>>>servers (oprser04, oprserv07). >>>>Running xrdcp against a data server works fine, but running it against >>>>the redirector doesn't work the way it should. >>>> >>>>Running xrdcp the first time fails with the error: >>>> >>>> 041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers >>>> are available to read the file. >>>> >>>>Running the same xrdcp command a second time the client is redirected to a >>>>data server and the file is transfered. >>>>Below is the output from the client and the relevant part from the redirector >>>>or data server (if aplicable) log files. The output is quiet verbose >>>>because in the xrootd.cf the 'trace: all' option was used for: >>>>xrootd,xrd,olb,ofs,oss >>>> >>>>Any idea ? >>>> >>>>Thanks, >>>> Wilko >>>> >>>> >>>># >>>># first attempt to copy a local file into xrootd >>>># >>>> >>>> >>>>>xrdcp -d2 dummy.root root://bbr-test-xfer1:2094////prod/s5/wk.root >>>> >>>># ------------ xrdcp output >>>> >>>>041216 22:32:20 001 Xrd: Create Access to server granted. >>>>041216 22:32:20 001 Xrd: Create Opening the remote file ///prod/s5/wk.root >>>>041216 22:32:20 001 Xrd: SendGenCommand Sending command Open >>>>041216 22:32:20 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [bbr-test-xfer1.slac.stanford.edu:2094]... >>>>041216 22:32:20 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094] >>>> did not answer OK. Resp status is [kXR_wait] >>>>041216 22:32:20 001 Xrd: SendGenCommand Server [bbr-test-xfer1.slac.stanford.edu:2094] >>>> requested 5 seconds of wait >>>>041216 22:32:25 001 Xrd: SendGenCommand Sending command Open >>>>041216 22:32:25 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [bbr-test-xfer1.slac.stanford.edu:2094]... >>>>041216 22:32:25 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094] >>>> did not answer OK. Resp status is [kXR_error] >>>>041216 22:32:25 001 Xrd: Open Server [bbr-test-xfer1.slac.stanford.edu:2094] did not >>>> return OK message for last request. >>>>041216 22:32:25 001 Xrd: SendGenCommand Server declared error 3005:No servers are available >>>> to read the file. >>>>041216 22:32:25 001 Xrd: Create Error opening the file ///prod/s5/wk.root on host >>>> bbr-test-xfer1:2094 >>>>041216 22:32:25 001 Xrd: Close File not opened. >>>>Caching info: MissRate=0 MissCount=0 ReadsCounter=0 >>>>Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0 >>>>Error accessing path/file for root://bbr-test-xfer1:2094////prod/s5/wk.root >>>> >>>># ----------------------- Redirector log file >>>> >>>>041216 22:32:20 029 XrdInet: Accepted connection from noric06 >>>>041216 22:32:20 008 XrdSched: running 6@noric06 inq=0 >>>>041216 22:32:20 008 XrdProtocol: matched protocol xrootd >>>>041216 22:32:20 008 ?:16@noric06 XrdPoll: FD 16 attached to poller 0; num=1 >>>>041216 22:32:20 008 ?:16@noric06 XrootdProtocol: 0000 req=3007 dlen=0 >>>>041216 22:32:20 008 wilko.25211:16@noric06 XrootdResponse: 0000 sending OK >>>>041216 22:32:20 008 XrootdXeq: User logged in as wilko.25211:16@noric06 >>>>041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18 >>>>041216 22:32:20 008 wilko.25211:16@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root >>>>041216 22:32:20 008 odc_Locate: user=wilko.25211:16@noric06 wait :5 by bbr-test-xfer1 >>>> path=/prod/s5/wk.root >>>>041216 22:32:20 008 wilko.25211:16@noric06 XrootdProtocol: 0000 stalling client for 5 sec >>>>041216 22:32:25 008 wilko.25211:16@noric06 XrootdProtocol: 0000 request timeout; >>>> read 0 of 24 bytes >>>>041216 22:32:25 023 XrdPoll: Poller 0 enable fd 16 >>>>041216 22:32:25 009 XrdSched: running wilko.25211:16@noric06 inq=0 >>>>041216 22:32:25 009 wilko.25211:16@noric06 XrootdProtocol: 0000 req=3010 dlen=18 >>>>041216 22:32:25 009 wilko.25211:16@noric06 ofs_open: 0-664 fn=/prod/s5/wk.root >>>>041216 22:32:25 009 odc_Locate: user=wilko.25211:16@noric06 error: No servers are available to >>>> read the file.:0 by bbr-test-xfer1 path=/prod/s5/wk.root >>>>041216 22:32:25 009 wilko.25211:16@noric06 XrootdResponse: 0000 sending err 3005: >>>> No servers are available to read the file. >>>>041216 22:32:25 009 XrdLink: wilko.25211:16@noric06 disconnected after 0:00:05 (link read error) >>>> >>>> >>>># >>>># 2nd attempt >>>># >>>> >>>> >>>>>xrdcp -d2 dummy.root root://bbr-test-xfer1:2094////prod/s5/wk.root >>>> >>>># ------------ xrdcp output >>>> >>>>041216 22:36:08 001 Xrd: Create Access to server granted. >>>>041216 22:36:08 001 Xrd: Create Opening the remote file ///prod/s5/wk.root >>>>041216 22:36:08 001 Xrd: SendGenCommand Sending command Open >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [bbr-test-xfer1.slac.stanford.edu:2094]... >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Server [bbr-test-xfer1.slac.stanford.edu:2094] >>>> did not answer OK. Resp status is [kXR_redirect] >>>>041216 22:36:08 001 Xrd: HandleServerError Redir count=1 >>>>041216 22:36:08 001 Xrd: HandleServerError Received redirection to >>>> [oprserv04.slac.stanford.edu:2094]. Token=[]. >>>>041216 22:36:08 001 Xrd: GetDomainToMatch GetHostName(oprserv04.slac.stanford.edu) returned >>>> name=oprserv04.slac.stanford.edu >>>>041216 22:36:08 001 Xrd: GetDomainToMatch GetDomain(oprserv04.slac.stanford.edu) --> >>>> slac.stanford.edu >>>>041216 22:36:08 001 Xrd: CheckHostDomain Resolved [oprserv04.slac.stanford.edu]'s domain >>>> name into [slac.stanford.edu] >>>>041216 22:36:08 001 Xrd: CheckHostDomain Access granted to the domain of >>>> [oprserv04.slac.stanford.edu]. >>>>041216 22:36:08 001 Xrd: Connect Creating a logical connection... >>>>041216 22:36:08 001 Xrd: Connect Physical connection not found. Creating a new one... >>>>041216 22:36:08 001 Xrd: Connect Connecting to [oprserv04.slac.stanford.edu:2094] >>>>041216 22:36:08 001 Xrd: ClientSock::TryConnect Trying to connect tooprserv04.slac.stanford.edu( >>>> 134.79.85.44):2094 Timeout=60 >>>>041216 22:36:08 001 Xrd: Connect Connected to [oprserv04.slac.stanford.edu:2094] >>>>041216 22:36:08 001 Xrd: Connect New physical connection to server >>>> oprserv04.slac.stanford.edu:2094 succesfully created. >>>>041216 22:36:08 001 Xrd: Connect LogConn: size:2 count: 1PhyConn: size:2 count: 2 >>>>041216 22:36:08 001 Xrd: Connect Connect(oprserv04.slac.stanford.edu, 2094) returned 1 >>>>041216 22:36:08 001 Xrd: DoHandShake HandShake step 1: Sending 20 bytes to the server >>>> [oprserv04.slac.stanford.edu:2094] >>>>041216 22:36:08 001 Xrd: DoHandShake HandShake step 2: Reading 4 bytes from server >>>> [oprserv04.slac.stanford.edu:2094]. >>>>041216 22:36:08 001 Xrd: DoHandShake HandShake step 3: Reading 12 bytes from server >>>> [oprserv04.slac.stanford.edu:2094]. >>>>041216 22:36:08 001 Xrd: StartReader Starting reader thread... >>>>041216 22:36:08 000 Xrd: SocketReaderThread Reader Thread starting. >>>>041216 22:36:08 001 Xrd: GetAccessToSrv Ok, the server on [oprserv04.slac.stanford.edu:2094] >>>> is an xrootd data server. >>>>041216 22:36:08 001 Xrd: DoLogin Logging into the server [oprserv04.slac.stanford.edu:2094]. >>>> pid=25233 uid=wilko >>>>041216 22:36:08 001 Xrd: SendGenCommand Sending command XTNetconn::doLogin >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [oprserv04.slac.stanford.edu:2094]... >>>>041216 22:36:08 001 Xrd: SendGenCommand Communication error detected with >>>> [oprserv04.slac.stanford.edu:2094 >>>>041216 22:36:08 001 Xrd: SendGenCommand Sending command Open >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [oprserv04.slac.stanford.edu:2094]... >>>>041216 22:36:08 001 Xrd: Create File opened succesfully. >>>>041216 22:36:08 001 Xrd: main dummy.root --> root://bbr-test-xfer1:2094////prod/s5/wk.root >>>>041216 22:36:08 25233 Xrd: ReaderThread_loc Reader Thread starting. >>>>041216 22:36:08 001 Xrd: SendGenCommand Sending command Write >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [oprserv04.slac.stanford.edu:2094]... >>>>041216 22:36:08 001 Xrd: SendGenCommand Sending command Close >>>>041216 22:36:08 001 Xrd: ReadPartialAnswer Reading a XrdClientMessage from the server >>>> [oprserv04.slac.stanford.edu:2094]... >>>>Caching info: MissRate=0 MissCount=0 ReadsCounter=0 >>>>Caching info: BytesUsefulness=0 BytesSubmitted=0 BytesHit=0 >>>> >>>> >>>># ------------------ Redirector bbr-test-xfer1 log file >>>> >>>> >>>>041216 22:36:08 029 XrdInet: Accepted connection from noric06 >>>>041216 22:36:08 010 XrdSched: running 7@noric06 inq=0 >>>>041216 22:36:08 010 XrdProtocol: matched protocol xrootd >>>>041216 22:36:08 010 ?:17@noric06 XrdPoll: FD 17 attached to poller 0; num=1 >>>>041216 22:36:08 010 ?:17@noric06 XrootdProtocol: 0000 req=3007 dlen=0 >>>>041216 22:36:08 010 wilko.25233:17@noric06 XrootdResponse: 0000 sending OK >>>>041216 22:36:08 010 XrootdXeq: User logged in as wilko.25233:17@noric06 >>>>041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 req=3010 dlen=18 >>>>041216 22:36:08 010 wilko.25233:17@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root >>>>041216 22:36:08 010 odc_Locate: user=wilko.25233:17@noric06 redirected to >>>> oprserv04.slac.stanford.edu:2094 by bbr-test-xfer1 path=/prod/s5/wk.root >>>>041216 22:36:08 010 wilko.25233:17@noric06 XrootdProtocol: 0000 redirecting to >>>> oprserv04.slac.stanford.edu:2094 >>>>041216 22:36:08 010 XrdLink: wilko.25233:17@noric06 disconnected after 0:00:00 (link read error) >>>> >>>> >>>># ------------------ Data Server oprserv04 log file >>>> >>>>041216 22:36:08 029 XrdInet: Accepted connection from noric06 >>>>041216 22:36:08 007 XrdSched: running 5@noric06 inq=0 >>>>041216 22:36:08 007 XrdProtocol: matched protocol xrootd >>>>041216 22:36:08 007 ?:15@noric06 XrdPoll: FD 15 attached to poller 0; num=1 >>>>041216 22:36:08 007 ?:15@noric06 XrootdProtocol: 0100 req=3007 dlen=0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK >>>>041216 22:36:08 007 XrootdXeq: User logged in as wilko.25233:15@noric06 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3010 dlen=18 >>>>041216 22:36:08 007 wilko.25233:15@noric06 ofs_open: 100-1000000664 fn=/prod/s5/wk.root >>>>041216 22:36:08 007 oss_Alloc_Cache: public cache as /u1/kanga/cache0/%u1%kanga%prod%s5%wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 oss_Open_ufs: fd=8192 flags=2 mode=1000000664 >>>> path=/u1/kanga/prod/s5/wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 ofs_stat: pi=5 fn=/prod/s5/wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 open nmfa /prod/s5/wk.root fh=0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending 4 data bytes; rc=0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3019 dlen=29 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 fh=0 write 29@0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 ofs_write: 29@0 pi=5 fn=/prod/s5/wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 req=3003 dlen=0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdFile: closing w /prod/s5/wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 ofs_close: lnks=1 pi=5 fn=/prod/s5/wk.root >>>>041216 22:36:08 007 ofs_Unopen: numfd=0 pi=5 fn=/prod/s5/wk.root >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdProtocol: 0100 close fh=0 >>>>041216 22:36:08 007 wilko.25233:15@noric06 XrootdResponse: 0100 sending OK >>>>041216 22:36:08 007 XrdLink: wilko.25233:15@noric06 disconnected after 0:00:00 (link read error) >>>> >>>> >>>