Hello Miriam,

I am forwarding your mail to the xrootd mailing list.

Miriam's problem is that when she reads via xrootd, the request fails
and the redirector babar2 crashes.

I have restarted xrootd, but I could not find a core file, although
'ulimit -c' says unlimited, which puzzles me.
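
One possible reason for the missing core: resource limits are per process,
so the limit that matters is the one inherited by the xrootd process, which
need not match the one in my interactive shell. A minimal sketch (Python,
not part of xrootd; core_limit_description is a made-up helper name) that
prints the core-size limit seen by the process running it. To be meaningful
it would have to run in the same environment that starts xrootd.

```python
import resource


def core_limit_description(limit):
    """Turn an RLIMIT_CORE value into a readable label."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"


# Soft limit is the one enforced; hard limit is the ceiling it can be raised to.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core soft limit:", core_limit_description(soft))
print("core hard limit:", core_limit_description(hard))
```

A soft limit of 0 here would explain why no core was written, whatever the
login shell reports.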

Then I tried on babar and babar2:
KanCopyUtil -r -n 20 /store/SP/R12/000998/200301/12.6.0b/SP_000998_011885
and had no problem... (no crash)

What command are you using? Are you running in batch? What I find strange
is that I seem to be logged in from a different machine. I ssh to babar2,
start csh -l and run KanCopyUtil, and in the xrd log I see
"User logged in as schott.28977:[log in to unmask]", which seems to
be babar.fzk.de, yet env | grep HOST confirms I am on babar2.

I see in the logfile for you:

050209 19:35:20 6055 XrootdXeq: User logged in as skimprod.27917:[log in to unmask]
050209 19:35:20 6055 odc_Locate: user=skimprod.27917:[log in to unmask] wait  5 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011887.01.root
050209 19:35:25 6055 odc_Locate: user=skimprod.27917:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011887.01.root
050209 19:35:26 6055 XrdLink: skimprod.27917:[log in to unmask] disconnected after 0:00:06

and for me:

050209 20:04:05 31000 XrootdXeq: User logged in as schott.28977:[log in to unmask]
050209 20:04:05 31000 odc_Locate: user=schott.28977:[log in to unmask] wait  5 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root
050209 20:04:10 31000 odc_Locate: user=schott.28977:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root
050209 20:04:12 31000 odc_Locate: user=schott.28977:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root
050209 20:04:14 31000 odc_Locate: user=schott.28977:[log in to unmask] wait  5 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.02E.root
050209 20:04:19 31000 odc_Locate: user=schott.28977:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.02E.root
050209 20:04:23 31000 XrdLink: schott.28977:[log in to unmask] disconnected after 0:00:18

and in the xrootd log of f01-001-118.gridka.de:

050209 19:35:25 6912 XrootdXeq: User logged in as skimprod.27917:[log in to unmask]
050209 19:35:26 6912 XrdLink: skimprod.27917:[log in to unmask] disconnected after 0:00:01
050209 20:04:11 6912 XrootdXeq: User logged in as schott.28977:[log in to unmask]
050209 20:04:23 5594 XrdLink: schott.28977:[log in to unmask] disconnected after 0:00:12
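
In case it helps to compare the redirector logs more systematically, here is
a minimal sketch (Python, not part of xrootd) that pulls the "redirected to"
entries out of odc_Locate lines like the ones above. The regular expressions
are my assumption, inferred only from these few lines, and redirects_by_user
is a made-up helper name. The sample lines are copied from the log excerpt
in this mail.

```python
import re
from collections import defaultdict

# Assumed layout: "YYMMDD HH:MM:SS <thread id> <component>: <message>"
LINE_RE = re.compile(r"^(\d{6}) (\d{2}:\d{2}:\d{2}) (\d+) (\w+): (.*)$")
# Assumed layout of a redirect message from odc_Locate
REDIRECT_RE = re.compile(r"^user=(.+?) redirected to (\S+) (\d+) path=(\S+)$")


def redirects_by_user(lines):
    """Map each user to a list of (time, host, port, path) redirect entries."""
    result = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group(4) != "odc_Locate":
            continue  # not an odc_Locate line
        r = REDIRECT_RE.match(m.group(5))
        if r:  # "wait" messages do not match and are skipped
            user, host, port, path = r.groups()
            result[user].append((m.group(2), host, int(port), path))
    return dict(result)


sample = [
    "050209 20:04:05 31000 odc_Locate: user=schott.28977:[log in to unmask] wait  5 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root",
    "050209 20:04:10 31000 odc_Locate: user=schott.28977:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root",
    "050209 20:04:12 31000 odc_Locate: user=schott.28977:[log in to unmask] redirected to f01-001-118.gridka.de 1094 path=/store/SP/R12/000998/200301/12.6.0b/SP_000998_011885.01.root",
]

for user, hops in redirects_by_user(sample).items():
    for when, host, port, path in hops:
        print(user, when, host, port, path)
```

Running this over the full redirector log would show whether any request was
redirected more than once or to an unexpected host.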

-- gregory


On Wed, 9 Feb 2005, Miriam Fritsch wrote:

>
> Hi Gregory,
>
> I tried to read via xrootd again, but the xrootd process crashes 15 sec
> after the start of my job. The error messages in my logfile start with
> -------------------------------------------------------------------------
> 2005-02-09 19:42:35 10574 SysError: TUnixSystem::UnixRecv          - recv
> (Connection reset by peer)
> 2005-02-09 19:42:35 10574 SysError: TUnixSystem::DispatchOneEvent  -
> select: read error on 19
> (Bad file descriptor)
> 2005-02-09 19:43:36 10574 Err : TXSocket::RecvRaw              - Request
> timed out 60 seconds reading 4 bytes from socket -1
> (server[babar2.fzk.de:1094])
> 2005-02-09 19:43:36 10574 Err : TXNetConn::DoHandShake         - Error
> reading 4 bytes from the server [babar2.fzk.de:1094].
> 2005-02-09 19:43:36 10574 Info: TXNetConn::GetAccessToSrv      - HandShake
> failed with server [babar2.fzk.de:1094].
> 2005-02-09 19:43:36 10574 Err : TXNetFile::CreateTXNf          - Access to
> server failed
> 2005-02-09 19:43:36 10574 Err : TXConnectionMgr::Disconnect    -
> Destroying nonexistent logconnid 0.
> 2005-02-09 19:43:46 10574 SysError: TUnixSystem::UnixTcpConnect    -
> connect (babar2.fzk.de:1094) (Connection refused)
> 2005-02-09 19:43:46 10574 Err : TXPhyConnection::Connect       - can't
> open connection to xrootd/rootd on host [babar2.fzk.de:1094]
> 2005-02-09 19:43:46 10574 Err : TXNetConn::TXNetFile           - Error
> creating logical connection with [babar2.fzk.de:1094]
> 2005-02-09 19:43:46 10574 Err : TXConnectionMgr::Disconnect    -
> Destroying nonexistent logconnid -1.
> .....
> -------------------------------------------------------------------------
>
> I can read the collection with KanCollUtil; I tested that before.
>
> Cheers,
>
> Miriam
>
> *************************************************************************
>
> Dr. Miriam Fritsch
>
> Institut fuer Experimentalphysik I
> Ruhr-Universitaet Bochum, Germany               email: [log in to unmask]
> c/o SLAC                                        tel:  +1 (650) 926-3565
> 2575 Sand Hill Road #34                         fax:  +1 (650) 926-3882
> Menlo Park, CA 94025, USA                       home: +1 (650) 324-2813
>
> *************************************************************************
>
> On Wed, 9 Feb 2005, Gregory Schott wrote:
>
>>
>>> Sorry, I have to run NFS, because I couldn't test reading and writing via
>>> xrootd until now: either the collections were not available via xrootd
>>> (configuration) or xrootd is not running (today). In my opinion it is
>>> more important to run the production than to wait until the tests with
>>> xrootd are finished ...
>>>
>>> As long as I run with NFS I need the diskspace for running skimming at
>>> babar13+babar14. With xrootd later, I can (have to) use the nas boxes.
>>
>>
>> Hello Miriam,
>>
>>    If you want to write with xrootd, I'll just need to know which nas
>> boxes you want to write to and I'll set them up with write access to /prod
>> (right?). Do you also need write access to /store?
>>    Presently, xrootd is running only for babar6-12, with read access only.
>>
>>    xrootd on the redirector babar2 was crashing 3 days ago when trying to
>> use it, but it seems not to crash anymore. I don't know yet why it crashed.
>> Can you use it now for reading?
>>
>> -- Gregory
>>
>

-------------- Dr. Gregory Schott --------------
  Institut fuer Experimentelle Kernphysik (IEKP)
      Universitaet Karlsruhe - Postfach 3640
            76021 Karlsruhe  (Germany)
             tel.: +49-(0)724782-3537
             fax.: +49-(0)724782-3414
            e-mail: [log in to unmask]
-----------------------------------------------