Hello Miriam,

OK. This was at the time one of the servers was restarted (it went offline
for just a second or two). Andreas thought that in this case the processes
currently reading would reconnect to the redirector for re-assignment of
a dataserver. Apparently they crash instead.

I am forwarding this to the xrootd experts to ask for their opinion. We
are using the latest (July) production version, and the config files look
like this:

$ cat config/redirector.cf
olb.allow host babar2.gridka.de
olb.allow host f01-014-108.gridka.de
olb.allow host f01-016-102.gridka.de
olb.allow host f01-016-101.gridka.de
olb.allow host f01-014-106.gridka.de
olb.allow host f01-016-108.gridka.de
olb.allow host f01-016-109.gridka.de
olb.allow host f01-016-106.gridka.de
olb.allow host f01-016-107.gridka.de
olb.allow host f01-014-103.gridka.de
olb.allow host f01-014-107.gridka.de
olb.allow host f01-005-151.gridka.de
olb.allow host f01-010-110.gridka.de
olb.allow host f01-005-115.gridka.de
olb.allow host f01-010-107.gridka.de
olb.allow host l01-001-122.gridka.de
olb.port 3121

odc.manager l01-001-122.gridka.de 3121

xrootd.fslib /home/xrootd/software/current/lib/libXrdOfs.so
xrootd.export /prod
xrootd.export /store

odc.trace redirect
---
$ cat config/dataserver.cfg
odc.manager l01-001-122.gridka.de 3121

olb.allow host babar2.gridka.de
olb.allow host f01-014-108.gridka.de
olb.allow host f01-016-102.gridka.de
olb.allow host f01-016-101.gridka.de
olb.allow host f01-014-106.gridka.de
olb.allow host f01-016-108.gridka.de
olb.allow host f01-016-109.gridka.de
olb.allow host f01-016-106.gridka.de
olb.allow host f01-016-107.gridka.de
olb.allow host f01-014-103.gridka.de
olb.allow host f01-014-107.gridka.de
olb.allow host f01-005-151.gridka.de
olb.allow host 10.65.10.110
olb.allow host f01-010-110.gridka.de
olb.allow host 10.65.5.115
olb.allow host f01-005-115.gridka.de
olb.allow host f01-010-107.gridka.de
olb.allow host l01-001-122.gridka.de

olb.path r /store
olb.path w /prod
olb.port 3121
olb.sched cpu 100
olb.subscribe l01-001-122.gridka.de
olb.wait

ofs.redirect remote if l01-001-122.gridka.de
ofs.redirect target

oss.alloc * * 80
oss.fdlimit * max
oss.localroot /home/xrootd/disk/kanga-export/EventStore/

xrd.protocol xrootd *
xrootd.async off
xrootd.export /prod
xrootd.export /store
xrootd.fslib /home/xrootd/software/current/lib/libXrdOfs.so
xrootd.chksum crc32 /home/xrootd/bin/getCRC32.sh

odc.trace redirect
---
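
In case it helps whoever looks at this: the olb.allow lists in the two
files above are not identical (the dataserver file also lists two hosts by
IP address). I do not know whether that matters for this problem. A quick
way to compare the lists could look like the sketch below. It is only a
rough Python illustration; the function names are made up here, and it
assumes every entry has the plain "olb.allow host <name>" form shown above.

```python
import sys


def allowed_hosts(config_text):
    """Return the set of hosts named in 'olb.allow host <name>' lines."""
    hosts = set()
    for line in config_text.splitlines():
        parts = line.split()
        # Only the three-token form seen in the configs above is handled.
        if len(parts) == 3 and parts[0] == "olb.allow" and parts[1] == "host":
            hosts.add(parts[2])
    return hosts


def allow_list_diff(redirector_text, dataserver_text):
    """Return (only_in_redirector, only_in_dataserver) as sorted lists."""
    red = allowed_hosts(redirector_text)
    data = allowed_hosts(dataserver_text)
    return sorted(red - data), sorted(data - red)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python allowdiff.py redirector.cf dataserver.cfg
    with open(sys.argv[1]) as f_red, open(sys.argv[2]) as f_data:
        only_red, only_data = allow_list_diff(f_red.read(), f_data.read())
    print("only in redirector config:", only_red)
    print("only in dataserver config:", only_data)
```

Run against the two files, it prints the entries that appear in one allow
list but not in the other.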

Did anything also happen at 18:33 or 18:45, when the redirector got reset?
In principle nothing should have happened from your point of view.

Cheers,

-- Gregory



On Tue, 11 Oct 2005, Miriam Fritsch wrote:

>
> Hi Gregory,
>
> some jobs crash with the following error message:
>
> ---------------------------------------------------------------------------
> 18:21:37.524 EvtCounter: processing event # 12085 [
> 1d:ffffffff:04ee72/3f73bb1d:V ]
> 2005-10-11 18:21:37 19228 Err : TXMessage::ReadRaw             - Error
> reading 8 bytes
> 2005-10-11 18:21:37 19228 Err : ReadPartialAnswer              - Error
> reading msg from connmgr (server [f01-010-107.gridka.de:1094]).
> 18:21:44.575 EvtCounter: processing event # 12086 [
> 1d:ffffffff:04ee72/3f73be86:J ]
> 2005-10-11 18:21:44 19228 Err : TXNetFile::ReadBuffer          - Server
> [f01-010-107.gridka.de:1094] did not return OK message for last request.
> 2005-10-11 18:21:44 19228 Err : SendGenCommand                 - Server
> declared error 3004: 'read does not refer to an open file'
> -- JOB DONE --------------------------------------------------------------
>
> Cheers,
>
> Miriam
>
>
> *************************************************************************
>
> Dr. Miriam Fritsch
>
> Institut fuer Experimentalphysik I
> Ruhr-Universitaet Bochum, Germany               email: [log in to unmask]
> c/o SLAC                                        tel:  +1 (650) 926-3565
> 2575 Sand Hill Road #34                         fax:  +1 (650) 926-3882
> Menlo Park, CA 94025, USA                       home: +1 (650) 324-2813
>
> *************************************************************************
>
>