From EOS bugtracker:

Date: 2013-08-08 21:08 By: Jan Iven

Downtime (neither xrootd nor SRM work), starts roughly with

130808 18:03:16 26269 XrdLink: attempt to reuse active link
130808 18:03:16 26269 XrdAccept: Unable to allocate new link for lxplus0320.cern.ch; cannot allocate memory

These two messages become increasingly common and start to include machines
used by the service (e.g. the gridftp doors), until nearly nothing else is
logged. I suspect that "26269" is the main thread that accepts connections.

This must be something thread-local: the machine has more than enough memory
and is not (and was not) swapping. There are no OOM messages either.

[root@lxbrf39c02 ~]# ps axvw | grep mgm
26269 ?        Sl   36190:13    0   177 52738946 37919324 14.3 /usr//bin/xrootd -n mgm -c /etc/xrd.cf.mgm -m -l /var/log/eos/xrdlog.mgm -b -Rdaemon
[root@lxbrf39c02 ~]# free
             total       used       free     shared    buffers     cached
Mem:     264254700  224848388   39406312          0    1599512  178560424
-/+ buffers/cache:   44688452  219566248
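
The figures above can be cross-checked with a little arithmetic. The following
is only a sketch, assuming the default KiB units of free and ps; the struct and
function names are made up for illustration and come from no tool or library.

```cpp
#include <cstdint>

// Values copied from the "Mem:" line of the free output above (KiB).
// All names here are hypothetical, chosen for this illustration only.
struct MemLine {
    std::uint64_t totalKiB;
    std::uint64_t usedKiB;
    std::uint64_t freeKiB;
    std::uint64_t buffersKiB;
    std::uint64_t cachedKiB;
};

// Memory held by processes once buffers and page cache are discounted.
constexpr std::uint64_t usedByApps(const MemLine& m) {
    return m.usedKiB - m.buffersKiB - m.cachedKiB;
}

// Memory that could be handed to processes, counting reclaimable cache.
constexpr std::uint64_t availableToApps(const MemLine& m) {
    return m.freeKiB + m.buffersKiB + m.cachedKiB;
}

// Share of total memory taken by a resident set size, in parts per thousand.
constexpr std::uint64_t rssPerMille(std::uint64_t rssKiB, const MemLine& m) {
    return rssKiB * 1000 / m.totalKiB;
}

constexpr MemLine kReported{264254700, 224848388, 39406312, 1599512, 178560424};

// RSS column of the ps line for the mgm process (KiB).
constexpr std::uint64_t kMgmRssKiB = 37919324;

// These reproduce the "-/+ buffers/cache" line printed by free.
static_assert(usedByApps(kReported) == 44688452, "matches free output");
static_assert(availableToApps(kReported) == 219566248, "matches free output");
```

With roughly 219 GB still available and the mgm process at about 14% of total
memory (consistent with the 14.3 shown by ps), a genuine out-of-memory
condition looks unlikely.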

"fixed" by MGM restart

eos-server-0.2.38-1
xrootd-server-3.2.8-1.slc5

Date: 2013-08-09 10:58 By: Lukasz Janyst

Surprisingly enough this has nothing to do with memory allocation.

XrdLink::Alloc() returns the same error code in two different cases: when the
link is already allocated and in use, and when it could not be allocated for
lack of memory.

Then, the code calling XrdLink::Alloc has no way of distinguishing between
the two cases and assumes a memory problem - thus the error message you see.
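
To illustrate the point, here is a minimal sketch of how an allocator could
report the two failures separately so the caller can log the right message.
It is not the xrootd code: LinkTable, AllocStatus, Link and describe() are
hypothetical names invented for this example, and the real XrdLink::Alloc()
signature is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical status codes: one value per distinct failure cause.
enum class AllocStatus { Ok, SlotInUse, OutOfMemory, BadDescriptor };

// Hypothetical stand-in for a link object, indexed by file descriptor.
struct Link {
    bool inUse = false;
    int fd = -1;
};

class LinkTable {
public:
    explicit LinkTable(std::size_t slots) : table_(slots, nullptr) {}
    ~LinkTable() { for (Link* l : table_) delete l; }
    LinkTable(const LinkTable&) = delete;
    LinkTable& operator=(const LinkTable&) = delete;

    // Claim the slot for fd and say *why* the claim failed, if it did.
    AllocStatus alloc(int fd) {
        if (fd < 0 || static_cast<std::size_t>(fd) >= table_.size())
            return AllocStatus::BadDescriptor;
        Link*& slot = table_[static_cast<std::size_t>(fd)];
        if (slot != nullptr && slot->inUse)
            return AllocStatus::SlotInUse;       // link is still active
        if (slot == nullptr) {
            slot = new (std::nothrow) Link;      // nullptr on failure
            if (slot == nullptr)
                return AllocStatus::OutOfMemory; // a real memory problem
        }
        slot->inUse = true;
        slot->fd = fd;
        return AllocStatus::Ok;
    }

    // Mark the slot as free again; the Link object is kept for reuse.
    void release(int fd) {
        if (fd < 0 || static_cast<std::size_t>(fd) >= table_.size()) return;
        Link* slot = table_[static_cast<std::size_t>(fd)];
        if (slot != nullptr) slot->inUse = false;
    }

private:
    std::vector<Link*> table_;
};

// Map each status to a message, so the log no longer blames memory
// when the slot was simply still in use.
inline const char* describe(AllocStatus s) {
    switch (s) {
        case AllocStatus::Ok:            return "ok";
        case AllocStatus::SlotInUse:     return "attempt to reuse active link";
        case AllocStatus::OutOfMemory:   return "cannot allocate memory";
        case AllocStatus::BadDescriptor: return "descriptor out of range";
    }
    return "unknown";
}
```

With something along these lines, the accepting code could print
describe(status) instead of assuming a memory problem whenever the
allocation fails.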

This should be reported in the xrootd bug tracker.

