in case it is still needed, this is what IFCA is using:

"
Hi Tom,

All xrootd pools have the same package versions:

[root@pool01 ~]# rpm -qa | grep xrootd
xrootd-cmstfc-1.5.1-6.osg.el6.x86_64
xrootd-client-3.3.6-1.1.osg31.el6.x86_64
cms-xrootd-1.2-7.osg.el6.noarch
xrootd-client-libs-3.3.6-1.1.osg31.el6.x86_64
xrootd-lcmaps-0.0.7-5.osg.el6.x86_64
xrootd-libs-3.3.6-1.1.osg31.el6.x86_64
xrootd-3.3.6-1.1.osg31.el6.x86_64
xrootd-server-libs-3.3.6-1.1.osg31.el6.x86_64

Cheers,
I
"

tom

On Mon, Aug 24, 2015 at 10:14 PM, Andrew Hanushevsky <[log in to unmask]> wrote:

> Hi Tommaso,
>
> OK, this is now not critical as I seem to have found the problem. However,
> a backtrace would still be good to have as assurance. Yes, it is a bug when
> invalid login data is encountered. Now we have to find out why this
> actually happened.
>
> What is wngw.ifca.es actually running (i.e. what cmsd version)?
>
> Andy
>
> On Mon, 24 Aug 2015, Tommaso Boccali wrote:
>
>> ciao andrew, tomorrow I can try but the fact is that today in the end I had
>> to downgrade, since it is a production server.
>>
>> so i have to reupgrade, take the snapshot and go back as fast as possible
>> :(
>>
>> ciao ciao
>>
>> tom
>>
>> On Mon, Aug 24, 2015 at 9:47 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
>>
>>> Hi Tommaso,
>>>
>>> Both daemons (xrootd and cmsd) will exit if you attempt to run them as
>>> root. This is a security feature. You can run them as root but only after
>>> specifically confirming this via command line options (i.e. you accept the
>>> risks). As for the SEGV, that's clearly a bug. Is it possible to get a
>>> stack trace of the thread that got the SEGV? Please make sure to install
>>> the debug RPM so we can get actual line numbers.
>>>
>>> Andy
>>>
>>> On Mon, 24 Aug 2015, Tommaso Boccali wrote:
>>>
>>>> uhm,
>>>>
>>>> - the strace line was just my fault, I was trying to run as root on the
>>>>   command line
>>>> - when I retried with user xrootd, I get instead the lines below, which
>>>>   terminate with a segv (*)
>>>>
>>>> so the last message is consistent with the one in the logs:
>>>>
>>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ", 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.", 19}, {"\n", 1}], 7) = 72
>>>> [pid 65395] gettid() = 65395
>>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ", 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1}, {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>>>
>>>> *:
>>>> [pid 65395] <... gettid resumed> ) = 65395
>>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>>> [pid 65395] write(2, "Xrd", 3) = 3
>>>> [pid 65395] write(2, "Inet", 4) = 4
>>>> [pid 65395] write(2, ": ", 2) = 2
>>>> [pid 65395] write(2, "Accepted connection from ", 25) = 25
>>>> [pid 65395] write(2, "23", 2) = 2
>>>> [pid 65395] write(2, "@", 1) = 1
>>>> [pid 65395] write(2, "wngw.ifca.es", 12) = 12
>>>> [pid 65395] write(2, "\n", 1) = 1
>>>> [pid 65395] futex(0x6473c8, FUTEX_WAKE_PRIVATE, 1) = 0
>>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 1000) = 1 ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, MSG_PEEK, NULL, NULL) = 8
>>>> [pid 65395] gettid() = 65395
>>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>>> [pid 65395] write(2, "Xrd", 3) = 3
>>>> [pid 65395] write(2, "Protocol", 8) = 8
>>>> [pid 65395] write(2, ": ", 2) = 2
>>>> [pid 65395] write(2, "matched protocol ", 17) = 17
>>>> [pid 65395] write(2, "cmsd", 4) = 4
>>>> [pid 65395] write(2, "\n", 1) = 1
>>>> [pid 65395] epoll_ctl(12, EPOLL_CTL_ADD, 23, {0, {u32=4160757352, u64=140071634214504}}) = 0
>>>> [pid 65395] gettid() = 65395
>>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>>> [pid 65395] write(2, "?:[log in to unmask]", 17) = 17
>>>> [pid 65395] write(2, " ", 1) = 1
>>>> [pid 65395] write(2, "Xrd", 3) = 3
>>>> [pid 65395] write(2, "Poll", 4) = 4
>>>> [pid 65395] write(2, ": ", 2) = 2
>>>> [pid 65395] write(2, "FD ", 3) = 3
>>>> [pid 65395] write(2, "23", 2) = 2
>>>> [pid 65395] write(2, " attached to poller ", 20) = 20
>>>> [pid 65395] write(2, "2", 1) = 1
>>>> [pid 65395] write(2, "; num=", 6) = 6
>>>> [pid 65395] write(2, "1", 1) = 1
>>>> [pid 65395] write(2, "\n", 1) = 1
>>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 5000) = 1 ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, 0, NULL, NULL) = 8
>>>> [pid 65395] gettid() = 65395
>>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ", 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.", 19}, {"\n", 1}], 7) = 72
>>>> [pid 65395] gettid() = 65395
>>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ", 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1}, {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>>> [pid 65395] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>>>> Process 65395 detached
>>>> [pid 65394] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65389] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65397] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65396] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65379] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65383] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65391] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65392] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65393] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65390] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65386] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65385] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65388] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65387] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65384] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65382] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65381] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65380] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65378] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65377] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65376] +++ killed by SIGSEGV (core dumped) +++
>>>> [pid 65375] +++ killed by SIGSEGV (core dumped) +++
>>>> +++ killed by SIGSEGV (core dumped) +++
>>>>
>>>> On Mon, Aug 24, 2015 at 10:02 AM, Jan Iven <[log in to unmask]> wrote:
>>>>
>>>>> On 08/24/2015 09:51 AM, Tommaso Boccali wrote:
>>>>>
>>>>>> Ciao, I am trying to upgrade one of the CMS EU redirectors from 3.3.6 to
>>>>>> 4.2.2 (with no configuration changed at first approx)
>>>>>>
>>>>>> The problem is that the main cmsd seems to die soon after start. No real
>>>>>> message in the logs, but with strace I see a very suspect
>>>>>>
>>>>>> writev(2, [{"Copr. 2007 Stanford University/"..., 42}, {"\n", 1}], 2) = 43
>>>>>> geteuid() = 0
>>>>>> gettid() = 53572
>>>>>> writev(2, [{"150824 09:47:27 53572 ", 22}, {"Config", 6}, {": ", 2}, {"Security reasons prohibit cmsd r"..., 73}, {"\n", 1}], 5) = 104
>>>>>>
>>>>> Well, that line ought to end up either on STDERR somewhere or in some log
>>>>> file. Alternatively, suggest "strace -s 1024" to get the full error
>>>>> message.
>>>>>
>>>>> Cheers
>>>>> jan
>>>>>
>>>> --
>>>> Tommaso Boccali
>>>> INFN Pisa
>>>>
>>>> ########################################################################
>>>> Use REPLY-ALL to reply to list
>>>>
>>>> To unsubscribe from the XROOTD-L list, click the following link:
>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>>>
>>
>> --
>> Tommaso Boccali
>> INFN Pisa
>>

--
Tommaso Boccali
INFN Pisa
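
For anyone repeating this kind of diagnosis, the manual reading of the trace in this thread (find the thread that took the SIGSEGV, then read what it last wrote to stderr) can be sketched in Python. This is only an illustration, not part of xrootd or strace: the names `segv_pid`, `stderr_text` and `SAMPLE` are made up here, and the sketch assumes raw, unwrapped `strace -f` output with one syscall per line, i.e. the trace as it looked before the mail client re-wrapped it.

```python
import re

# Patterns for the strace -f line shapes seen in this thread (an assumption:
# other strace versions or options may format lines differently).
SEGV_RE = re.compile(r'^\[pid\s+(\d+)\] --- SIGSEGV')
CALL_RE = re.compile(r'^\[pid\s+(\d+)\] (writev?)\(2, (.*)\) = \d+$')
STR_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def segv_pid(lines):
    """Return the pid of the first thread that received SIGSEGV, or None."""
    for line in lines:
        m = SEGV_RE.match(line.strip())
        if m:
            return int(m.group(1))
    return None


def stderr_text(lines, pid):
    """Join the string arguments of write/writev calls on fd 2 made by pid."""
    parts = []
    for line in lines:
        m = CALL_RE.match(line.strip())
        if m and int(m.group(1)) == pid:
            for raw in STR_RE.findall(m.group(3)):
                # strace prints C-style escapes such as \n; turn them back
                # into the characters that were actually written.
                parts.append(raw.encode().decode('unicode_escape'))
    return ''.join(parts)


# A few lines taken from the trace in this thread, unwrapped.
SAMPLE = r"""
[pid 65395] write(2, "Xrd", 3) = 3
[pid 65395] write(2, "Protocol", 8) = 8
[pid 65395] write(2, ": ", 2) = 2
[pid 65395] write(2, "matched protocol ", 17) = 17
[pid 65395] write(2, "cmsd", 4) = 4
[pid 65395] write(2, "\n", 1) = 1
[pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ", 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1}, {"invalid login data", 18}, {"\n", 1}], 9) = 75
[pid 65395] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
[pid 65394] +++ killed by SIGSEGV (core dumped) +++
"""

if __name__ == "__main__":
    trace = SAMPLE.splitlines()
    pid = segv_pid(trace)
    print("thread that got the SEGV:", pid)
    print(stderr_text(trace, pid), end="")
```

Only the `--- SIGSEGV` line identifies the faulting thread; the `+++ killed by SIGSEGV` lines are the other threads being torn down with the process, so the sketch ignores them. Strings that strace truncated (shown with a trailing `...`) stay truncated, which is why running with a larger `-s` value, as suggested in the thread, matters.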