Hi Tommaso,

Do you have a core file? If so, you should be able to install on some other
machine to get a traceback.

Andy

On Mon, 24 Aug 2015, Tommaso Boccali wrote:

> ciao andrew, tomorrow I can try but the fact is that today in the end I had
> to downgrade, since it is a production server.
>
> so i have to reupgrade, take the snapshot and go back as fast as possible :(
>
> ciao ciao
>
> tom
>
> On Mon, Aug 24, 2015 at 9:47 PM, Andrew Hanushevsky <[log in to unmask]>
> wrote:
>
>> Hi Tommaso,
>>
>> Both daemons (xrootd and cmsd) will exit if you attempt to run them as
>> root. This is a security feature. You can run them as root but only after
>> specifically confirming this via command line options (i.e. you accept the
>> risks). As for the SEGV, that's clearly a bug. Is it possible to get a
>> stack trace of the thread that got the SEGV? Please make sure to install
>> the debug RPM so we can get actual line numbers.
>>
>> Andy
>>
>>
>> On Mon, 24 Aug 2015, Tommaso Boccali wrote:
>>
>>> uhm,
>>>
>>> - the strace line was just my fault, I was trying running as root on the
>>> command line
>>> - when I retried with user xrootd, I get instead the lines below, which
>>> terminate with a segv (*)
>>>
>>> so the last message is consistent with the one in the logs:
>>>
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ", 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.", 19}, {"\n", 1}], 7) = 72
>>> [pid 65395] gettid() = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ", 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1}, {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>>
>>>
>>> *:
>>> [pid 65395] <... gettid resumed> ) = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "Xrd", 3) = 3
>>> [pid 65395] write(2, "Inet", 4) = 4
>>> [pid 65395] write(2, ": ", 2) = 2
>>> [pid 65395] write(2, "Accepted connection from ", 25) = 25
>>> [pid 65395] write(2, "23", 2) = 2
>>> [pid 65395] write(2, "@", 1) = 1
>>> [pid 65395] write(2, "wngw.ifca.es", 12) = 12
>>> [pid 65395] write(2, "\n", 1) = 1
>>> [pid 65395] futex(0x6473c8, FUTEX_WAKE_PRIVATE, 1) = 0
>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 1000) = 1 ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, MSG_PEEK, NULL, NULL) = 8
>>> [pid 65395] gettid() = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "Xrd", 3) = 3
>>> [pid 65395] write(2, "Protocol", 8) = 8
>>> [pid 65395] write(2, ": ", 2) = 2
>>> [pid 65395] write(2, "matched protocol ", 17) = 17
>>> [pid 65395] write(2, "cmsd", 4) = 4
>>> [pid 65395] write(2, "\n", 1) = 1
>>> [pid 65395] epoll_ctl(12, EPOLL_CTL_ADD, 23, {0, {u32=4160757352, u64=140071634214504}}) = 0
>>> [pid 65395] gettid() = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "?:[log in to unmask]", 17) = 17
>>> [pid 65395] write(2, " ", 1) = 1
>>> [pid 65395] write(2, "Xrd", 3) = 3
>>> [pid 65395] write(2, "Poll", 4) = 4
>>> [pid 65395] write(2, ": ", 2) = 2
>>> [pid 65395] write(2, "FD ", 3) = 3
>>> [pid 65395] write(2, "23", 2) = 2
>>> [pid 65395] write(2, " attached to poller ", 20) = 20
>>> [pid 65395] write(2, "2", 1) = 1
>>> [pid 65395] write(2, "; num=", 6) = 6
>>> [pid 65395] write(2, "1", 1) = 1
>>> [pid 65395] write(2, "\n", 1) = 1
>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 5000) = 1 ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, 0, NULL, NULL) = 8
>>> [pid 65395] gettid() = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ", 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.", 19}, {"\n", 1}], 7) = 72
>>> [pid 65395] gettid() = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ", 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1}, {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>> [pid 65395] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>>> Process 65395 detached
>>> [pid 65394] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65389] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65397] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65396] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65379] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65383] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65391] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65392] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65393] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65390] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65386] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65385] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65388] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65387] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65384] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65382] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65381] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65380] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65378] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65377] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65376] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65375] +++ killed by SIGSEGV (core dumped) +++
>>> +++ killed by SIGSEGV (core dumped) +++
>>>
>>> On Mon, Aug 24, 2015 at 10:02 AM, Jan Iven <[log in to unmask]> wrote:
>>>
>>>> On 08/24/2015 09:51 AM, Tommaso Boccali wrote:
>>>>
>>>>> Ciao, I am trying to upgrade one of the CMS EU redirectors from 3.3.6 to
>>>>> 4.2.2 (with no configuration changed at first approx)
>>>>>
>>>>> The problem is that the main cmsd seems to die soon after start. No real
>>>>> message in the logs, but with strace I see a very suspect
>>>>>
>>>>> writev(2, [{"Copr. 2007 Stanford University/"..., 42}, {"\n", 1}], 2) = 43
>>>>> geteuid() = 0
>>>>> gettid() = 53572
>>>>> writev(2, [{"150824 09:47:27 53572 ", 22}, {"Config", 6}, {": ", 2}, {"Security reasons prohibit cmsd r"..., 73}, {"\n", 1}], 5) = 104
>>>>
>>>> Well, that line ought to end up either on STDERR somewhere or in some log
>>>> file. Alternatively, suggest "strace -s 1024" to get the full error
>>>> message.
>>>>
>>>> Cheers
>>>> jan
>>>
>>> --
>>> Tommaso Boccali
>>> INFN Pisa
>>>
>>> ########################################################################
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the XROOTD-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>
> --
> Tommaso Boccali
> INFN Pisa
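
Note on reading the strace output quoted in this thread: cmsd emits each log line as many small write(2, ...) or writev(2, ...) calls, so the message is hard to read in the raw trace. The following is a minimal sketch, not part of the original exchange, that stitches those fragments back into plain text. The function name `stderr_text` is hypothetical, and the sketch assumes one complete syscall per input line (mail-wrapped lines must be rejoined first). Strings that strace itself truncated with `...` stay truncated, which is why "strace -s 1024" was suggested above.

```python
import re

# One double-quoted strace string literal, allowing backslash escapes inside.
_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')

# A completed write()/writev() call on file descriptor 2 (stderr).
_STDERR_CALL = re.compile(r'\bwritev?\(2, (.*)\) = \d+')


def stderr_text(strace_lines):
    """Concatenate the payloads of write(2, ...) and writev(2, ...) calls.

    Assumes each element of strace_lines holds one whole syscall record.
    """
    chunks = []
    for line in strace_lines:
        match = _STDERR_CALL.search(line)
        if match is None:
            continue  # not a write to stderr (e.g. gettid, poll, recvfrom)
        for literal in _STRING.findall(match.group(1)):
            # Turn strace escapes such as \n or \0 back into real characters.
            raw = literal.encode("ascii", "backslashreplace")
            chunks.append(raw.decode("unicode_escape"))
    return "".join(chunks)


if __name__ == "__main__":
    sample = [
        r'[pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22',
        r'[pid 65395] write(2, "Xrd", 3) = 3',
        r'[pid 65395] write(2, "Protocol", 8) = 8',
        r'[pid 65395] write(2, ": ", 2) = 2',
        r'[pid 65395] write(2, "matched protocol ", 17) = 17',
        r'[pid 65395] write(2, "cmsd", 4) = 4',
        r'[pid 65395] write(2, "\n", 1) = 1',
    ]
    print(stderr_text(sample), end="")
```

Fed the trace saved from a run, this yields the same text the daemon wrote to stderr, which makes it easier to compare against the log file.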