Hi Tommaso,

Do you have a core file? If so, you should be able to install the same
version (with the debug RPM) on some other machine and get a traceback
from the core there.
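
A minimal sketch of that step, assuming gdb and the matching debug symbols
are installed on the other machine. The binary and core paths below are
hypothetical placeholders, not taken from this thread:

```python
import shlex


def gdb_backtrace_command(binary, core):
    """Build a gdb batch invocation that prints a backtrace for every thread."""
    return [
        "gdb",
        "-batch",                      # run the commands, then exit
        "-ex", "thread apply all bt",  # backtrace of all threads in the core
        binary,
        core,
    ]


# Hypothetical paths: substitute the real cmsd binary and core file.
cmd = gdb_backtrace_command("/usr/bin/cmsd", "/tmp/core.65395")

# Print a copy-pasteable shell command; it could also be passed to
# subprocess.run(cmd) on a machine where gdb is available.
print(shlex.join(cmd))
```

The sketch only builds and prints the command, so it can be checked before
being run against the real core file.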

Andy

On Mon, 24 Aug 2015, Tommaso Boccali wrote:

> ciao andrew, tomorrow I can try but the fact is that today in the end I had
> to downgrade, since it is a production server.
>
> so i have to reupgrade, take the snapshot and go back as fast as possible :(
>
> ciao ciao
>
> tom
>
> On Mon, Aug 24, 2015 at 9:47 PM, Andrew Hanushevsky <[log in to unmask]>
> wrote:
>
>> Hi Tommaso,
>>
>> Both daemons (xrootd and cmsd) will exit if you attempt to run them as
>> root. This is a security feature. You can run them as root but only after
>> specifically confirming this via command line options (i.e. you accept the
>> risks). As for the SEGV, that's clearly a bug. Is it possible to get a
>> stack trace of the thread that got the SEGV? Please make sure to install
>> the debug RPM so we can get actual line numbers.
>>
>> Andy
>>
>>
>> On Mon, 24 Aug 2015, Tommaso Boccali wrote:
>>
>>> uhm,
>>>
>>> - the strace line was just my fault, I was trying to run as root on the
>>> command line
>>> - when I retried with user xrootd, I get instead the lines below, which
>>> terminate with a segv (*)
>>>
>>> so the last message is consistent with the one in the logs:
>>>
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ",
>>> 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.",
>>> 19}, {"\n", 1}], 7) = 72
>>> [pid 65395] gettid()                    = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ",
>>> 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1},
>>> {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>>
>>>
>>> *:
>>> [pid 65395] <... gettid resumed> )      = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "Xrd", 3)          = 3
>>> [pid 65395] write(2, "Inet", 4)         = 4
>>> [pid 65395] write(2, ": ", 2)           = 2
>>> [pid 65395] write(2, "Accepted connection from ", 25) = 25
>>> [pid 65395] write(2, "23", 2)           = 2
>>> [pid 65395] write(2, "@", 1)            = 1
>>> [pid 65395] write(2, "wngw.ifca.es", 12) = 12
>>> [pid 65395] write(2, "\n", 1)           = 1
>>> [pid 65395] futex(0x6473c8, FUTEX_WAKE_PRIVATE, 1) = 0
>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 1000) = 1
>>> ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, MSG_PEEK, NULL, NULL) = 8
>>> [pid 65395] gettid()                    = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "Xrd", 3)          = 3
>>> [pid 65395] write(2, "Protocol", 8)     = 8
>>> [pid 65395] write(2, ": ", 2)           = 2
>>> [pid 65395] write(2, "matched protocol ", 17) = 17
>>> [pid 65395] write(2, "cmsd", 4)         = 4
>>> [pid 65395] write(2, "\n", 1)           = 1
>>> [pid 65395] epoll_ctl(12, EPOLL_CTL_ADD, 23, {0, {u32=4160757352,
>>> u64=140071634214504}}) = 0
>>> [pid 65395] gettid()                    = 65395
>>> [pid 65395] write(2, "150824 10:09:53 65395 ", 22) = 22
>>> [pid 65395] write(2, "?:[log in to unmask]", 17) = 17
>>> [pid 65395] write(2, " ", 1)            = 1
>>> [pid 65395] write(2, "Xrd", 3)          = 3
>>> [pid 65395] write(2, "Poll", 4)         = 4
>>> [pid 65395] write(2, ": ", 2)           = 2
>>> [pid 65395] write(2, "FD ", 3)          = 3
>>> [pid 65395] write(2, "23", 2)           = 2
>>> [pid 65395] write(2, " attached to poller ", 20) = 20
>>> [pid 65395] write(2, "2", 1)            = 1
>>> [pid 65395] write(2, "; num=", 6)       = 6
>>> [pid 65395] write(2, "1", 1)            = 1
>>> [pid 65395] write(2, "\n", 1)           = 1
>>> [pid 65395] poll([{fd=23, events=POLLIN|POLLRDNORM}], 1, 5000) = 1
>>> ([{fd=23, revents=POLLIN|POLLRDNORM}])
>>> [pid 65395] recvfrom(23, "\0\0\0\0\0\0\0\0", 8, 0, NULL, NULL) = 8
>>> [pid 65395] gettid()                    = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Pup", 3}, {": ",
>>> 2}, {"buffer overrun unpacking", 24}, {" ", 1}, {"short arg 0: ident.",
>>> 19}, {"\n", 1}], 7) = 72
>>> [pid 65395] gettid()                    = 65395
>>> [pid 65395] writev(2, [{"150824 10:09:53 65395 ", 22}, {"Login", 5}, {": ",
>>> 2}, {"wngw.ifca.es", 12}, {" ", 1}, {"login failed;", 13}, {" ", 1},
>>> {"invalid login data", 18}, {"\n", 1}], 9) = 75
>>> [pid 65395] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>>> Process 65395 detached
>>> [pid 65394] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65389] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65397] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65396] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65379] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65383] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65391] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65392] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65393] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65390] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65386] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65385] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65388] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65387] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65384] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65382] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65381] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65380] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65378] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65377] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65376] +++ killed by SIGSEGV (core dumped) +++
>>> [pid 65375] +++ killed by SIGSEGV (core dumped) +++
>>> +++ killed by SIGSEGV (core dumped) +++
>>>
>>> On Mon, Aug 24, 2015 at 10:02 AM, Jan Iven <[log in to unmask]> wrote:
>>>
>>>> On 08/24/2015 09:51 AM, Tommaso Boccali wrote:
>>>>
>>>>> Ciao, I am trying to upgrade one of the CMS EU redirectors from 3.3.6 to
>>>>> 4.2.2 (with no configuration changed at first approx)
>>>>>
>>>>> The problem is that the main cmsd seems to die soon after start. No real
>>>>> message in the logs, but with strace I see a very suspicious line:
>>>>>
>>>>> writev(2, [{"Copr.  2007 Stanford University/"..., 42}, {"\n", 1}], 2) =
>>>>> 43
>>>>> geteuid()                               = 0
>>>>> gettid()                                = 53572
>>>>> writev(2, [{"150824 09:47:27 53572 ", 22}, {"Config", 6}, {": ", 2},
>>>>> {"Security reasons prohibit cmsd r"..., 73}, {"\n", 1}], 5) = 104
>>>>>
>>>>>
>>>> Well, that line ought to end up either on STDERR somewhere or in some log
>>>> file. Alternatively, I suggest "strace -s 1024" to get the full error
>>>> message.
>>>>
>>>> Cheers
>>>> jan
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Tommaso Boccali
>>> INFN Pisa
>>>
>>> ########################################################################
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the XROOTD-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>>
>>>
>
>
> -- 
> Tommaso Boccali
> INFN Pisa