OK, I will look again but all I saw were a lot of question marks in my
email. I see all of those are gone in the post (sorry about my cranky
email client). It appears that several things are going on and this
problem is much like jello. With Michal's two fixes, the message handler is
not deleted twice but the hostlist most certainly is deleted twice.
Without his fix we don't get to the point of deleting the hostlist twice
because it crashes much earlier.

As for logging, this is problematic. When we were debugging the BNL issue
we set logging to one level below debug and it still yielded log files in
the several GB range which is very difficult to deal with (though not
impossible). I fear that the debug level will blow this up by an order of
magnitude and at that point it's just as good as not having any log file
at all. Frankly, if the new logging additions provide enough flow
information then I would modify the code to just permanently enable those
and, perhaps, set the logging level to error.

Andy

On Fri, 22 Mar 2019, Matevž Tadel wrote:

> @abh3 the line numbers are there, I do have debuginfo installed
>
> @simonmichal Thanks for all the info! I didn't know those two commits were related to debugging of the BNL issue.
>
> I will rebuild with your latest fix and run in debug mode ... is the following ok or is it too much?
> ```pss.setopt DebugLevel 4```
>
> There were 10 more crashes overnight (all within 3 hours, nothing before, nothing after, with overall load the same all the time ... so it really seems some external circumstances trigger this issue).
>
> I will also look into using tcpkill + valgrind and report what I find.


