Hi all,

Andrew Hanushevsky wrote:
> Hi Pete,
> 
> Well, we finally hit a scaling problem with bbrskim relative to the log
> file. There is so much traffic (200 messages/second) that
> 
> a) The log file becomes unusably big, and
> b) When tons of messages queue the xrootd response time may exceed the
> client timeout window.
> 
> What to do....
> 
> For (a) I propose shortening the disconnect message to simply be
> 
> <id> disc <hh:mm:ss> [(<reason>)]
> 
> Additionally, I propose putting in a new directive to allow one to
> suppress the login message.
> 
> For (b) there is no real good solution, but here are two ideas:
> 
> 1) Remove the message synchronization call. Synchronization forces a
> message to be written to the log file before the next message can be
> written. This adds a lot of delay and can actually hang the server
> if the disk fills up. Without the call, however, you will lose
> messages if the server crashes.
> 

In my past experience this is not a good idea. If your program
deadlocks or goes crazy, the log file becomes useless. I know you might
still have the core file, however.

Perhaps you can specify a less verbose log level, no?

Fabrizio

> 2) (alternative) Double buffer the messages in a big buffer (say 16k); when
> it fills up, substitute a free buffer and blast the messages to the log
> file with synchronization. That means one will have to hunt for the
> messages in storage after a crash but at least none will be lost. On the
> other hand, it also means there may be a significant delay before you
> actually see messages in the log file making "tail -f" very cumbersome to
> use.
> 
> What's your poison?
> 
> Andy
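
For reference, here is a rough sketch of what option 2 could look like. This is not xrootd code: the class name DoubleBufferedLog, its methods, and the single-threaded flush are all assumptions made for illustration (a real server would presumably hand the full buffer to a separate writer thread). The 16k size is the figure Andy suggested.

```python
import os


class DoubleBufferedLog:
    """Hypothetical sketch: batch log messages in memory, sync once per batch."""

    def __init__(self, path, capacity=16 * 1024):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self._capacity = capacity
        self._active = bytearray()  # buffer currently receiving messages
        self._spare = bytearray()   # free buffer, swapped in when active fills

    def write(self, message):
        data = message.encode("utf-8") + b"\n"
        # If this message would overflow the active buffer, flush it first.
        if self._active and len(self._active) + len(data) > self._capacity:
            self.flush()
        self._active += data

    def flush(self):
        if not self._active:
            return
        # Substitute the free buffer, then write out the full one.
        full, self._active = self._active, self._spare
        data = bytes(full)
        offset = 0
        while offset < len(data):  # os.write may write fewer bytes than asked
            offset += os.write(self._fd, data[offset:])
        os.fsync(self._fd)  # one synchronization per batch, not per message
        full.clear()
        self._spare = full

    def close(self):
        self.flush()
        os.close(self._fd)
```

Note that a message sits in memory until its buffer fills or flush() is called, which is exactly the "tail -f" delay mentioned above, and anything still in the active buffer at crash time would have to be recovered from the core file.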