Hi all,

I am seeing very strange behavior in our installation. Let me describe it:
We had some problems on one of the Cisco switch boards to which our 
redirector node is also connected. During the network outage, the 
following lines appeared in the olb log file.
Example for one node:

060128 19:09:30 20424 olb_GetLine: Unable to read request; no route to host
060128 19:09:30 20424 olb_Manager: rcas6150:1095 scheduled for removal; not responding

The node "rcas6150" never re-established its connection to the olbd 
server on the redirector, although the olbd process is still running on 
that node. When someone then requested a file stored on that node, they 
were not redirected to it, even though the file is there.
If the olbd process is restarted on that node, everything is in order 
again: the user is redirected to that node and the file is opened.
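
To see which nodes were dropped this way (and so need an olbd restart), 
the olb log can be scanned for the "scheduled for removal" lines. Below is 
a minimal Python sketch, assuming the log format matches the two lines 
quoted above. The function name and the regular expression are my own 
illustration, not part of olbd.

```python
import re

# Assumed line format, taken from the log excerpt in this post:
# 060128 19:09:30 20424 olb_Manager: rcas6150:1095 scheduled for removal; ...
REMOVAL_RE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{2}:\d{2}:\d{2}) \d+ olb_Manager: "
    r"(?P<host>[^\s:]+):(?P<port>\d+) scheduled for removal"
)


def removed_nodes(lines):
    """Return (date, time, host, port) for every removal line found."""
    found = []
    for line in lines:
        m = REMOVAL_RE.match(line.strip())
        if m:
            found.append((m["date"], m["time"], m["host"], int(m["port"])))
    return found


# The two log lines from this post, used as sample input.
sample = [
    "060128 19:09:30 20424 olb_GetLine: Unable to read request; no route to host",
    "060128 19:09:30 20424 olb_Manager: rcas6150:1095 scheduled for removal; not responding",
]

for date, time, host, port in removed_nodes(sample):
    print(f"{date} {time}: {host}:{port} was scheduled for removal")
```

On a real system the list of lines would come from the olb log file 
itself, e.g. by passing an open file object to removed_nodes().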

You can reproduce this behavior by reporting invalid values (i.e. 
values greater than 100) for load, io, etc. to the redirector node. The 
node is then scheduled for removal ....

Thanks for any advice.
Let me know if you need more information.
Pavel