URL: <http://savannah.cern.ch/bugs/?82184>

Summary: Problem for startup using /etc/init.d and proxy services
Project: XROOTD
Submitted by: bdouglas
Submitted on: 2011-05-13 03:24
Severity: 5 - Blocker
Priority: 7 - High
Status: None
Privacy: Public
Assigned to: None
Originator Email:
Open/Closed: Open
Discussion Lock: Any
Fixed by commit(s):

_______________________________________________________

Details:

According to the xrootd manual, any component may be started in any order and will simply wait until it has the right resources to proceed. This is great because it relieves the sysadmin of having to figure out much when starting up the system or restarting some part of it. However, this seems to be in direct opposition to how init.d works. That is, while xrootd assumes a parallel start-up order, init.d assumes a serial one.

This essentially means that, depending on how you specify things in init.d, you may get into a deadlock situation. For instance, starting a proxy server before starting its manager on the same machine will hang, because the server will wait for the manager and the manager will never be started. While this would seem to be avoided by reversing the order, that is not necessarily going to solve anything, because the manager could be waiting for resources from other machines in the cluster, which delays the proxy, which in turn delays init.d. This puts the sysadmin in the unenviable position of trying to figure out the state of dozens of machines when init.d hangs, when all of this could have been avoided if init.d just started everything in parallel. There seems to be a fundamental mismatch here.

Today, Andy and I had a heck of a time debugging this problem. Initially, a typo in the config file caused the proxy xrootd server child process not to finish, and the parent process was locked. After we fixed the mistake in the config file, we restarted things and got into a deadlock condition because none of the data servers were running.
Not an uncommon occurrence after a power outage. The child proxy process was hung again, waiting for a response from the redirector. These lockups were a direct result of how /etc/init.d daemonizes things.

_______________________________________________________

Reply to this item at:

  <http://savannah.cern.ch/bugs/?82184>

_______________________________________________
  Message sent via/by LCG Savannah
  http://savannah.cern.ch/