Folks,

The etc/XrdOlbMonPerf script in the xrootd distribution is generating 
lots of orphan netstat -c -i 60 processes on our RHEL 3 system.

I noticed that the pipe opened to read from netstat isn't explicitly 
closed, but presumably it is closed when XrdOlbMonPerf exits.  At the 
end of the while loop (~line 207) it would be cleaner to add
   close(CMDFD);
but that is not essential, since it doesn't fix the problem I have.

It turns out that netstat doesn't die when the pipe it is writing to is 
closed. It apparently keeps going for up to 15-16 more iterations, 
which means that each netstat hangs around for an extra 15 minutes if 
you have a 60 second delay between iterations. It should exit as soon as 
it gets EPIPE while attempting to write its output. (You can try this 
with "netstat -c -i 2 | cat; date" and then, in another window, find the 
PID of cat and run "kill $CATPID; date". It takes quite a while for 
the netstat to die, and the delay is apparently proportional to the 
interval given to netstat.)
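
The manual test above can be scripted. What follows is only a rough shell 
sketch, not something from the xrootd distribution. The function name 
writer_linger_seconds is made up for illustration, it uses a named pipe 
instead of "|" so that the PIDs of both ends are known, and it assumes 
mktemp, mkfifo and "date +%s" are available.

```shell
#!/bin/sh
# Hypothetical helper: report how many seconds a writer survives after
# the reader on the other end of its pipe has been killed.
# Usage: writer_linger_seconds COMMAND [ARGS...]
writer_linger_seconds() {
    tmpdir=$(mktemp -d) || return 1
    fifo="$tmpdir/pipe"
    mkfifo "$fifo" || { rm -rf "$tmpdir"; return 1; }

    "$@" > "$fifo" 2>/dev/null &          # writer, e.g. netstat -c -i 2
    writer_pid=$!
    cat "$fifo" > /dev/null &             # reader, stands in for the script
    reader_pid=$!

    sleep 1                               # give both ends time to open the FIFO
    kill "$reader_pid" 2>/dev/null
    wait "$reader_pid" 2>/dev/null || true   # exit status is irrelevant here

    start=$(date +%s)
    wait "$writer_pid" 2>/dev/null || true   # blocks until the writer exits
    end=$(date +%s)

    rm -rf "$tmpdir"
    echo $((end - start))
}
```

Running "writer_linger_seconds netstat -c -i 2" should then print the 
number of seconds netstat stayed alive after its reader went away. Note 
that it blocks until netstat actually exits.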

I don't know if there is a simple cure for this. Putting in code to 
make sure the netstat is dead seems like an unpleasant way to work 
around a bug/misfeature in netstat on RHEL 3.
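
For what it is worth, the "make sure it is dead" approach would look 
roughly like the sketch below. It is written in shell purely to show the 
pattern (remember the child's PID, kill it yourself, then reap it) and is 
not a patch for XrdOlbMonPerf. The function name read_then_kill and the 
use of a named pipe are assumptions made for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: read COUNT lines from a long-running command,
# then kill the command explicitly instead of waiting for it to notice
# that its pipe has been closed.
# Usage: read_then_kill COUNT COMMAND [ARGS...]
read_then_kill() {
    count=$1
    shift
    tmpdir=$(mktemp -d) || return 1
    fifo="$tmpdir/pipe"
    mkfifo "$fifo" || { rm -rf "$tmpdir"; return 1; }

    "$@" > "$fifo" 2>/dev/null &          # e.g. netstat -c -i 60
    child_pid=$!

    head -n "$count" < "$fifo"            # consume only the samples we need

    kill "$child_pid" 2>/dev/null         # do not rely on EPIPE/SIGPIPE
    wait "$child_pid" 2>/dev/null || true # reap it so no orphan or zombie remains
    rm -rf "$tmpdir"
}
```

For example, "read_then_kill 5 netstat -c -i 60" would pass five lines of 
netstat output through and then terminate netstat immediately.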

--
Gregory J. Sharp                   email: [log in to unmask]
Wilson Synchrotron Laboratory      url: http://www.lepp.cornell.edu/~gregor
Dryden Rd                          ph:  +1 607 255 4882
Ithaca, NY 14853                   fax: +1 607 255 8062