Folks,
The etc/XrdOlbMonPerf script in the xrootd distribution is generating
lots of orphan netstat -c -i 60 processes on our RHEL 3 system.

I noticed that the pipe used to open netstat isn't explicitly closed,
though presumably it is closed when XrdOlbMonPerf exits. It would be
cleaner to put

    close(CMDFD);

at the end of the while loop (~line 207), but that isn't essential,
since it doesn't help with the problem I have.

It turns out that netstat doesn't die when the pipe it is writing to is
closed. It apparently keeps going for another 15-16 iterations, which
means that each netstat hangs around for an extra 15 minutes if you
have a 60 second delay between iterations. It should exit as soon as it
gets EPIPE while attempting to write its output. (You can try this with
"netstat -c -i 2 | cat; date" and then, in another window, find the
PID of cat and run "kill $CATPID; date". It takes quite a while for
the netstat to die, and the time is apparently proportional to the
interval given to netstat.)
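
For anyone who wants to repeat the two-window test without juggling PIDs
by hand, it could be scripted roughly as follows. This is only a sketch:
time_orphan is a made-up helper name, and it assumes that mkfifo,
"mktemp -d" and "date +%s" are available on the system.

```shell
#!/bin/sh
# Sketch only: time how long a periodic writer survives after its reader dies.
# time_orphan is a hypothetical helper, not part of xrootd or net-tools.
# Assumes mkfifo, "mktemp -d" and "date +%s" are available.
time_orphan() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pipe" || return 1

    # Start the writer with its stdout attached to the FIFO.
    "$@" > "$dir/pipe" 2>/dev/null &
    writer=$!

    # Start a reader that plays the role of "cat" in the manual test.
    cat "$dir/pipe" > /dev/null &
    reader=$!

    sleep 1                         # give both ends time to open the FIFO
    start=$(date +%s)
    kill "$reader" 2>/dev/null || :
    wait "$reader" 2>/dev/null || :

    # Block until the writer notices the closed pipe and exits.
    wait "$writer" 2>/dev/null || :
    end=$(date +%s)

    rm -rf "$dir"
    echo "$((end - start))"         # seconds the writer outlived its reader
}
```

Running "time_orphan netstat -c -i 2" and then "time_orphan netstat -c -i 4"
should show whether the survival time really scales with the interval.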

I don't know if there is a simple cure for this. Putting in code to
make sure the netstat is dead seems like an unpleasant way to work
around a bug/misfeature in netstat on RHEL 3.
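
To show what such a workaround amounts to, here is a minimal shell sketch
of the idea: remember the PID of the periodic command and kill it
explicitly once the reader is done, rather than waiting for it to notice
the closed pipe. The name sample_then_kill is hypothetical, and this is
not the XrdOlbMonPerf code, which is Perl. In Perl the equivalent would
presumably be to keep the PID that a piped open() returns and pass it to
kill before close(CMDFD), but I have not tried that in the script.

```shell
#!/bin/sh
# Sketch only: read a few lines from a periodic command, then kill it
# explicitly instead of waiting for it to notice the closed pipe.
# sample_then_kill is a hypothetical helper name.
sample_then_kill() {
    count=$1
    shift
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/pipe" || return 1

    # Run the command in the background so that $! gives us its PID.
    "$@" > "$dir/pipe" 2>/dev/null &
    child=$!

    # Consume the first $count lines of output.
    head -n "$count" "$dir/pipe"

    # Do not rely on EPIPE: terminate the child and reap it right away.
    kill "$child" 2>/dev/null || :
    wait "$child" 2>/dev/null || :
    rm -rf "$dir"
}
```

For example, "sample_then_kill 5 netstat -c -i 60" would print the first
five lines of netstat output and leave no netstat process behind.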
--
Gregory J. Sharp
Wilson Synchrotron Laboratory
Dryden Rd
Ithaca, NY 14853
url: http://www.lepp.cornell.edu/~gregor
ph:  +1 607 255 4882
fax: +1 607 255 8062