Hello Greg,

I was able to reproduce the problem on RHEL. The orphan process does 
eventually disappear, but only after about 15 iterations, as you 
mentioned (unlike on Solaris).
I don't know yet what to do about this. I could kill the leftover 
netstat processes by brute force at the end of the script so that they 
go away with less latency. Let me check whether I can handle it in a 
cleaner way.
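
Roughly, the brute-force version would remember the PID of the netstat 
and signal it directly once its output has been read, instead of waiting 
for netstat to notice that its reader is gone. Below is a minimal shell 
sketch of that idea, in the style of your command-line test. It is not 
the actual change: read_then_kill is a made-up helper name, mktemp -d is 
assumed to be available, and the real fix would have to be made inside 
XrdOlbMonPerf itself.

```shell
#!/bin/sh
# Sketch only: read N lines from a long-running producer, then kill it
# by PID instead of waiting for it to notice that its reader is gone.
# read_then_kill is a hypothetical helper, not part of XrdOlbMonPerf.

read_then_kill() {
    nlines=$1
    shift

    # A private directory holds the FIFO so the name cannot clash.
    # (mktemp -d is assumed to exist; it is common but not POSIX.)
    tmpdir=$(mktemp -d) || return 1
    fifo=$tmpdir/out
    mkfifo "$fifo" || { rm -rf "$tmpdir"; return 1; }

    # Run the producer in the background so that $! gives us its PID.
    "$@" > "$fifo" &
    pid=$!

    # Consume only the lines we need; head then exits and closes the FIFO.
    head -n "$nlines" < "$fifo"

    # Brute force: do not rely on EPIPE, signal the producer directly
    # and reap it so that no orphan is left behind.
    kill "$pid" 2>/dev/null
    wait "$pid" 2>/dev/null

    rm -rf "$tmpdir"
    return 0
}

# In the monitoring case the producer would be: netstat -c -i 60
# Example with a harmless stand-in producer:
#   read_then_kill 3 yes
```

The point of keeping the PID is that the cleanup no longer depends on 
how quickly netstat reacts to the closed pipe.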
Cheers,
JY

Gregory J. Sharp wrote:

> Folks,
>
> The etc/XrdOlbMonPerf script in the xrootd distribution is generating 
> lots of orphan netstat -c -i 60 processes on our RHEL 3 system.
>
> I noticed that the pipe used to open the netstat isn't explicitly 
> closed, but presumably it is closed when XrdOlbMonPerf exits.  At the 
> end of the while loop (~line 207) it would be cleaner to put
>   close(CMDFD);
> but not essential, since it doesn't help the problem I have.
>
> It turns out that netstat doesn't die when the pipe it is writing to 
> is closed. It apparently keeps trying for up to 15-16 more iterations 
> - which means that each netstat hangs around for an extra 15 minutes 
> if you have a 60 second delay between iterations. It should exit if it 
> gets EPIPE while attempting to write its output. (You can try this 
> with "netstat -c -i 2 | cat; date" and then in another window figure 
> out the PID of cat and run "kill $CATPID; date" - it takes quite a 
> while for the netstat to die, and time is apparently proportional to 
> the time interval given to netstat.)
>
> I don't know if there is a simple cure for this. Putting in code to 
> make sure the netstat is dead seems like an unpleasant way to work 
> around a bug/misfeature in netstat on RHEL 3.
>
> -- 
> Gregory J. Sharp                   email: [log in to unmask]
> Wilson Synchrotron Laboratory      url: http://www.lepp.cornell.edu/~gregor
> Dryden Rd                          ph:  +1 607 255 4882
> Ithaca, NY 14853                   fax: +1 607 255 8062
>