Print

Print


Dear Adrian, 

Am 18.06.19 um 01:15 schrieb Adrian Sevcenco:
>>> I'd be willing to fix this ot just use the kernel interface from procfs: /proc/net/dev
>> That would be greatly appreciated :-)
> well, i had a head-start :)
> https://github.com/adriansev/bin-scripts/blob/master/cms_monPerf
> 
> but i really do no like the vmstat usage, i will change to use something directly like /proc/stat
> 
> also i do not know what is supposed to be the PGIO
> 
> Any comments/ideas/feedback about it?

that's certainly more radical than the minimalistic change I wanted to do - it's a full rewrite. 
But I must say it's much more readable, and seems more portable! 
So my personal feeling is that this is a better way to go than to improve on the old version, indeed - if Andy is fine with it. 

I have a few nits, though:
- There are many calls to external tools which could in some places be simplified. 
  E.g. the content of "${IFACE_DIR}/${iface}/operstate" could be compared against "up" using bash intrinsics instead of firing up "grep",
  same for the call to "ls" which could be replaced by looping over "${IFACE_DIR}/*" and checking that the string does not end with "/lo" using a bash builtin. 
  Probably also a few of the "bc" calls could be handled using bash builtins. 
  The whole memory-part could be done inside a single awk-call, removing the double awk and bc. 
- The script breaks if ${IFACE_DIR}/${iface}/speed can not be read, or returns a bad value. 
  For example, in a VM with a virtio-net interface, you'll get:
    # cat /sys/class/net/eth0/speed 
    cat: /sys/class/net/eth0/speed: Invalid argument
  I think there are also cases where you could get a "-1". Probably it would be best to fall back to 1000 in case 
  you get an error code back from "cat" or a negative number to handle such pathologic cases. 
- It's not (yet) possible to limit the network statistics to a single interface. 
  An administrator may want only a single interface to be considered (if XRootD does not serve all interfaces). 

I have adressed most of these things (apart from the last) in:
https://github.com/adriansev/bin-scripts/pull/1
Also, the dependency on "bc" is removed, which may not be present on most people's minimal installations. 
It's not really tested yet, and it also does not handle speed "-1" gracefully yet - and I am sure there are other corner cases. Maybe I'll have more time to look at it later this week. 

Finally, I have three things I am unsure about:
- The original script was running in an endless loop, echoing output for consumption. 
  I think this is also what needs to be done here. 
  Of course, CPU count etc. do not need to be rechecked then ;-). 
  But an interval parameter should probably be accepted to define the monitoring frequency. 
- The original script (optionally) supported writing the performance values to a shared memory segment. I don't know if this is used by XRootD in any way? 
  Probably not. 
- PGIO was also not provided by the original script, so I believe it's not really needed. 

Cheers and thanks,
	Oliver


> 
> Thanks!
> Adrian
> 
> 
> ########################################################################
> Use REPLY-ALL to reply to list
> 
> To unsubscribe from the XROOTD-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1


########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1