Hello Andy

I guess most of the stat/open calls come from opening shared libraries and
directories. Depending on your LD_LIBRARY_PATH there can be many
failed attempts to open or stat a file before the right one is found.
For example, when opening libc.so.6 I see three attempts:
  open("/opt/xrootd/prod/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  open("/reg/g/psdm/sw/releases/dm-current/arch/x86_64-rhel7-gcc48-opt/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
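
To illustrate why each extra LD_LIBRARY_PATH entry adds a failed open, here
is a rough Python sketch of the search order. It is a simplified model, not
the real loader: it ignores rpath, ld.so.cache and hardware-capability
subdirectories, skips empty path entries, and the default directory list is
an assumption. The function names are made up for this example.

```python
import os
import posixpath


def candidate_paths(lib_name, ld_library_path,
                    default_dirs=("/lib64", "/usr/lib64")):
    """Return the paths a simplified loader would probe, in order.

    LD_LIBRARY_PATH entries come first, then the (assumed) default dirs.
    """
    search_dirs = [d for d in ld_library_path.split(":") if d]
    search_dirs.extend(default_dirs)
    return [posixpath.join(d, lib_name) for d in search_dirs]


def count_failed_probes(lib_name, ld_library_path, exists=os.path.exists):
    """Count how many probes fail before the library is found.

    Returns (failed_count, found_path); found_path is None if nothing matched.
    The `exists` callback is injectable so the logic can be tried without
    touching the real filesystem.
    """
    failed = 0
    for path in candidate_paths(lib_name, ld_library_path):
        if exists(path):
            return failed, path
        failed += 1  # corresponds to one open() = -1 ENOENT in the trace
    return failed, None
```

With the two directories from the trace above in LD_LIBRARY_PATH, this model
probes both of them before it reaches /lib64, which matches the two ENOENT
results followed by the successful open.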

Also, the strace didn't include the '-f' flag, so not all threads were traced,
and therefore the read counts are very low.

I ran "strace -c -f" for a 24 GB file and I see:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  71.19   50.881974        5098      9981      1097 futex
  17.94   12.818864      312655        41         1 nanosleep
   9.99    7.138469          23    314920     36189 read
   0.79    0.564963          14     40227           epoll_wait
   0.03    0.020346          14      1479           pwrite
   0.02    0.016862           6      2973           epoll_ctl
   0.02    0.011628           8      1494           sendto
   0.01    0.007300         456        16           munmap
   0.01    0.007009         467        15           brk
   0.00    0.000850           9        91           write
   0.00    0.000734          56        13         6 stat
   0.00    0.000572          11        50        26 open
   0.00    0.000434           7        66           mmap
   0.00    0.000239           6        37           close
   0.00    0.000156           4        43           mprotect
   0.00    0.000152          30         5           madvise
   0.00    0.000103           5        21           fstat
   0.00    0.000064          32         2         2 access
   0.00    0.000062           7         9           socket
   0.00    0.000059          15         4           tgkill
   0.00    0.000053           6         9           recvmsg
   0.00    0.000045           6         7           poll
   0.00    0.000045          45         1           execve
   0.00    0.000043           6         7         2 connect
   0.00    0.000017           6         3           getsockname
   0.00    0.000010          10         1           bind
   0.00    0.000010           2         6           fcntl
   0.00    0.000007           4         2           getsockopt
   0.00    0.000006           1         8           uname
   0.00    0.000004           2         2           getpeername
   0.00    0.000004           1         3           geteuid
   0.00    0.000004           4         1           arch_prctl
   0.00    0.000002           1         2           readv
   0.00    0.000001           1         1           getegid
   0.00    0.000000           0         1           lseek
   0.00    0.000000           0         2           rt_sigaction
   0.00    0.000000           0         1           rt_sigprocmask
   0.00    0.000000           0         6           clone
   0.00    0.000000           0         1           readlink
   0.00    0.000000           0         1           getrlimit
   0.00    0.000000           0         2           getuid
   0.00    0.000000           0         1           set_tid_address
   0.00    0.000000           0         2         2 openat
   0.00    0.000000           0         7           set_robust_list
   0.00    0.000000           0         1           epoll_create1
   0.00    0.000000           0         1           pipe2
------ ----------- ----------- --------- --------- ----------------
100.00   71.471091                371566     37325 total
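
In case it helps when comparing runs, below is a small Python sketch that
parses a "strace -c" summary like the one above and reports the fraction of
failed calls per syscall. It assumes the column layout shown here (the errors
column is simply absent when there are none); the function names are only
for illustration.

```python
def parse_strace_summary(text):
    """Map syscall name -> (calls, errors) from 'strace -c' summary text."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(fields) not in (5, 6):
            continue
        try:
            calls = int(fields[3])
        except ValueError:
            continue  # separator line made of dashes
        name = fields[-1]
        if name == "total":
            continue  # summary row, not a syscall
        errors = int(fields[4]) if len(fields) == 6 else 0
        rows[name] = (calls, errors)
    return rows


def error_fraction(rows, name):
    """Return errors/calls for one syscall, or 0.0 if it had no calls."""
    calls, errors = rows[name]
    return errors / calls if calls else 0.0


SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  9.99    7.138469          23    314920     36189 read
  0.79    0.564963          14     40227           epoll_wait
  0.00    0.000572          11        50        26 open
------ ----------- ----------- --------- --------- ----------------
100.00   71.471091                371566     37325 total
"""

if __name__ == "__main__":
    parsed = parse_strace_summary(SAMPLE)
    for syscall in sorted(parsed):
        print(syscall, parsed[syscall], round(error_fraction(parsed, syscall), 3))
```

Feeding it the full table instead of the three sample rows works the same way.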


Cheers,
   Wilko


On Fri, 20 Nov 2015, Andrew Hanushevsky wrote:

> Yes, I also am mystified by some of the counts like:
>
> 0.01    0.001123          10       107        87 open
> 0.00    0.000000           0        16        14 stat
>
> among others. Why so many opens? Why are most of the opens and stat calls
> returning an error? It does seem consistent run to run. Quite strange.
>
> Andy
>
> On Fri, 20 Nov 2015, Lukasz Janyst wrote:
>
>> You could try to figure out which version of BSD supports socket watermarking and run a test with that. I strongly suspect that excessive returns from epoll are the cause here. Socket watermarking does not work on Linux. http://lkml.iu.edu/hypermail/linux/kernel/0412.1/0680.html
>>
>> ---
>> Reply to this email directly or view it on GitHub:
>> https://github.com/xrootd/xrootd/issues/20#issuecomment-158479186
>
>
> ---
> Reply to this email directly or view it on GitHub:
> https://github.com/xrootd/xrootd/issues/20#issuecomment-158534692


---
Reply to this email directly or view it on GitHub:
https://github.com/xrootd/xrootd/issues/20#issuecomment-158559524
