Hi guys,
I ran some tests on two machines (32 CPUs, 2.6 GHz) at Wigner, connected by a 10 Gbit link. Here's a longish summary:
server:
[root@p05799459s78866 ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62          1         61          0          0          0
-/+ buffers/cache:          1         61
Swap:            1          0          1
client:
[xrootd-test@p05799459m77145 XrdCl]$ time strace -c ./xrdcp -f root://p05799459s78866//tmp/data01/20G_big_file /dev/null
[19.53GB/19.53GB][100%][==================================================][162.6MB/s]
% time seconds usecs/call calls errors syscall
99.82 8.599052 2308 3725 2 futex
0.12 0.009998 1111 9 socket
0.03 0.002859 2859 1 readlink
0.01 0.001123 10 107 87 open
0.01 0.000760 45 17 recvmsg
0.00 0.000235 0 1257 epoll_ctl
0.00 0.000214 24 9 brk
0.00 0.000207 52 4 1 connect
0.00 0.000155 13 12 munmap
0.00 0.000036 0 284 write
0.00 0.000032 0 1250 pwrite
0.00 0.000000 0 20 read
0.00 0.000000 0 41 close
0.00 0.000000 0 16 14 stat
0.00 0.000000 0 20 fstat
0.00 0.000000 0 3 poll
0.00 0.000000 0 1 lseek
0.00 0.000000 0 55 mmap
0.00 0.000000 0 27 mprotect
0.00 0.000000 0 2 rt_sigaction
0.00 0.000000 0 1 rt_sigprocmask
0.00 0.000000 0 1 1 access
0.00 0.000000 0 8 sendto
0.00 0.000000 0 4 bind
0.00 0.000000 0 4 getsockname
0.00 0.000000 0 8 clone
0.00 0.000000 0 1 execve
0.00 0.000000 0 2 uname
0.00 0.000000 0 5 fcntl
0.00 0.000000 0 1 getcwd
0.00 0.000000 0 1 getrlimit
0.00 0.000000 0 2 getuid
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 4 tgkill
0.00 0.000000 0 1 set_robust_list
0.00 0.000000 0 4 epoll_create1
0.00 0.000000 0 4 pipe2
100.00 8.614671 6913 105 total
real 2m3.596s
user 0m0.416s
sys 0m8.814s
Here the file is not yet in the server's page cache (cached is 0 before the run), so the limiting factor is the server-side disk read speed (100% reproducible).
server:
[root@p05799459s78866 ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62         20         42          0          0         19
-/+ buffers/cache:          1         61
Swap:            1          0          1
client:
[xrootd-test@p05799459m77145 XrdCl]$ time strace -c ./xrdcp -f root://p05799459s78866//tmp/data01/20G_big_file /dev/null
[19.53GB/19.53GB][100%][==================================================][1.028GB/s]
% time seconds usecs/call calls errors syscall
99.88 7.983796 3686 2166 8 futex
0.09 0.006999 778 9 socket
0.01 0.001003 111 9 brk
0.01 0.000945 79 12 munmap
0.01 0.000506 506 1 readlink
0.00 0.000121 0 1257 epoll_ctl
0.00 0.000053 0 1250 pwrite
0.00 0.000012 0 562 write
0.00 0.000009 0 41 close
0.00 0.000008 1 8 sendto
0.00 0.000000 0 20 read
0.00 0.000000 0 107 87 open
0.00 0.000000 0 16 14 stat
0.00 0.000000 0 20 fstat
0.00 0.000000 0 3 poll
0.00 0.000000 0 1 lseek
0.00 0.000000 0 55 mmap
0.00 0.000000 0 27 mprotect
0.00 0.000000 0 2 rt_sigaction
0.00 0.000000 0 1 rt_sigprocmask
0.00 0.000000 0 1 1 access
0.00 0.000000 0 4 1 connect
0.00 0.000000 0 17 recvmsg
0.00 0.000000 0 4 bind
0.00 0.000000 0 4 getsockname
0.00 0.000000 0 8 clone
0.00 0.000000 0 1 execve
0.00 0.000000 0 2 uname
0.00 0.000000 0 5 fcntl
0.00 0.000000 0 1 getcwd
0.00 0.000000 0 1 getrlimit
0.00 0.000000 0 2 getuid
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 4 tgkill
0.00 0.000000 0 1 set_robust_list
0.00 0.000000 0 4 epoll_create1
0.00 0.000000 0 4 pipe2
100.00 7.993452 5632 111 total
real 0m18.797s
user 0m0.314s
sys 0m8.145s
With the file now in the server's page cache, the transfer is pretty close to filling up the pipe. There is one busy xrootd thread that takes about 40-45% CPU (100% reproducible).
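As a rough sanity check on how close this is to the link limit, the reported rate can be converted into a fraction of the nominal line rate. This is only a sketch: it assumes the link carries exactly 10e9 bit/s, ignores protocol overhead, and computes both readings of "GB" because it is not certain here whether the xrdcp progress bar uses binary or decimal units. The function name is made up for illustration.

```python
# Nominal line rate between the two machines (assumed to be exactly 10 Gbit/s).
LINK_BITS_PER_S = 10e9


def link_utilization(rate_gb_per_s, binary_units=True):
    """Fraction of the nominal link rate used by a transfer reported in GB/s."""
    bytes_per_gb = 2**30 if binary_units else 10**9
    return rate_gb_per_s * bytes_per_gb * 8 / LINK_BITS_PER_S


reported = 1.028  # GB/s, taken from the xrdcp progress bar above
util_binary = link_utilization(reported, binary_units=True)
util_decimal = link_utilization(reported, binary_units=False)

print(f"if GB means 2^30 bytes: {util_binary:.1%} of line rate")
print(f"if GB means 10^9 bytes: {util_decimal:.1%} of line rate")
```

Under either reading the transfer sits at roughly 80-90% of the nominal rate, before any protocol overhead is taken into account.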
server:
[root@p05799459s78866 ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62         20         42          0          0         19
-/+ buffers/cache:          1         61
Swap:            1          0          1
client:
[xrootd-test@p05799459m77145 XrdCl]$ free -g
             total       used       free     shared    buffers     cached
Mem:            62          1         61          0          0          0
-/+ buffers/cache:          1         61
Swap:            1          0          1
[xrootd-test@p05799459m77145 XrdCl]$ time strace -c ./xrdcp -f root://p05799459s78866//tmp/data01/20G_big_file /data07/tmp/
[19.53GB/19.53GB][100%][==================================================][1.085GB/s]
% time seconds usecs/call calls errors syscall
76.55 10.780198 8624 1250 pwrite
23.45 3.301757 1280 2579 2 futex
0.00 0.000370 31 12 munmap
0.00 0.000095 0 1253 epoll_ctl
0.00 0.000022 22 1 readlink
0.00 0.000010 0 107 87 open
0.00 0.000000 0 20 read
0.00 0.000000 0 277 write
0.00 0.000000 0 32 close
0.00 0.000000 0 16 14 stat
0.00 0.000000 0 19 fstat
0.00 0.000000 0 3 poll
0.00 0.000000 0 1 lseek
0.00 0.000000 0 51 mmap
0.00 0.000000 0 24 mprotect
0.00 0.000000 0 8 brk
0.00 0.000000 0 2 rt_sigaction
0.00 0.000000 0 1 rt_sigprocmask
0.00 0.000000 0 1 1 access
0.00 0.000000 0 9 socket
0.00 0.000000 0 4 1 connect
0.00 0.000000 0 8 sendto
0.00 0.000000 0 17 recvmsg
0.00 0.000000 0 4 bind
0.00 0.000000 0 4 getsockname
0.00 0.000000 0 5 clone
0.00 0.000000 0 1 execve
0.00 0.000000 0 2 uname
0.00 0.000000 0 5 fcntl
0.00 0.000000 0 1 getcwd
0.00 0.000000 0 1 getrlimit
0.00 0.000000 0 2 getuid
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 4 tgkill
0.00 0.000000 0 1 set_robust_list
0.00 0.000000 0 1 epoll_create1
0.00 0.000000 0 1 pipe2
100.00 14.082452 5729 105 total
real 0m18.231s
user 0m0.286s
sys 0m19.176s
Writing to the local disk with an empty client page cache, it is again pretty close to filling up the pipe. In top I can now see two xrootd threads that are quite busy, at 40-45% and 55-60% CPU respectively (100% reproducible).
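The strace summaries are easier to compare across runs when parsed rather than read by eye. Below is a minimal sketch of a parser for the `strace -c` rows as they appear in this mail; the function name and the dictionary keys are made up for illustration, and the sample is just the first rows of the run above. Keep in mind that `strace -c` by default reports system (kernel CPU) time per syscall, not wall-clock time, so the seconds column is not directly comparable with the `real` figure.

```python
def parse_strace_summary(text):
    """Parse `strace -c` summary rows into dicts.

    The column header, any separator rows and the final "total" row are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            row = {
                "pct_time": float(fields[0]),
                "seconds": float(fields[1]),
                "usecs_per_call": int(fields[2]),
                "calls": int(fields[3]),
                # strace leaves the errors column blank when there were none.
                "errors": int(fields[4]) if len(fields) == 6 else 0,
                "syscall": fields[-1],
            }
        except ValueError:
            continue  # not a data row
        rows.append(row)
    return rows


sample = """\
% time seconds usecs/call calls errors syscall
76.55 10.780198 8624 1250 pwrite
23.45 3.301757 1280 2579 2 futex
0.00 0.000370 31 12 munmap
100.00 14.082452 5729 105 total
"""

rows = parse_strace_summary(sample)
dominant = max(rows, key=lambda r: r["seconds"])
print(dominant["syscall"], dominant["pct_time"])
```

For the /dev/null runs the same parser would report futex as the dominant entry, while for the disk runs it is pwrite.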
server:
[root@p05799459s78866 ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62         20         42          0          0         19
-/+ buffers/cache:          1         61
Swap:            1          0          1
client:
[xrootd-test@p05799459m77145 XrdCl]$ free -g
             total       used       free     shared    buffers     cached
Mem:            62         61          1          0          0         58
-/+ buffers/cache:          2         60
Swap:            1          0          1
[xrootd-test@p05799459m77145 XrdCl]$ time strace -c ./xrdcp -f root://p05799459s78866//tmp/data01/20G_big_file /data07/tmp/
[19.53GB/19.53GB][100%][==================================================][217.4MB/s]
% time seconds usecs/call calls errors syscall
95.50 19.843486 15875 1250 pwrite
4.50 0.934266 518 1803 1 futex
0.00 0.000660 55 12 munmap
0.00 0.000608 76 8 brk
0.00 0.000272 0 1253 epoll_ctl
0.00 0.000118 0 631 write
0.00 0.000008 0 107 87 open
0.00 0.000000 0 20 read
0.00 0.000000 0 32 close
0.00 0.000000 0 16 14 stat
0.00 0.000000 0 19 fstat
0.00 0.000000 0 3 poll
0.00 0.000000 0 1 lseek
0.00 0.000000 0 51 mmap
0.00 0.000000 0 24 mprotect
0.00 0.000000 0 2 rt_sigaction
0.00 0.000000 0 1 rt_sigprocmask
0.00 0.000000 0 1 1 access
0.00 0.000000 0 9 socket
0.00 0.000000 0 4 1 connect
0.00 0.000000 0 8 sendto
0.00 0.000000 0 17 recvmsg
0.00 0.000000 0 4 bind
0.00 0.000000 0 4 getsockname
0.00 0.000000 0 5 clone
0.00 0.000000 0 1 execve
0.00 0.000000 0 2 uname
0.00 0.000000 0 5 fcntl
0.00 0.000000 0 1 getcwd
0.00 0.000000 0 1 readlink
0.00 0.000000 0 1 getrlimit
0.00 0.000000 0 2 getuid
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 4 tgkill
0.00 0.000000 0 1 set_robust_list
0.00 0.000000 0 1 epoll_create1
0.00 0.000000 0 1 pipe2
100.00 20.779418 5307 104 total
real 1m31.164s
user 0m0.313s
sys 0m23.084s
With the client page cache already full (58 GB cached before the run), the limiting factor is now the client-side disk write speed. The xrootd threads are much less busy: 6-8% and 9-11% CPU respectively (100% reproducible).
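The pwrite counts give a feel for what the slowdown means per write. The sketch below assumes that the 1250 pwrite calls cover the whole file and that the 19.53GB shown by xrdcp is in binary units; both are assumptions, not something measured here. The helper name is made up for illustration.

```python
FILE_MIB = 19.53 * 1024  # file size as displayed by xrdcp, assuming binary units
PWRITE_CALLS = 1250      # pwrite count from the strace summaries


def per_chunk_stats(wall_seconds, file_mib=FILE_MIB, calls=PWRITE_CALLS):
    """Return (MiB written per pwrite call, wall-clock milliseconds per call)."""
    return file_mib / calls, wall_seconds / calls * 1000.0


# Wall-clock times are the `real` values of the two disk-writing runs.
chunk_mib, ms_cache_empty = per_chunk_stats(18.231)  # client page cache empty
_, ms_cache_full = per_chunk_stats(91.164)           # client page cache full

print(f"about {chunk_mib:.1f} MiB per pwrite")
print(f"cache empty: {ms_cache_empty:.1f} ms per chunk")
print(f"cache full:  {ms_cache_full:.1f} ms per chunk")
```

Under these assumptions each pwrite carries about 16 MiB, and the wall-clock budget per chunk grows by roughly a factor of five once the writes have to wait for the disk.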
server:
[root@p05799459s78866 ~]# free -g
             total       used       free     shared    buffers     cached
Mem:            62         20         42          0          0         19
-/+ buffers/cache:          1         61
Swap:            1          0          1
client:
[xrootd-test@p05799459m77145 XrdCl]$ free -g
             total       used       free     shared    buffers     cached
Mem:            62          1         61          0          0          0
-/+ buffers/cache:          1         61
Swap:            1          0          1
[root@p05799459m77145 tmp]# time curl --url http://p05799459s78866/data01/tmp/20G_big_file -o 20G_big_file
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 19.5G  100 19.5G    0     0   897M      0  0:00:22  0:00:22 --:--:--  882M
real 0m22.283s
user 0m1.466s
sys 0m20.818s
curl is slightly slower than xrdcp and seems to be CPU bound (100% CPU in top).
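To put the five runs side by side, here is a small cross-check that derives an average rate from the `real` times and compares it with what each tool printed. It is a sketch under assumptions: the run labels are a reading of the free output before each run, the sizes are taken as displayed (19.53GB for xrdcp, 19.5G for curl) and treated as binary units, and the wall-clock time includes connection setup, so small differences are expected.

```python
# (label, size in GiB as displayed, `real` seconds, reported rate in MiB/s)
RUNS = [
    ("xrdcp to /dev/null, server cache cold", 19.53, 123.596, 162.6),
    ("xrdcp to /dev/null, server cache warm", 19.53, 18.797, 1.028 * 1024),
    ("xrdcp to disk, client cache empty", 19.53, 18.231, 1.085 * 1024),
    ("xrdcp to disk, client cache full", 19.53, 91.164, 217.4),
    ("curl to disk, client cache empty", 19.5, 22.283, 897.0),
]


def wall_clock_rate(size_gib, seconds):
    """Average transfer rate in MiB/s over the whole wall-clock time."""
    return size_gib * 1024 / seconds


results = [
    (label, wall_clock_rate(size, secs), reported)
    for label, size, secs, reported in RUNS
]

for label, derived, reported in results:
    print(f"{label:40s} {derived:8.1f} MiB/s (reported {reported:8.1f})")
```

The derived and reported rates agree to within a couple of percent for every run, so the progress-bar figures can be taken at face value.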
I also ran some tests with multiple streams and XRD_CPPARALLELCHUNKS=10, and modified the code to use multiple event loops, but none of these had any observable effect on the transfer rate.
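For anyone who wants to repeat the parallel-chunk test from a script, a minimal sketch of how such a run could be set up. XRD_CPPARALLELCHUNKS=10 and the xrdcp command line are taken from this mail. The helper name is made up, the `--streams` option is assumed to be how the extra streams are requested (the exact option used in the tests is not stated above), and the stream count of 4 is purely illustrative. The sketch only builds the command and environment; it does not run anything.

```python
import os


def build_xrdcp_run(source, destination, parallel_chunks=10, streams=None):
    """Return (argv, env) for an xrdcp run with XRD_CPPARALLELCHUNKS set."""
    env = dict(os.environ)
    env["XRD_CPPARALLELCHUNKS"] = str(parallel_chunks)
    cmd = ["./xrdcp", "-f"]
    if streams is not None:
        # Assumption: extra streams are requested with --streams.
        cmd += ["--streams", str(streams)]
    cmd += [source, destination]
    return cmd, env


cmd, env = build_xrdcp_run(
    "root://p05799459s78866//tmp/data01/20G_big_file",
    "/dev/null",
    streams=4,  # hypothetical value, not taken from the tests
)
print("XRD_CPPARALLELCHUNKS=" + env["XRD_CPPARALLELCHUNKS"], " ".join(cmd))
```

The pair can then be handed to `subprocess.run(cmd, env=env, check=True)` on a machine where xrdcp is available.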
To summarize, it seems to me that xrdcp works just fine. If you would like me to run other tests, let me know!