Hi Andy,

On 06/23/2011 04:28 PM, Andrew Hanushevsky wrote:
> Hi Kyle,
>
> What we need to do is find out what the inbound traffic really is. So, some
> more questions:
>
> 1) On your graph, is that bytes in/out or packets in/out? If it is packets
> then the graph is likely correct and you are simply doing a lot of very
> small reads.
That's definitely bytes, not packets on the graph.
> 2) To find out a bit more statistics you can connect to the xrootd server
> using the xrd command. Do the following:
>
> xrd <the_xrootd_server>
> query 1 lp
> exit
>
> Then find the values between <in></in> and <out></out>; that will give you
> the number of bytes in and the number out. We need to see if that is
> reasonable compared to the actual requests, which you will find between
> <rd></rd> and <wr></wr> (read/write counters).

<rd>1548943168</rd>
<wr>0</wr>
<in>37184778462</in>
<out>26610785518162</out>

So writes are 0, as expected.  Inbound traffic is about three orders of
magnitude less than outbound, which doesn't correspond to our
monitoring, but is what we would expect.
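
As a quick cross-check of those counters, here is a small Python sketch that turns them into ratios. The function name `summarize` is made up for illustration, and the per-read figures assume that <rd> counts read requests, as Andy's description suggests; that has not been checked against the xrootd documentation.

```python
# Counter values copied from the "query 1 lp" output above.
counters = {
    "rd": 1548943168,
    "wr": 0,
    "in": 37184778462,
    "out": 26610785518162,
}


def summarize(c):
    """Return simple ratios derived from the xrootd counters."""
    return {
        # How many bytes the server sent for each byte it received.
        "out_over_in": c["out"] / c["in"],
        # Inbound bytes as a fraction of outbound bytes.
        "in_as_fraction_of_out": c["in"] / c["out"],
        # Assumption: <rd> is a count of read requests.
        "in_bytes_per_read": c["in"] / c["rd"],
        "out_bytes_per_read": c["out"] / c["rd"],
    }


if __name__ == "__main__":
    for name, value in summarize(counters).items():
        print(f"{name}: {value:.4g}")
```

By xrootd's own accounting, outbound is roughly 700 times inbound, so inbound is a fraction of a percent of outbound rather than the ~75% seen on the network plot. If <rd> really is a request count, inbound also works out to about 24 bytes per read.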

> 3) If those still seem not to correspond then we can look at the actual
> xrootd kernel calls using strace. For instance:
>
> strace -f -xx -ttt -p <pid> -e trace=network 2>&1 | grep 'recv(' >
> <outfile>
Okay, so I see no calls to 'recv(' in the strace.  However, I do see 
calls to 'recvfrom('  that look like this:

[pid 16211] 1308923304.005638 recvfrom(19, 
"\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd7\x4e\x00\x00\x08\xa4\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.045490 recvfrom(19, 
"\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xdf\xf2\x00\x00\x00\x3e\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.046110 recvfrom(19, 
"\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x30\x00\x00\x00\x44\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.999048 recvfrom(19, 
"\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x74\x00\x00\x00\x14\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.999260 recvfrom(19, 
"\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x88\x00\x00\x00\x14\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
[pid 16211] 1308923305.000027 recvfrom(19, 
"\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x9c\x00\x00\x00\x14\x00\x00\x00\x00\x00", 
24, 0, NULL, NULL) = 24
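
To total up the inbound bytes from a capture like this, a sketch along the following lines may help. It expects one strace record per line, as strace writes them (the records above were wrapped by the mail client). The names `parse_recvfrom` and `decode_header` are hypothetical. The header decoding is an assumption: it treats each 24-byte payload as an XRootD client request laid out as a 2-byte stream ID, a 2-byte request code, a 4-byte file handle, an 8-byte offset, a 4-byte read length and a 4-byte data length, all big-endian, which is the read-request layout as I understand it and should be checked against the protocol specification.

```python
import re
import struct

# Matches a recvfrom() record produced by "strace -f -xx -ttt".
RECV_RE = re.compile(
    r'^\[pid\s+(?P<pid>\d+)\]\s+(?P<ts>\d+\.\d+)\s+'
    r'recvfrom\((?P<fd>\d+),\s*"(?P<data>(?:\\x[0-9a-f]{2})*)"(?:\.\.\.)?,'
    r'.*\)\s*=\s*(?P<ret>-?\d+)'
)


def parse_recvfrom(line):
    """Return (timestamp, fd, payload_bytes, return_value) or None."""
    m = RECV_RE.match(line)
    if m is None:
        return None
    # With -xx every byte is printed as \xNN, so strip the escapes.
    payload = bytes.fromhex(m.group("data").replace("\\x", ""))
    return float(m.group("ts")), int(m.group("fd")), payload, int(m.group("ret"))


def decode_header(payload):
    """Decode a 24-byte payload under the ASSUMED read-request layout."""
    streamid, requestid, fhandle, offset, rlen, dlen = struct.unpack(
        ">2sH4sqii", payload
    )
    return {"requestid": requestid, "offset": offset, "rlen": rlen, "dlen": dlen}


# First record from the capture above, rejoined onto one logical line.
SAMPLE = (
    r'[pid 16211] 1308923304.005638 recvfrom(19, "'
    r'\x06\x00\x0b\xc5'                  # stream ID, request code
    r'\x00\x00\x00\x00'                  # file handle (assumed field)
    r'\x00\x00\x00\x00\x00\xd7\x4e\x00'  # offset (assumed field)
    r'\x00\x08\xa4\x00'                  # read length (assumed field)
    r'\x00\x00\x00\x00'                  # data length (assumed field)
    r'", 24, 0, NULL, NULL) = 24'
)

if __name__ == "__main__":
    ts, fd, payload, ret = parse_recvfrom(SAMPLE)
    print(ret, decode_header(payload))
```

Summing the return values over every line of a capture file, and dividing by the capture duration, gives the inbound byte rate that xrootd itself is reading, which can then be compared with the rate on the network plot.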

> This will capture server recv() requests (inbound traffic). No need to run
> this more than a minute or two.
>
> 4) If that doesn't reveal anything then the only other option is that there
> really is something else on that machine that is accepting incoming traffic.
> If it isn't udp then netstat should show you who that might be.
At first glance, netstat doesn't show any red flags.  Also, xrootd
really is the only thing running on this machine, and we see lots of
input whenever a user runs jobs (i.e. reads data from xrootd) and no
input otherwise.  So the two are at least correlated...

Thanks for your help!

Kyle
> Andy
>
>
> -----Original Message-----
> From: Kyle Fransham
> Sent: Thursday, June 23, 2011 1:05 PM
> To: Andrew Hanushevsky
> Cc: xrootd-l
> Subject: Re: inbound traffic
>
> Hi Andy,
>
> This is a machine at UVic that we use to serve BaBar xrootd files to
> virtual machines that we spawn in the cloud.  It's running little else
> besides xrootd.  Any traffic on the external interface (the plot that I
> sent) is xrootd.  We see very high inbound traffic almost all of the time.
>
> On the back end, we have 10TB or so of data in a lustre filesystem
> that's distributed across multiple workers.  Since this is a distributed
> filesystem, we expect (and we do see) traffic on the internal interface
> that's associated with the reading of xrootd collections.  But we don't
> expect to see that externally...
>
> What else can I tell you about this machine/setup to help diagnose the
> problem?
>
> Thanks,
>
> Kyle
>
> On 06/23/2011 03:36 PM, Andrew Hanushevsky wrote:
>> Hi Kyle,
>>
>> There should be little inbound traffic unless that machine is used for more
>> than just xrootd services. What machine are we talking about?
>>
>> Andy
>>
>> -----Original Message-----
>> From: Kyle Fransham
>> Sent: Thursday, June 23, 2011 7:48 AM
>> To: xrootd-l
>> Subject: inbound traffic
>>
>> Hi all,
>>
>> We've got a single xrootd server serving out BaBar root files over the
>> WAN.  We notice that there is a lot of inbound traffic, even though our
>> files are exported read-only.  Attached is a network plot showing the
>> traffic on the xrootd interface for four simultaneous user analysis
>> jobs.  (In case you can't see the attachment, the inbound traffic tends
>> to be about 75% of the outbound traffic.)
>>
>> Is this expected behaviour?
>>
>> Thanks,
>>
>> Kyle
>>