Hi Andy,

On 06/23/2011 04:28 PM, Andrew Hanushevsky wrote:
> Hi Kyle,
>
> What we need to do is find out what the inbound traffic really is. So, some
> more questions:
>
> 1) On your graph is that bytes in/out or packets in/out? If this is packets
> then the graph is likely correct and you are simply doing a lot of very
> small reads.

That's definitely bytes, not packets on the graph.

> 2) To find out a bit more statistics you can connect to the xrootd server
> using the xrd command. Do the following:
>
> xrd <the_xrootd_server>
> query 1 lp
> exit
>
> Then find the values between <in></in> and <out></out> that will give you
> number of bytes in and number out. We need to see if that is reasonable
> compared to actual requests which you will find between <rd></rd> and
> <wr></wr> (read/write counters).

<rd>1548943168</rd>
<wr>0</wr>
<in>37184778462</in>
<out>26610785518162</out>

So writes are 0, as expected. Inbound traffic is three orders of magnitude
less than outbound, which doesn't correspond to our monitoring, but looks good.

> 3) If those still seem not to correspond then we can look at the actual
> xrootd kernel calls using strace. For instance:
>
> strace -f -xx -ttt -p <pid> -e trace=network 2>&1 | grep 'recv(' > <outfile>

Okay, so I see no calls to 'recv(' in the strace. However, I do see calls to
'recvfrom(' that look like this:

[pid 16211] 1308923304.005638 recvfrom(19, "\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd7\x4e\x00\x00\x08\xa4\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.045490 recvfrom(19, "\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xdf\xf2\x00\x00\x00\x3e\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.046110 recvfrom(19, "\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x30\x00\x00\x00\x44\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.999048 recvfrom(19, "\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x74\x00\x00\x00\x14\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24
[pid 16211] 1308923304.999260 recvfrom(19, "\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x88\x00\x00\x00\x14\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24
[pid 16211] 1308923305.000027 recvfrom(19, "\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x9c\x00\x00\x00\x14\x00\x00\x00\x00\x00", 24, 0, NULL, NULL) = 24

> This will capture server recv() requests (inbound traffic). No need to run
> this more than a minute or two.
>
> 4) If that doesn't reveal anything then the only other option is that there
> really is something else on that machine that is accepting incoming traffic.
> If it isn't udp then netstat should show you who that might be.

At a first glance, netstat doesn't show any real flags. Also, xrootd really
is the only thing running on this machine, and we see lots of input whenever
a user runs jobs (i.e. reads data from xrootd) and no input otherwise. So the
two are at least correlated...

Thanks for your help!

Kyle

> Andy
>
>
> -----Original Message-----
> From: Kyle Fransham
> Sent: Thursday, June 23, 2011 1:05 PM
> To: Andrew Hanushevsky
> Cc: xrootd-l
> Subject: Re: inbound traffic
>
> Hi Andy,
>
> This is a machine at UVic that we use to serve BaBar xrootd files to
> virtual machines that we spawn in the cloud. It's running little else
> besides xrootd. Any traffic on the external interface (the plot that I
> sent) is xrootd. We see very high inbound traffic almost all of the time.
>
> On the back end, we have 10TB or so of data in a lustre filesystem
> that's distributed across multiple workers. Since this is a distributed
> filesystem, we expect (and we do see) traffic on the internal interface
> that's associated with the reading of xrootd collections. But we don't
> expect to see that externally...
>
> What else can I tell you about this machine/setup to help diagnose the
> problem?
>
> Thanks,
>
> Kyle
>
> On 06/23/2011 03:36 PM, Andrew Hanushevsky wrote:
>> Hi Kyle,
>>
>> There should be little inbound traffic unless that machine is used for
>> more than just xrootd services. What machine are we talking about?
>>
>> Andy
>>
>> -----Original Message-----
>> From: Kyle Fransham
>> Sent: Thursday, June 23, 2011 7:48 AM
>> To: xrootd-l
>> Subject: inbound traffic
>>
>> Hi all,
>>
>> We've got a single xrootd server serving out BaBar root files over the
>> WAN. We notice that there is a lot of inbound traffic, even though our
>> files are exported read-only. Attached is a network plot showing the
>> traffic on the xrootd interface for four simultaneous user analysis
>> jobs. (In case you can't see the attachment, the inbound traffic tends
>> to be about 75% of the outbound traffic.)
>>
>> Is this expected behaviour?
>>
>> Thanks,
>>
>> Kyle
>>
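
A note on the recvfrom() lines in Kyle's strace output: every payload is exactly 24 bytes, which is the size of an XRootD client request header. The sketch below decodes them in Python. It rests on two assumptions that are not stated anywhere in the thread: that the header layout is streamid[2], requestid (uint16), fhandle[4], offset (int64), rlen (int32), dlen (int32), all big-endian, and that request code 3013 (0x0bc5) is kXR_read. The names READ_HEADER, KXR_READ, unescape and decode_read are made up for this illustration.

```python
import re
import struct

# ASSUMPTION: 24-byte XRootD kXR_read request header, big-endian:
# streamid[2], requestid (uint16), fhandle[4], offset (int64),
# rlen (int32), dlen (int32).
READ_HEADER = struct.Struct(">2sH4sqii")
KXR_READ = 3013  # ASSUMPTION: request code for kXR_read (0x0bc5)

# Payload strings copied from the strace output quoted in this thread.
SAMPLES = [
    r"\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd7\x4e\x00\x00\x08\xa4\x00\x00\x00\x00\x00",
    r"\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xdf\xf2\x00\x00\x00\x3e\x00\x00\x00\x00\x00",
    r"\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x30\x00\x00\x00\x44\x00\x00\x00\x00\x00",
    r"\x05\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x74\x00\x00\x00\x14\x00\x00\x00\x00\x00",
    r"\x1f\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x88\x00\x00\x00\x14\x00\x00\x00\x00\x00",
    r"\x06\x00\x0b\xc5\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe0\x9c\x00\x00\x00\x14\x00\x00\x00\x00\x00",
]


def unescape(strace_string):
    """Convert the hex-escaped string printed by strace -xx into raw bytes."""
    hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", strace_string)
    return bytes(int(pair, 16) for pair in hex_pairs)


def decode_read(payload):
    """Return (offset, rlen) if payload looks like a kXR_read header, else None."""
    if len(payload) != READ_HEADER.size:
        return None
    _streamid, requestid, _fhandle, offset, rlen, _dlen = READ_HEADER.unpack(payload)
    if requestid != KXR_READ:
        return None
    return offset, rlen


if __name__ == "__main__":
    for sample in SAMPLES:
        decoded = decode_read(unescape(sample))
        if decoded is None:
            print("not a read request under the assumed layout")
        else:
            offset, rlen = decoded
            print(f"read offset={offset} length={rlen}")
```

Under those assumptions, the six requests quoted above ask for 566272, 15872, 17408, 5120, 5120 and 5120 bytes, and each offset plus length equals the next offset, so they look like sequential reads of one file. That is 144 bytes of request headers for roughly 600 KB of data, which points the same way as the <in>/<out> counters rather than the network plot.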