Hi Andy,

I have repeated the proxy test with the same xrootd configuration file 
on another machine, and I can confirm that the problem is very likely 
in the proxy server implementation.
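
For context, a cache-less proxy configuration of this kind typically 
boils down to something like the sketch below. This is only an 
illustration, not the exact onlyproxy.cfg; the origin host is a 
placeholder.

   # run a standalone xrootd proxy on the usual port
   all.role server
   xrd.port 1094

   # load the proxy storage system plugin and point it at the origin
   ofs.osslib libXrdPss.so
   pss.origin xrootd.example.org:1094

   # export everything; no caching or prefetching directives, so the
   # proxy only forwards requests
   all.export /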

This plot shows the network traffic of the proxy server for jobs doing 
vector and single reads: 
http://uaf-2.t2.ucsd.edu/~matevz/tmp/ReadFromProxy.png

Since the proxy server could not handle 300 jobs doing vector reads, I 
made another test -- the same 300 jobs reading directly from the 
origin. The performance was as expected: 
http://uaf-2.t2.ucsd.edu/~matevz/tmp/ReadvDirectlyToOrigin.png
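
For reference, the kind of request the vector-read jobs issue looks 
roughly like the sketch below, written against the XrdCl client API. 
The URL, file path, and chunk layout are placeholders, not the actual 
test job.

// Sketch only: open a file through a (placeholder) proxy and issue one
// 128-chunk vector read of about 2.4 MB, as in the test jobs.
#include <XrdCl/XrdClFile.hh>
#include <cstdint>
#include <vector>
#include <iostream>

int main()
{
  XrdCl::File file;
  XrdCl::XRootDStatus st =
    file.Open( "root://proxy.example.org:1094//store/test/file.root",
               XrdCl::OpenFlags::Read );
  if( !st.IsOK() ) { std::cerr << st.ToString() << std::endl; return 1; }

  // 128 chunks of ~19 kB each, roughly 2.4 MB per request.
  const int         nChunks   = 128;
  const uint32_t    chunkSize = 19200;
  std::vector<char> buffer( nChunks * chunkSize );

  XrdCl::ChunkList chunks;
  for( int i = 0; i < nChunks; ++i )
    chunks.push_back( XrdCl::ChunkInfo( (uint64_t) i * chunkSize, chunkSize,
                                        &buffer[ i * chunkSize ] ) );

  // One readv request carries all 128 chunks; if the proxy unrolls it,
  // the origin sees 128 separate reads instead.
  XrdCl::VectorReadInfo *info = 0;
  st = file.VectorRead( chunks, 0, info );
  if( !st.IsOK() ) { std::cerr << st.ToString() << std::endl; return 1; }

  delete info;
  file.Close();
  return 0;
}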

I have also checked with iperf that the proxy machine can handle the 
expected traffic in both directions: 
http://uaf-2.t2.ucsd.edu/~matevz/tmp/ReadBasicTest.png
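
If anyone wants to repeat that check, a plain iperf run along these 
lines is enough to measure both directions (the hostname is a 
placeholder):

   # on the proxy machine
   iperf -s

   # on another machine; 4 parallel streams for 30 seconds, then swap
   # roles to measure the opposite direction
   iperf -c proxy.example.org -P 4 -t 30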


The proxy server code was taken from the master branch.

Thanks,
Alja



On 01/21/14 17:46, Alja Mrak-Tadel wrote:
> Hi Andy,
>>
>> The way I read the graph is that when you put a bigger load on the
>> proxy server, it slows down. I wouldn't think that is unusual. The
>> proxy machine may be overloaded at that point. It's unlikely to be the
>> NIC, because we know that it can do at least 360 Mbps. So it might be
>> the CPU (easy to check). Alternatively, it could be the disk drive at
>> the end point that is overloaded with 300 jobs.
>>
> The proxy server had low CPU load. I did not check disk I/O, but I
> expect it to be negligible because caching on the proxy was not
> enabled, neither disk nor memory, and there was no prefetching of any
> kind.
>> But let's say the above isn't the case, because 300 jobs in single
>> read perform better than 150 jobs in single read. That would point to
>> the fact that either the proxy is unrolling the vector reads or the
>> server is unrolling them. I suspect both, because the readv
>> passthrough is not available until Release 4, which has not been
>> released yet. Server-side unrolling, as unpleasant as it is, would not
>> show such an effect, but proxy-side unrolling most certainly would.
>>
>> The take-away: try the test again with a proxy server based on git head.
>
> I ran the proxy test from the master branch, which I updated 34 days
> ago. I will run the same set of jobs from the machine where the proxy
> was running, directly to the data servers (mostly at FNAL), to make
> sure the problem really is in the proxy server.
>
> Thanks,
> Alja
>
>
>
>> On Tue, 21 Jan 2014, Alja Mrak-Tadel wrote:
>>
>>> Hi,
>>>
>>> There was an issue at RAL T1, where they are trying to use a proxy
>>> server because their nodes do not have outgoing connectivity.
>>>
>>> Trying to reproduce this at UCSD, I have been running tests on a
>>> proxy server without any caching, using the following configuration
>>> file:
>>>   http://uaf-2.t2.ucsd.edu/~alja/onlyproxy.cfg
>>>
>>>
>>> The plot http://uaf-2.t2.ucsd.edu/~matevz/tmp/traffic.png shows the
>>> network traffic of the proxy server during four consecutive tests:
>>> - 300 jobs reading 2.4 MB every 10 s in a 128-chunk vector read
>>> - 300 jobs reading 2.4 MB every 10 s in a single read
>>> - 150 jobs reading 2.4 MB every 10 s in a single read
>>> - 150 jobs reading 2.4 MB every 10 s in a 128-chunk vector read
>>>
>>>
>>> Is there something obvious that we missed in the configuration of
>>> the proxy-only server? How should we go about trying to find the
>>> bottlenecks here?
>>>
>>> Thanks,
>>> Alja
>>>
>

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1