Hi,

 That is quite a complex setup, and I am not sure I understand it.

 What are you using as a cache?
 Is XrdCeph speaking xrootd or http in your case?
 What is xrdcp talking to? To the cache or to the backend XrdCeph?
 What is the advantage of putting a cache in front of XrdCeph?

 My suspicion is that one of these components is opening
a new connection for every chunk requested; this would
explain the effect that you see. Of course I may be wrong,
and you can surely find the final answer in the debug logs.

 If this is true, then configuring unreasonably large internal buffers
would just be a workaround for a more important problem to solve.
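
 To illustrate why a fixed cost per chunk would produce a curve like the
one in the table quoted below, here is a minimal sketch. It is only a toy
model, not XRootD code: the function name effective_throughput and both
parameters (a fixed per-chunk overhead and a raw link rate) are
assumptions made for illustration.

```cpp
#include <cstddef>

// Hypothetical model, not taken from XRootD: every chunk pays a fixed
// setup cost (for example, opening a new connection) before its bytes
// are transferred.
//   chunk_bytes  : size of one request, in bytes
//   overhead_s   : assumed fixed cost per chunk, in seconds
//   link_bytes_s : assumed raw transfer rate, in bytes per second
// Returns the effective throughput in bytes per second.
double effective_throughput(std::size_t chunk_bytes, double overhead_s,
                            double link_bytes_s) {
    const double bytes = static_cast<double>(chunk_bytes);
    // Time for one chunk = fixed overhead + time spent moving the bytes.
    const double seconds_per_chunk = overhead_s + bytes / link_bytes_s;
    return bytes / seconds_per_chunk;
}
```

 With hand-picked values of 0.25 s overhead and 80 MB/s link rate (both
guesses, not measurements), this gives roughly 4 MB/s for 1 MiB chunks and
roughly 60 MB/s for 64 MiB chunks. In such a model larger chunks only
amortize the overhead; they do not remove it.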

Fabrizio




On 13/07/2021 13:53, James William Walder wrote:
> Hi Fabrizio, 
>   Thanks for the response. 
> Our Gateways are configured with a memory-cache proxy with a server on
> the same node, configured using XrdCeph to access the Ceph/rados storage
> (as object store). 
> Ceph is configured with a 64MiB stripe size.
> 
> There are some outstanding concerns on optimal configuration of a
> memcache, which may or may not be related.
> 
> As a figure-of-merit comparison (take the relative comparison of numbers
> more seriously than absolute),
>      using root:// with an xrdcp download and varying the client
> XRD_CPCHUNKSIZE, speeds range as follows:
> 
> XRD_CPCHUNKSIZE  |  Download speed (root://)
>  1 MiB           |  ~  4 MB/s
>  4 MiB           |  ~ 11 MB/s
>  8 MiB           |  ~ 20 MB/s
> 16 MiB           |  ~ 35 MB/s
> 32 MiB           |  ~ 50 MB/s
> 64 MiB           |  ~ 60 MB/s
> 
> My feeling was, if http is restricted to 1MiB (or lower) transfer sizes,
> then we’re not running optimally. 
> I’m happy to believe there are other areas for improvement/understanding
> (e.g. the mem-cache, which Tom Byrnne brought up in this mail list), 
> but trying to do a direct comparison between root and webdav seemed like
> a good start.
> 
> Thanks in advance,
> James
> 
> 
> 
> 
> 
> 
> 
> 
>> On 12 Jul 2021, at 14:46, Fabrizio Furano wrote:
>>
>> Hi,
>>
>> of course you are free to experiment, however my feeling is that
>> this may just be an attempt to hide some other issue elsewhere.
>> Which system do you think is sensitive to this size?
>>
>> Fabrizio
>>
>> On 09.07.21 22:43, James William Walder wrote:
>>> Hi,
>>>   I’d like to investigate various ‘block size’ variations in TPC-HTTP
>>> (especially writes); 
>>> (by which I mean the size of the write that is passed to the server from
>>> the proxy, I hope that makes sense … ) 
>>>
>>> By default, I think writes are written at 1MiB boundaries.
>>> From the code, I’m hoping I could make the following changes (see diff
>>> below), but it’s not clear if there are some subtle effects to be aware
>>> of, or if I’ve missed some obvious problem (e.g. making sure they
>>> are still written at sensible sizes). 
>>>
>>> The motivation is that I’m observing slow transfers in http-tpc
>>> (relative to root, for example), and would like to test the effect of
>>> ‘block size’ changes, to which we are quite sensitive.
>>> Any thoughts would be appreciated.
>>>
>>> Thanks in advance,
>>> James
>>>
>>>
>>> (Based on 5.3.0-1) 
>>>
>>> diff --git a/src/XrdTpc/XrdTpcStream.cc b/src/XrdTpc/XrdTpcStream.cc
>>> index 76c7d6e..c2ff92d 100644
>>> --- a/src/XrdTpc/XrdTpcStream.cc
>>> +++ b/src/XrdTpc/XrdTpcStream.cc
>>> @@ -80,7 +80,7 @@ Stream::Write(off_t offset, const char *buf, size_t size, bool force)
>>>      // If this is write is appending to the stream and
>>>      // MB-aligned, then we write it to disk; otherwise, the
>>>      // data will be buffered.
>>> -    if (offset == m_offset && (force || (size && !(size % (1024*1024))))) {
>>> +    if (offset == m_offset && (force || (size && !(size % (16*1024*1024))))) {
>>>          retval = WriteImpl(offset, buf, size);
>>>          bytes_accepted = retval;
>>>              // On failure, we don't care about flushing buffers from memory --
>>>
>>> diff --git a/src/XrdTpc/XrdTpcTPC.cc b/src/XrdTpc/XrdTpcTPC.cc
>>> index d800dfa..fff4a1e 100644
>>> --- a/src/XrdTpc/XrdTpcTPC.cc
>>> +++ b/src/XrdTpc/XrdTpcTPC.cc
>>> @@ -26,7 +26,7 @@ using namespace TPC;
>>>  uint64_t TPCHandler::m_monid{0};
>>>  int TPCHandler::m_marker_period = 5;
>>>  size_t TPCHandler::m_block_size = 16*1024*1024;
>>> -size_t TPCHandler::m_small_block_size = 1*1024*1024;
>>> +size_t TPCHandler::m_small_block_size = 16*1024*1024;
>>>  XrdSysMutex TPCHandler::m_monid_mutex;
>>>
>>>  
>>>
>>>  XrdVERSIONINFO(XrdHttpGetExtHandler, HttpTPC);
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the XROOTD-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
> 
