Hi,

As I wrote on Slack, we could add this to the site-local conf storage.xml
without a code change.
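
For reference, here is a minimal Python sketch of how the opaque parameters
discussed in the thread below would end up on a URL. It is only an
illustration: the helper name, the host names, and the `extra_source` flag
are made up for this example, and it is not how XrdAdaptor actually builds
its requests. It follows Brian's suggestion of sending "resel" only when the
job wants an additional source, and dropping it when retrying after an error.

```python
def retry_url(url, tried_hosts, extra_source):
    """Append tried= (and optionally triedrc=resel) to an XRootD URL.

    Hypothetical helper for illustration only.
    url          -- e.g. "root://redirector.example.org//store/file.root"
    tried_hosts  -- servers the client has already been sent to
    extra_source -- True when asking for an additional (faster) source,
                    False when retrying because of an error
    """
    if not tried_hosts:
        return url  # nothing tried yet, leave the URL untouched

    # Opaque parameters start after '?'; further ones are joined with '&'.
    sep = "&" if "?" in url else "?"
    result = f"{url}{sep}tried={','.join(tried_hosts)}"

    # Only flag the request as a reselection when it is not error recovery.
    if extra_source:
        result += "&triedrc=resel"
    return result


if __name__ == "__main__":
    base = "root://redirector.example.org//store/file.root"
    print(retry_url(base, ["cache01.example.org"], extra_source=True))
    print(retry_url(base, ["cache01.example.org"], extra_source=False))
```

The first call yields a URL ending in `tried=cache01.example.org&triedrc=resel`,
the second one carries only `tried=`.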

On Tue, 7 Jan 2020 at 18:13, Bockelman, Brian <[log in to unmask]> wrote:

> Hi Matevz,
>
> It's been a while since I read this thread.  Upon re-reading, is it
> really necessary to expose this as an option?  If CMSSW correctly manages
> when to add 'reseg' (and when _not_ to add it), could it ever reasonably
> be harmful?
>
> Brian
>
> > On Jan 7, 2020, at 3:37 PM, Matevz Tadel <[log in to unmask]> wrote:
> >
> > Hi Andy, Brian,
> >
> > I'm trying to create a ticket for CMSSW XrdAdaptor to use triedrc=.
> >
> > For the xcache redirector, it is clear one should use triedrc=resel
> > (local reselection).
> >
> > With XrdAdaptor this will become the default for other multi-source
> > requests ... and will thus solve the FNAL issue, if the jobs talk to the
> > local FNAL redirector.
> >
> > 1. Brian: how should we introduce the option for triedrc=reseg (global
> > reselection) into XrdAdaptor? Should this come from outside in some
> > fashion, like an env var?
> >
> > 2. Andy: I was tracing through the code to remember how this is all done
> > and noticed that kYR_tryRSEG never gets used in the code (other than for
> > its definition and for setting it when reseg is given). Is there
> > something missing here, or do I just not get the whole global reselection
> > thing (again)?
> >
> > Matevz
> >
> > On 2019-05-09 15:36, Andrew Hanushevsky wrote:
> >> Hi Brian,
> >> OK, you got your wish along with a way to bypass it (the default being
> >> "keep it local"). Now, are you going to be at the XRootD Workshop? If
> >> not, can someone be there to give a presentation on the latest
> >> developments on SciTokens et al?
> >> Andy
> >> -----Original Message-----
> >> From: Bockelman, Brian
> >> Sent: Monday, May 06, 2019 1:39 AM
> >> To: Andrew Hanushevsky
> >> Cc: Matevz Tadel; Michal Kamil Simon; xrootd-dev
> >> Subject: Re: Proposal for new opaque URL parameter using= complementing
> >> tried=
> >> Hi Andy,
> >> In this case - there's good reason to not send clients offsite (even if
> >> the offsite server is providing better performance, the WAN costs
> >> more...) when there's a perfectly good copy onsite.  We can be sure to
> >> drop the "resel" from the "triedrc" when the job is looking for an
> >> additional source because of an error instead of wanting faster sources.
> >> I think it would be useful to keep the "file not found" code to only
> >> trigger when the file is actually not found.
> >> Brian
> >>> On May 6, 2019, at 7:35 AM, Andrew Hanushevsky <[log in to unmask]> wrote:
> >>>
> >>> Well, isn't the point of reselection to find the best possible site,
> >>> which could be offsite? We could give you an option to keep it local
> >>> but you would need to add that to the config file.
> >>>
> >>> On Sun, 5 May 2019, Bockelman, Brian wrote:
> >>>
> >>>> Hi Matevz,
> >>>>
> >>>> The other thing that should go into the cmsd is to avoid doing a
> >>>> "redirect on file not found" for reselection.
> >>>>
> >>>> This would help immensely in cases like FNAL, which uses this for all
> >>>> jobs, causing the multi-source CMSSW to pull data that is onsite from
> >>>> offsite, due to reselection.
> >>>>
> >>>> (After telling them for 5 years to change, I guess we can tweak the
> >>>> software ;) )
> >>>>
> >>>> Brian
> >>>>
> >>>> Sent from my iPhone
> >>>>
> >>>>> On May 3, 2019, at 6:57 PM, Matevz Tadel <[log in to unmask]> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Andy realized that an option for this already exists -- triedrc=resel
> >>>>>
> >>>>> Andy implemented a change in cmsd that allows disabling the opening
> >>>>> of a new file on reselection; it goes under cmsd.sched nomultisrc.
> >>>>>
> >>>>> Brian, now we have to propagate this into XrdAdaptor.
> >>>>>
> >>>>> Matevz
> >>>>>
> >>>>>> On 4/18/19 10:45 AM, Andrew Hanushevsky wrote:
> >>>>>> After some discussion with Matevz, we decided to simplify this. So,
> >>>>>> it won't be exactly what was outlined but will be functionally the
> >>>>>> same. This requires some development in the cmsd. That said, an
> >>>>>> issue should be cut.
> >>>>>> Andy
> >>>>>>> On Thu, 18 Apr 2019, Michal Kamil Simon wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> It sounds reasonable to me :-)
> >>>>>>>
> >>>>>>> Matevz: could you create an issue in GitHub so we don't lose
> >>>>>>> track of this topic? ;-)
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Michal
> >>>>>>> ________________________________________
> >>>>>>> From: [log in to unmask] [[log in to unmask]] on behalf of
> >>>>>>> Bockelman, Brian [[log in to unmask]]
> >>>>>>> Sent: 17 April 2019 03:59
> >>>>>>> To: [log in to unmask]
> >>>>>>> Cc: xrootd-dev
> >>>>>>> Subject: Re: Proposal for new opaque URL parameter using=
> >>>>>>> complementing tried=
> >>>>>>>
> >>>>>>> Yes!  We definitely could benefit from this on the CMS side!
> >>>>>>>
> >>>>>>> Sent from my iPhone
> >>>>>>>
> >>>>>>>> On Apr 15, 2019, at 5:21 PM, Matevz Tadel <[log in to unmask]> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> [This is mostly for Andy, Brian, and Michal.]
> >>>>>>>>
> >>>>>>>> In the context of an XCache cluster used by CMSSW multi-source
> >>>>>>>> jobs, there is an issue with CMSSW jobs requesting the opening of a
> >>>>>>>> second source on the cache cluster using the tried= opaque
> >>>>>>>> parameter to point to the cache server already in use. This leads
> >>>>>>>> to the creation of another replica of the same file in the cache
> >>>>>>>> cluster.
> >>>>>>>>
> >>>>>>>> The cache still needs to honor tried= in case there is a problem
> >>>>>>>> with the existing server. However, asking for a new "extra" server
> >>>>>>>> in the context of a cache does not make much sense.
> >>>>>>>>
> >>>>>>>> To distinguish these two conditions I propose to introduce a new
> >>>>>>>> opaque directive, "using=", used to signal to the redirector that
> >>>>>>>> the client is already using the listed servers.
> >>>>>>>>
> >>>>>>>> On the cmsd side this would be accompanied by a cms.dfs multisource
> >>>>>>>> count (a "sister" option to cms.dfs retries). These two would then
> >>>>>>>> control how many errors and parallel accesses are allowed for a
> >>>>>>>> client session.
> >>>>>>>>
> >>>>>>>> Does this make sense?
> >>>>>>>>
> >>>>>>>> Matevz
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ########################################################################
> >>>>>>>> Use REPLY-ALL to reply to list
> >>>>>>>>
> >>>>>>>> To unsubscribe from the XROOTD-DEV list, click the following link:
> >>>>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1
> >>>>>
> >>>>
> >
>
>

-- 
---------------------------------------
Justas Balcas
Caltech CMS Group
CIT Downs-Lauritsen 239
CERN B32/3-A09 (72531)
