XROOTD-L Archives (XROOTD-L@LISTSERV.SLAC.STANFORD.EDU), September 2004
Subject: Re: xrootd/data management use cases from last year's Lyon workshop
From: Peter Elmer <[log in to unmask]>
Date: Fri, 10 Sep 2004 13:59:41 +0200
Content-Type: text/plain
Parts/Attachments: text/plain (368 lines)

  Hi Artem,

On Thu, Sep 09, 2004 at 05:48:47PM -0700, Artem Trunov wrote:
> > > Admin should be able to turn on debugging remotely, i.e. via a call to the
> > >   server.
> >
> >   Since the server can be restarted easily without crashing the clients,
> > the server could also be restarted with extra options in the config file.
> > I don't feel strongly that this has to be possible via the administrative
> > interface and don't even know how easy it would be to add this. Andy?
> 
> Config files are tailor controlled, which means you have to run tailor or
> wait for tailor's scheduled run to update the config, and then restart.
> Seems like a hassle; otherwise drop this use case.

  [Note for non-SLAC people: tailor/taylor is a SLAC-custom system for
distributing config files and software to the machines they maintain.]

  I don't feel strongly about it. Andy should comment, though. I can perhaps
see that being able to turn on debugging in-situ without restarting the
server might have some value in some particular (theoretical) situation where 
the server has begun to behave strangely.

> > > Admin should be able to read server's log file(s) remotely, via a call. Log
> > >   files include main log file, error messages, trace output, pid file.
> >
> >   IIRC, in the original discussion we had about this some of us felt that
> > this would be useful, but overkill, since other tools could be used.
> 
> I think overkill is to use 'ssh' and 'more' to examine up to 48 log files
> on 40 hosts. (48 is xrootd, olb, mlog, plog, slog, Slog x 8).

  [Note for non-SLAC people: the mlog/plog/slog/Slog files are all (I think)
associated with the staging system at SLAC and not the xrootd/olbd itself.]

  A couple of things to note:

   o the xrootd system will likely _never_ have access to the staging log
     files (mlog/plog/slog/Slog) so you'll probably have to face this 
     problem in any case. 

   o The administrative interface will connect to the xrootd (and not the
     olbd). I don't know if the olbd protocol would support providing _its_
     log file to xrootd, etc. etc.

  The whole thing starts to get rather ugly. In the end you will always have
some other log file (Ganglia?) that you want to look at which is on the
data server itself. This is a general problem, so there must be some tool out
there to harvest or display log files like this. If not, one can presumably
make something simple, as you have undoubtedly already done.

  (i.e. you can try to convince Andy, but I won't help you do it for this 
   one...)
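
  For what it's worth, here is a minimal sketch of the "something simple" I
have in mind, in Python, assuming passwordless ssh to the data servers; the
host names, log paths and line count are made-up placeholders:

    #!/usr/bin/env python
    # Minimal log-harvester sketch: pull the tail of a few log files from
    # each data server over ssh and print them locally. Host names, log
    # paths and the line count below are placeholders.
    import subprocess

    HOSTS = ["dataserver01", "dataserver02"]             # hypothetical hosts
    LOGS = ["/var/log/xrootd.log", "/var/log/olbd.log"]  # hypothetical paths
    LINES = 200

    for host in HOSTS:
        for log in LOGS:
            print("==== %s:%s ====" % (host, log))
            # Assumes key-based (passwordless) ssh access to the host.
            subprocess.call(["ssh", host, "tail", "-n", str(LINES), log])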

> > > Server should be able to log its host's and its process's CPU and memory
> > >   utilization and other useful parameters.
> >
> >   At the time of the workshop SLAC had no useful system monitoring (despite
> > having tens of millions of dollars of equipment). Since that time Yemi has
> > deployed the Ganglia monitoring at SLAC. Using some external agent like
> > Ganglia to monitor things like cpu and memory usage seems a better structure
> > (and is what we have at SLAC and other places). Artem, are you happy with
> > that?  (i.e. with no features in xrootd itself for this)
> 
> We don't have programmatic access to monitoring info, and therefore can
> not manipulate all figures at will. Also, other users of xrootd don't use
> Ganglia.

  Other users will then presumably use some other system for monitoring 
and/or alarms. I don't see any reason to bloat xrootd by making it a 
generalized monitoring system. There are much better external tools for
that. xrootd has to provide access to statistics/metrics _specific_ to
xrootd and it in fact does that. Some external system (like Ganglia) should
gather those, display them, archive them so time series can be examined, etc.
At SLAC Yemi is in fact doing this.

  The issue of programmatic access to Ganglia information is not related
to xrootd. We can continue that discussion, but not here...

> > > Admin should be able to dump/load server's configuration remotely.
> > > ===>24/7 availability is essential. Stopping 1000+ clients for some simple
> > >   tasks like reconfiguration is bad bad bad.
> >
> >   Well, I think we've done pretty well at SLAC in terms of 24/7 availability.
> > (CNAF in particular seems to have problems, though.)

  A side note on my comment (above) about CNAF: when I was writing up my
original posting CNAF started (again) to have problems. The issue was one
we have seen before: the CNAF BaBar xrootd/olbd system is a pure disk cache
and one of the data servers had gone down. While the xrootd/olbd and client
support handling this gracefully, CNAF has no back-end mass storage or other
means of replicating (or "reobtaining") files when a disk server goes down.
Thus any problem with a disk server results immediately in user complaints
about missing files.

  There are several solutions here:

   o Doubling the size of the disk cache and putting more than one copy of 
     every file on different servers in the disk cache (ok, I'm joking).

   o Using the mass storage at CNAF (apparently problematic and not currently
     foreseen)

   o Allowing the files to be read or "reobtained" in their entirety from some
     other Tier A center (e.g. via the proxy clients that Andy has described
     several times)

  It sounds like we may have the pieces necessary to try the last possibility
at some point in the not-too-distant future.

> >   I'm not so clear on why it is useful to dump the configuration remotely.
> > Artem, do you still feel strongly about this?
> 
> The ultimate goal of remote administration is not to log into any of the kan
> servers at all. If you need to log in to check what the config file is,
> this complicates your life.

  I guess I don't feel strongly about this one and, on reflection, can see
how it could be useful. Simply looking at the config file on disk won't
always tell you with what configuration the server was started as it may
have been overwritten with a newer version of the config file since the
server was started. You _might_ be able to backtrack through the log files
to when the server was started to look at the printout, but if it has
been running for many days that might not be so easy. (And in fact the
log files could even have been purged.) Andy should comment.

> > > Admin should be able to give a signal to dlb to rescan file system for
> > >   new/gone files.
> >
> >   The olb (once known as the "dlb") doesn't maintain any state on the data
> > servers, or have I misunderstood? I'm not sure what this means. The manager
> > olbd obviously does have a cache, but as I understand it, it also times
> > out entries older than 8 hours.
> 
> So the proposal is to make it admin-induced in addition.

  For new files, it should be sufficient to simply send a prepare with the
list of files (a "prepare" with the list rather than simply opening them
serially to avoid paying the 10s "wait" between each one). Thus this is
presumably doable. If the admin has in fact put the files on disk for some 
reason and expects that they will be accessed in the next 8 hours, it could
be worthwhile. Otherwise the users will just take care of this naturally
as they begin to access the files.
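
  Roughly, in pseudocode, what I mean by "one prepare for the whole list"
(send_prepare() below is just a hypothetical stand-in for whatever actually
issues the prepare request, kXR_prepare, to the redirector):

    # Sketch: announce newly placed files in one shot. send_prepare() is a
    # hypothetical placeholder for the code that actually issues a single
    # kXR_prepare request to the redirector; the point is that the whole
    # list goes in one request instead of opening each file serially and
    # paying the ~10 s "wait" per file.
    import sys

    def send_prepare(redirector, paths):
        # Placeholder transport: a real version would speak the xrootd
        # protocol (or call a client tool) with the full path list.
        print("would prepare %d files via %s" % (len(paths), redirector))

    def announce_new_files(redirector, listing_file):
        with open(listing_file) as f:
            paths = [line.strip() for line in f if line.strip()]
        send_prepare(redirector, paths)   # one prepare for the whole batch

    if __name__ == "__main__":
        announce_new_files(sys.argv[1], sys.argv[2])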

  For files which have been removed, it will presumably depend on why
they have been removed. When the client runs into the situation where
it: 

    o tries to open a file X via the redirector xrootd
    o is redirected to server Y because the manager olbd has 
      "file X -> server Y" in its cache
    o is told by server Y that it doesn't in fact have file X

it is supposed to go back to the redirector xrootd and open the file at
the redirector again, but this time with kXR_refresh set. In this case
the result will be:

   o the file is found on some other server and the client is redirected there
   o the client is redirected to some server which will stage it in again

  What exactly would the admin be trying to achieve after having removed
a file? If it is simply to save clients some time going through the
(ask/be-redirected/not-there/go-back-to-redirector-to-ask-and-refresh)
cycle, that could perhaps be useful, but it isn't critical as the client
will do the right thing. In practice you probably want some way to do
something like kXR_refresh without actually opening the file (kind of like
a kXR_forget...). The next client that comes in will then actually trigger
the system to find the file. Andy?
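
  In client-side pseudocode the cycle above looks roughly like this;
open_at() and NotHere are hypothetical stand-ins for the real client
machinery, and the refresh flag corresponds to kXR_refresh:

    # Sketch of the client-side cycle described above. open_at() and NotHere
    # are hypothetical stand-ins for the real client code; refresh=True
    # corresponds to setting kXR_refresh on the open.

    class NotHere(Exception):
        """Raised when the data server says it does not have the file."""

    def open_at(redirector, path, refresh=False):
        # Placeholder: the real client contacts the redirector, follows the
        # redirect, and either returns a handle or raises NotHere.
        raise NotImplementedError

    def open_with_retry(redirector, path):
        try:
            return open_at(redirector, path)      # normal open via redirector
        except NotHere:
            # Server Y didn't have the file: go back to the redirector and
            # ask again with kXR_refresh set so its stale entry is dropped.
            return open_at(redirector, path, refresh=True)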

> >   In practice, however, we've not found this necessary in BaBar. Artem, what
> > was the use case for this in the past?
> 
> Inhibiting federations for maintenance, preventing users from running on
> bad data.

  Personally I don't think it is the job of the data access system to prevent 
users from running on "bad" data (i.e. data deprecated because of "data 
quality" or because it has been reprocessed or whatever). That can get
extremely complicated very quickly as (a) the granularity of what is being 
rejected can be very small and (b) there may be completely legitimate reasons
for someone to access data declared "bad" (e.g. to determine how something
changed from "bad" data to "good" data). Some bookkeeping system unrelated
to the data access system should help users with making sure they run the
"right" jobs on the right data.

  There is definitely room for "policy" in terms of who can access what (and,
for example, who can cause which things to be staged), but my guess is that
it has to be a fairly coarse-grained thing (e.g. /store/R14/* files are okay,
but /store/R10/* are just too old, sorry, charlie...). The MPS stuff that
is there, as you know, does allow you to do some things at a per-file level,
like pinning them on disk. I'd be curious to hear what others think, though.
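
  Just to illustrate the sort of granularity I mean (this is not an existing
xrootd feature; the prefixes are the examples above and everything else is
made up):

    # Illustration only -- not an existing xrootd feature. A coarse,
    # path-prefix level policy of the kind described above.
    ALLOWED_PREFIXES = ["/store/R14/"]   # recent data: okay
    DENIED_PREFIXES = ["/store/R10/"]    # "just too old, sorry, charlie..."

    def access_allowed(path):
        if any(path.startswith(p) for p in DENIED_PREFIXES):
            return False
        return any(path.startswith(p) for p in ALLOWED_PREFIXES)

    assert access_allowed("/store/R14/run42/file.root")
    assert not access_allowed("/store/R10/run7/file.root")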

  The only analog of the Objy system's "inhibiting federations for
maintenance" here is taking down a data server. The "unsolicited
response" feature (via the administrative interface) is the place to deal
with that.

> > > DLB should be dynamically and remotely configured not to redirect requests to
> > >   specific hosts, either forever or for specified time.
> >
> >   I think this is just done by stopping the xrootd on the affected machines.
> > The olb can be configured not to accept requests from the manager if there
> > is no xrootd running. Is that sufficient?
> 
> No. A stopped xrootd needs to tell clients to come back in a certain time
> that the admin specifies (remotely). For example, if unix-admins need to
> reboot a machine to apply a patch, we'd rather have clients wait for 10
> minutes than redirect them and restage a file somewhere else.
> So we'd need to tell the redirector to hold those clients who need to
> access that host for 10 minutes.

  OK, I was reading your use case too literally. Again, the "unsolicited
response" stuff was foreseen to cover this class of use cases (but it is the
xrootd which does this, not the olbd). We should probably go through the
various sub-use-cases here and categorize them to make sure they are supported.

> > > Xrootd should not stop working if HPSS goes down.
> >
> >   Since it was just announced that HPSS is unavailable about 10 minutes before
> > I got to writing these lines, we'll see how this goes. I'm not sure we've yet
> > really gone through an extended HPSS outage, so we'll presumably learn some
> > things this time.
> 
> If xrootd needs to stage in a file and gets an error other than "file
> not existent", it should handle this gracefully, holding the client for some
> time (externally (and remotely! and dynamically!) configurable).

  The claim is that this is the case. This isn't really xrootd itself, though,
but the staging software. (If it starts to return lots of "file not
existent" messages, there isn't much xrootd can do about it.) I've not yet
looked at the log files; do we know what happened during the period of the
HPSS outage the other day? Did it handle it gracefully?

> > > When a file system on a host crashes, xrootd should automatically recover.
> > >   It should report that the FS is down.
> >
> >   How is it supposed to recognize that there is a problem with the filesystem?
> 
> it gets a distinct return code from open, seek, read, etc.

  Could you be more specific about which return codes it would get to know 
that it is a "filesystem problem"? (Which I read as "hardware problem", so
perhaps you should be more specific about which filesystem problems you 
mean.)

> > > DLB should be checking xrootd "health" of a data server and its
> > >   filesystems as a part of the load measure. Should report if it finds
> > >   something wrong. Anything that prevents xrootd or dbl from doing its
> > >   job, like network problems or AFS troubles, should be reported.
> >
> >   Again, it isn't clear to me what exactly should be monitored. Can you
> > give examples?
> 
> If one of the file systems is performing significantly worse than another
> one, this should be noticed. Just recently a disk on bbr-xfer05 was very
> slow; no one noticed, only Remi did, and I don't know how.

  Why in the end was the disk slow?

> >   Artem, do you agree that this isn't the job of the xrootd system itself, but
> > of something like Ganglia? (Or whatever, something designed to do monitoring
> > and alarms of systems.) We shouldn't reinvent that wheel.
> 
> I don't care what does the monitoring and alarming. I gave you some
> reasoning for close-coupling it with xrootd. Again: if there is an
> application that provides data access, it should monitor data access
> related performance and alarm when data access has some problems.
> Note again that we don't have any more or less convenient, not
> to say sophisticated, alarming.

  Then we should have the separate discussion about how to deal with the
alarms and monitoring. IMO all xrootd should do is report basic statistics
about itself or which it gathers in doing its job. Some other entity should 
accumulate those and implement the logic which decides whether to "report" 
something or (for example) shut down a particular data server or whatever.

  We may be talking past each other here as you keep using words like "report",
"monitor" and "alarm". There are tools to do that. xrootd shouldn't reinvent
it, IMO. (This is sort of a policy vs mechanism discussion.)

  Please take a look at the actual statistics which xrootd can currently
provide when queried (look for the "kXR_query" section in the protocol
document). Are there other _specific_ things (_incidental_ to its normal 
operation) which xrootd could or should collect and report as part of 
those statistics? If it is something non-incidental and non-xrootd specific
some external tool is likely to be better for the job.
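
  To illustrate the division of labour I'm arguing for: something external
polls the xrootd statistics and forwards a few numbers to the monitoring
system. In the sketch below fetch_xrootd_stats() is a hypothetical stand-in
for a kXR_query-based client call, the metric names are made up, and the
gmetric call assumes a Ganglia installation:

    # Sketch of an external poller: ask xrootd for its own statistics and
    # hand a few of them to the site monitoring system (Ganglia's gmetric
    # here). fetch_xrootd_stats() is a hypothetical stand-in for a client
    # call using kXR_query; the metric names are made up.
    import subprocess
    import time

    def fetch_xrootd_stats(host):
        # Placeholder: a real version would issue a kXR_query statistics
        # request and parse the reply into a {name: value} dictionary.
        return {"xrootd_connections": 0, "xrootd_bytes_out": 0}

    def publish(stats):
        for name, value in stats.items():
            # Assumes Ganglia's gmetric command is available on this host.
            subprocess.call(["gmetric", "--name", name,
                             "--value", str(value), "--type", "uint32"])

    while True:
        publish(fetch_xrootd_stats("some-redirector"))   # hypothetical host
        time.sleep(300)                                  # every 5 minutes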

  Once we have gotten past that, we could then look to see what could be
done with Ganglia and alarms at that level.

> >   As to the "reporting to the administrator" part of this use case, we
> > decided to make the "alarm" mechanism external to the xrootd system. This
> > should be handled by something else (e.g. like alarms with Ganglia, say).
> > Artem, is that sufficient?
> 
> Maybe, but the idea is that xrootd will detect error conditions
> immediately, while any external system will need some time. When xrootd
> detects an error condition, it can react to it by adjusting something, e.g.
> turning itself off, while an external system will merely notify someone.
> Besides, we don't really have any kind of alarm system. This is actually a
> good thing for brainstorming, since we need to clearly define what we
> monitor and react on, and what the unix-admins' responsibilities are. I've
> always wondered why we have to tell the u-a about dead hosts and file
> systems, and not vice versa.

  Well, part of the problem was the lack of something like Ganglia at SLAC.
I agree that it would be good to discuss the specific failure modes. Could 
you talk to the others at SLAC and try to make a list of things that have 
happened? (And, if possible, what a possible reaction would be if the same
thing happens in the xrootd/olbd system.)

> > > Reporting: dbl should be able to send messages to some other application
> > >   for further error handling.
> > > ===> Reporting error conditions in a timely manner is essential. It doesn't
> > >   make a lot of sense to build another monitoring system, if dbl is
> > >   already doing so.
> 
> >   I disagree. A real (complete, full) monitoring system should be used. The
> > olbd (once called "dlb") does a very limited set of things as part of its
> > load balancing job.
> 
> I am talking about a system that reacts to errors, not one that monitors
> them. So what if a filesystem goes down at 2 am - I am less interested in
> receiving an alarm; I'd rather see xrootd reconfigured and restarted to
> avoid using that filesystem. Another example: a user job crashes and gives
> the message "file not existent". This leaves the user wondering why. If
> xrootd could not only print this error to the client and its log, but also
> pass it to some intelligent error-processing system, such a system could
> attempt to find out why the file is missing and send the user a more
> detailed explanation and suggestions.

  Hmm, perhaps something like: 

   Here is Artem's telephone number, call him and ask him to take a look. He's
   probably up anyway.

;-)

  Let's finish the discussion of what xrootd should and should not do. Then
we can talk about interaction with users, etc.

> > > Testing use case
> > > ----------------
> > >
> > > Dlb sensors should be able to simulate various load conditions on a host in
> > > order to test its functionality.
> >
> >   I'm not sure how we could simulate load conditions _internal_ to the olbd
> > in a way that isn't so artificial that it doesn't really test anything.
> > What you want could however presumably be accomplished by providing dummy
> > scripts to the "olb.perf" directive.
> 
> So, the scripts should be able to simulate the load.

  Since the output of the script is really trivial (5 numbers), I would
hesitate to complicate the existing script (XrdOlbMonPerf) by trying to
foresee the full set of test cases someone might want and implementing them
there. That would be confusing.
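
  If someone does want to play with this, a separate dummy program handed to
the olb.perf directive seems the right place for it. A minimal sketch,
assuming the interface really is just "write the five load numbers to stdout
once per interval" as described above (check the olbd documentation for the
exact meaning and valid range of the numbers):

    #!/usr/bin/env python
    # Minimal dummy performance reporter for experimenting with olb.perf.
    # Assumes the interface is simply "write the five load numbers to stdout
    # once per interval"; check the olbd documentation for the exact meaning
    # and valid range of each number before relying on this.
    import sys
    import time

    INTERVAL = 60                     # seconds between reports
    FAKE_LOAD = [10, 20, 30, 40, 50]  # the five simulated load figures

    while True:
        print(" ".join(str(n) for n in FAKE_LOAD))
        sys.stdout.flush()            # make sure the olbd sees each line
        time.sleep(INTERVAL)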

                                   Pete

-------------------------------------------------------------------------
Peter Elmer     E-mail: [log in to unmask]      Phone: +41 (22) 767-4644
Address: CERN Division PPE, Bat. 32 2C-14, CH-1211 Geneva 23, Switzerland
-------------------------------------------------------------------------

