Hi Jeremy,

Thanks for forwarding my message to people who might be interested in this.
I'm not familiar with HPS analysis infrastructure, so I don't have a firm
opinion on whether keeping HPS calibration data in a database is an optimal
solution. The original request came from HPS people, so I assume someone
thinks so.

The advantages of using a database include scalability and avoiding the
massive data duplication you described. Keeping all calibration data in the
resource section of your software bundle means it will be included in every
tarball sent to/from batch farm or grid machines. If this is a relatively
small amount of data that will be used by every analysis job for the next
few years, then it might be acceptable. On the other hand, I think it still
makes sense to access that data through the conditions framework or
something similar, so you don't accumulate a lot of analysis code that will
have to be modified once you get more data and your current calibration data
storage solution becomes impractical.
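
To illustrate the last point, here is a minimal sketch of analysis code that
reaches calibration constants only through a single interface, so the storage
behind it can change later without touching the analysis. Everything in it is
an assumption made up for illustration: CalibrationSource, BundledSource,
RunDependentSource and the "ecal.gain" constant are hypothetical names, not
part of org.lcsim or hps-java, and the in-memory map merely stands in for
whatever database backend might eventually be used.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: none of these types exist in org.lcsim or hps-java.
 * It only illustrates hiding calibration storage behind one interface.
 */
public class CalibrationAccessSketch {

    /** The only type that analysis code depends on. */
    public interface CalibrationSource {
        /** Returns the named calibration constant valid for the given run. */
        double getConstant(String name, int run);
    }

    /** Current situation: one set of constants shared by all runs. */
    public static class BundledSource implements CalibrationSource {
        private final Map<String, Double> constants;

        public BundledSource(Map<String, Double> constants) {
            this.constants = new HashMap<>(constants);
        }

        @Override
        public double getConstant(String name, int run) {
            Double value = constants.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No constant named " + name);
            }
            return value; // the run number is ignored: same constants for every run
        }
    }

    /** Possible future: run-dependent constants; a map stands in for a database. */
    public static class RunDependentSource implements CalibrationSource {
        private final Map<Integer, Map<String, Double>> byRun = new HashMap<>();

        public void put(int run, String name, double value) {
            byRun.computeIfAbsent(run, r -> new HashMap<>()).put(name, value);
        }

        @Override
        public double getConstant(String name, int run) {
            Map<String, Double> constants = byRun.get(run);
            if (constants == null || !constants.containsKey(name)) {
                throw new IllegalArgumentException("No constant " + name + " for run " + run);
            }
            return constants.get(name);
        }
    }

    /** Analysis code: identical no matter which source is plugged in. */
    public static double calibrate(CalibrationSource source, int run, double rawValue) {
        // "ecal.gain" is a made-up constant name used only for this sketch.
        return rawValue * source.getConstant("ecal.gain", run);
    }

    public static void main(String[] args) {
        CalibrationSource bundled = new BundledSource(Map.of("ecal.gain", 0.5));
        RunDependentSource perRun = new RunDependentSource();
        perRun.put(1, "ecal.gain", 0.5);
        perRun.put(2, "ecal.gain", 0.25);

        // The same calibrate() call works with either storage strategy.
        System.out.println(calibrate(bundled, 2, 100.0));
        System.out.println(calibrate(perRun, 2, 100.0));
    }
}
```

The calibrate() method never learns where the constants live, so moving from
bundled files to a run-dependent backend would only mean constructing a
different CalibrationSource.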

- Dmitry.

> -----Original Message-----
> From: McCormick, Jeremy I.
> Sent: Friday, September 14, 2012 12:37 PM
> To: Onoprienko, Dmitry; Neal, Homer A.; Johnson, Tony; Graf, Norman A.;
> Nelson, Timothy Knight
> Cc: Graham, Mathew Thomas; Uemura, Sho; Omar Moreno; Maurik Holtrop;
> hps-software
> Subject: RE: HPS conditions framework
>
> Hi,
>
> Thanks for the email.
>
> This is a discussion that should be shared on the hps-software mailing
> list, as there are several people working with this more closely who
> should be in the loop, namely Matt, Sho, and Omar. Maurik should also be
> aware of this because he's in charge of the HPS software effort. I've
> added all of these people and the list to this message.
>
> My opinion is that a MySQL database is overly complicated for what we're
> currently doing, and I don't see anyone needing anything like this for
> now. Recently a bunch of files were removed from the hps-detectors CVS,
> which is the source for HPS conditions, and put into the resources area of
> the hps-java project where they can be accessed as plain old Java
> resources. This data was from the test run, and the reason for doing this
> was that the conditions had to all be copied between the different
> detector model directories to be accessed, which, as you can imagine,
> involves an enormous amount of data duplication. And many of these data
> sets are being used to analyze ALL data from the test run across all runs,
> so putting them into a conditions system by run or time does not make that
> much sense.
>
> The problem with a database, as I see it, is that the user either has to
> run this themselves on their local machine, which is an extra amount of
> installation and setup, OR there has to be a shared instance that hps-java
> accesses, especially for jobs run on batch farms or the grid. (The
> database also has to be kept up to date, which is more complicated than
> the current method of 'cvs up' to get new text files.) Either one of these
> working methods could cause problems with how we currently do things. For
> one, many batch systems are not that friendly when it comes to connecting
> out to databases unless this is preconfigured by their admins to allow it.
> And having users run their own MySQL database just strikes me as an
> unnecessary headache. We already have enough troubles getting people to
> install and run the software on their own machines, and this would add
> another hurdle that frankly isn't necessary right now.
>
> For the analysis of test run data and for the foreseeable future, we are
> fine in terms of LCSim infrastructure in this area, in my opinion. Given
> that the next HPS "run" is not going to be for several years, at least, as
> the JLab schedule was pushed back, I'd like to see attention focused
> elsewhere, namely on JAS/AIDA development.
>
> We've started to collect a list of requested features and bug fixes here.
>
> https://confluence.slac.stanford.edu/display/hpsg/HPS+Software+Wish+List
>
> What you added to the current conditions system for accessing databases
> should be fine, at least for people just to play around with and
> experiment. There isn't anything on that list having to do with
> conditions, but people can add it if they wish or see the need. To me, the
> need for this kind of "proper" conditions system with run/time tagging and
> a database backend is just so far into the future that it is practically
> irrelevant right now.
>
> That's mostly just my opinion though. I wonder if others have anything to
> add to the discussion...?
>
> --Jeremy
>
> -----Original Message-----
> From: Onoprienko, Dmitry
> Sent: Friday, September 14, 2012 11:24 AM
> To: Neal, Homer A.; Johnson, Tony; McCormick, Jeremy I.; Graf, Norman A.;
> Nelson, Timothy Knight
> Subject: HPS conditions framework
>
> Hello Everyone,
>
> I just want to touch base regarding the HPS use of the org.lcsim
> conditions framework.
>
> A few months ago, I was asked by Tony (the original request came from
> Homer, I believe) to modify the framework and enable reading conditions
> from a MySQL database instead of a zip file. I was also asked to make it
> possible for conditions to be run-dependent. Since HPS was preparing for
> the beam test at that time, Norman and Jeremy stressed the importance of
> keeping changes to org.lcsim code to an absolute minimum. I also found
> that HPS was already using the conditions framework in the traditional
> way - getting data from zip files - so that capability had to be
> preserved.
>
> Given these constraints, I made a few minor changes to the standard
> implementations of ConditionManager and ConditionsReader, and created a
> hook that can be used to load an experiment-specific conditions reader.
> The changes are backward compatible and transparent to the user - the
> framework will still look for a conditions archive or directory with a
> name derived from the detector name it finds in the org.lcsim event.
> However, if the detector.properties file in that archive contains a line
> in "ConditionsReader: <ClassName>" format, the framework will instantiate
> the <ClassName> class and use it in place of the standard reader.
>
> There is an example of use in the org.lcsim.hps.conditions package
> (hps-java project, see package javadoc for instructions on how to run it).
> Since we knew nothing about the structure of the actual HPS conditions
> data, the example is just a trivial demo that shows how to fetch something
> from a database and retrieve it through the conditions framework.
>
> Now that people seem to be actually using conditions, I wonder if it's
> time to take the next step. Is the scheme of loading a custom conditions
> reader through an entry in the detector.properties file convenient? There
> are alternatives that would make the framework behavior a bit more
> transparent at the price of requiring more substantial changes to the
> codebase. Is anyone familiar with HPS data interested in working with me
> to get some more realistic conditions into the database?
>
> Best Regards,
> - Dmitry.