Hello Homer and Jeremy,

It seems we all have the right ideas, and very similar ones, about how
the data analysis should be done. The confusion, it seems to me, comes
from the definitions of "analysis" and "DST". When I brought up the
question of DSTs about a year ago, and even sent out a possible format
(attached document), I basically wanted what Jeremy said in the second
sentence after (3): physics objects only. What Omar showed today was
very different from what I would describe as a DST. I understand Matt's
point that in some cases you will need fine details, but I am not sure
that everyone will need that level of detail. So I still think that if
we are talking about DSTs, the format should be "physics objects only".
And if Omar can make use of what I proposed a year ago, that would be
great.
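
To make "physics objects only" a bit more concrete, here is a minimal
sketch in plain C++ of a flat event record built from simple arrays.
Everything in it is an assumption for illustration: the names (DstEvent,
trk_px, cl_energy, track_momentum, total_cluster_energy), the choice of
fields and the units are hypothetical, and none of it is taken from the
attached proposal, from LCIO, or from any existing HPS code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical "physics objects only" event record.
// Parallel arrays hold one entry per object; no hit-level data is stored.
struct DstEvent {
    // Event information.
    int run = 0;
    int event = 0;

    // Tracks: momentum components (assumed GeV) and charge.
    std::vector<double> trk_px, trk_py, trk_pz;
    std::vector<int> trk_charge;

    // Calorimeter clusters: energy (assumed GeV) and position (assumed mm).
    std::vector<double> cl_energy, cl_x, cl_y;

    void add_track(double px, double py, double pz, int charge) {
        trk_px.push_back(px);
        trk_py.push_back(py);
        trk_pz.push_back(pz);
        trk_charge.push_back(charge);
    }

    void add_cluster(double energy, double x, double y) {
        cl_energy.push_back(energy);
        cl_x.push_back(x);
        cl_y.push_back(y);
    }

    std::size_t n_tracks() const { return trk_px.size(); }
    std::size_t n_clusters() const { return cl_energy.size(); }
};

// Momentum magnitude of track i, computed from the flat arrays.
double track_momentum(const DstEvent& ev, std::size_t i) {
    assert(i < ev.n_tracks());
    return std::sqrt(ev.trk_px[i] * ev.trk_px[i] +
                     ev.trk_py[i] * ev.trk_py[i] +
                     ev.trk_pz[i] * ev.trk_pz[i]);
}

// Sum of all cluster energies in the event.
double total_cluster_energy(const DstEvent& ev) {
    double sum = 0.0;
    for (double e : ev.cl_energy) {
        sum += e;
    }
    return sum;
}
```

The only point of the sketch is that such a record carries event
information plus track and cluster quantities and nothing at the hit
level; which fields a real DST should hold is exactly what we still need
to agree on.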

As for general analysis, if we stick with (1), then we will make the
large number of collaborators who are used to doing analysis in ROOT
quite unhappy. I understand that duplicating the processed data in many
formats is also not a reasonable approach. So, if (2) means (sorry for
my ignorance) that we can have some kind of "portal" that connects the
LCIO recon file to ROOT, then it is probably the best way to go.

Again, sorry if I am misinterpreting the issue and/or repeating what was 
already clear from your emails.

Regards, Stepan

On 3/6/13 6:10 PM, McCormick, Jeremy I. wrote:
> Hi, Homer.
>
> Thanks for the thoughts.
>
> My view is that user analysis has three possible pathways which make sense to consider:
>
> 1) Pure Java analysis using lcsim and outputting histograms to AIDA files, viewable in JAS.
>
> 2) LCIO/ROOT analysis, reading in the LCIO recon files, looping over these events, and making histograms from a ROOT script.
>
> 3) Pure ROOT analysis, operating on a ROOT DST file.
>
> I don't really think that we need a DST containing all of the information which is already present in the final LCIO recon file.  This level of duplication is not desirable.  Rather, the ROOT DST should contain physics objects only, e.g. the equivalent of LCIO ReconstructedParticles, Tracks, and Clusters, along with event information.  This should be sufficient for doing a pure physics analysis, i.e. good enough for most users.  It is also likely that it could be represented using simple arrays rather than classes, which to me is desirable for this kind of format.
>
> If one wants to look at the associated hits of the tracks, or something similarly detailed, then it seems to me that it would be better to use the #1 and #2 approaches, as we can then avoid "reinventing the wheel" by making ROOT files that mimic the structure of the existing LCIO output.  This approach would require working from the LCIO output, but I really don't see a problem there.  It is not onerous at all.  The API is straightforward and well-documented, and examples can be provided.  There is already a simple analysis script in my examples that you linked which plots information from Tracks in an LCIO file using ROOT histogramming.  Similar plots could easily be made for the hits, etc.
>
> I suppose one could demand that all this data be put into ROOT including the hits, but you're left with the same problem.  Someone still has to learn the API of whatever classes are used to store the data, and the class headers also need to be loaded to interpret the data.  Whether that format is LCIO or ROOT, it is essentially the same level of knowledge that would be required.  My feeling is actually that this will be more difficult/cumbersome to work with in ROOT than in LCIO.  I wonder why we can't just go with what we already have, i.e. the LCIO API, rather than invent something analogous which does not seem to serve a very clear purpose.  One can already use what's there in the linked example to look at the full events, so can we start there and see how far we get?
>
> If someone has a clear use case where pure ROOT data is needed at the lowest level of detail, I would consider this request, but I have seen nothing concrete so far along these lines.
>
> --Jeremy
>
> -----Original Message-----
> From: Homer [mailto:[log in to unmask]]
> Sent: Wednesday, March 06, 2013 2:51 PM
> To: Jaros, John A.; Graham, Mathew Thomas; McCormick, Jeremy I.; Graf, Norman A.; Moreno, Omar; Nelson, Timothy Knight
> Subject: DSTs and work on slcio files using C++
>
> Hi,
>
> I decided not to comment during the meeting because it might have created more contention and I also wanted to hear Jeremy's, Norman's and Omar's responses first before throwing this out there. That said, from the point of view of someone who has been doing lcsim SiD analysis on slcio files I find the problems with using the two formats in HPS a little strange. For SiD we take slcio files and then run jet clustering and flavor tagging using C++ code in the lcfi and lcfi+ packages. For the flavor tagging we write out root files for lcfi+ running the TMVA training and then for both the jet clustering and the flavor tagging we write out slcio files. I believe Malachi has done his whole analysis in C++ as a Marlin processor. I had also successfully tested reading slcio files in ROOT using a recipe provided by Jeremy. I dropped using it when I realized that it was quite simple to write the analysis in java. Perhaps one solution is to stick to doing all development, even for the DST, in java/lcsim and to just provide examples of how to access the data from C++/ROOT reading slcio files. Jeremy had documented much of this long ago at:
>
> https://confluence.slac.stanford.edu/display/hpsg/Loading+LCIO+Files+into+ROOT
>
> If we just provide some examples, wouldn't that help to at least put out the current fires? This would also avoid having to support numerous extra sets of data (DSTs and microDSTs in both formats with multiple passes and subsets)??
> Maybe I'm wrong but I think one can provide simple recipes or modules for accessing any of the slcio file contents in ROOT.
>
>      Homer
>
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the HPS-SOFTWARE list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1

