On Jan 5, 2015, at 8:51 PM, McCormick, Jeremy I. <[log in to unmask]> wrote:

> Hi, Matt.  
> 
> I'm proposing we use this as the ECAL Eng Run recon steering, to modify the one you have in SVN....
> 
> http://www.slac.stanford.edu/~jeremym/hps/EngineeringRun2014ECalRecon.lcsim
> 
> It replaces the three clustering Drivers with those from the new ecal.cluster package.  
> 
> I'm testing the "new" Drivers right now on file 0 of run 3393 and it seems fine.  I'll look at the LCIO files tomorrow.
> 
> The collection names I changed to...
> 
> EcalClusters
> EcalClustersLegacy
> EcalClustersGTP
> 
> in order to better reflect the algorithms that are being used.
> 
> EcalClusters is the output from the default recon clusterer.  (The one implemented from CLAS-Note-2005-001 and HPS Note 2014-001 by Holly.)
> 
> EcalClustersLegacy is the old, simple Test Run algorithm.
> 
> EcalClustersGTP is of course the GTPOnlineClusterer.
> 


Sounds good...

> I also do not include the raw data in the output by default with something like this...
> 
> <driver name="LCIOWriter" type="org.lcsim.util.loop.LCIODriver">
>    <writeOnlyCollections>EcalCalHits EcalClustersLegacy EcalClusters EcalClustersGTP</writeOnlyCollections>
>    <outputFilePath>${outputFile}.slcio</outputFilePath>
> </driver>
> 
> We can keep the raw data collections for now if you think they're going to be generally useful, but I believe it adds a lot to the file size.  And there is almost no additional information in the mode 7 hits compared to the CalorimeterHits.
> 
> Similarly, perhaps we should only run the ReconClusterer, and if other collections are needed for an analysis, that particular Clusterer's Driver could be inlined in the user analysis job.  I doubt most people who look at this data are really going to want to look at output from three different clustering algorithms.  (I could be wrong there, though!)
> 
> I think adding additional Cluster collections does increase the file sizes by a lot, as the Cluster class has a lot of information in its API.  From my count there are 14 persisted fields, many of which are arrays or lists.  So we really should think about whether it is worth the overhead here, as these LCIO files are already kind of big.

My default thinking for the engineering run is “keep everything” since it’s not a huge chunk of data…that said, I don’t mind dropping the raw hits if there really isn’t any information loss.  For clusters, fine with dropping “legacy”, but I think we should keep GTP clusters since measuring the trigger efficiency run-by-run is something we need to do.  
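
For illustration, the writer block under this proposal might look like the sketch below. It assumes the same LCIODriver configuration quoted above, with only EcalClustersLegacy removed from the list; nothing else here is confirmed, and the final set of collections is still open.

```xml
<!-- Sketch only: same driver as quoted above, minus the legacy cluster collection. -->
<!-- Raw hit collections stay out of the output because they are not listed here. -->
<driver name="LCIOWriter" type="org.lcsim.util.loop.LCIODriver">
   <writeOnlyCollections>EcalCalHits EcalClusters EcalClustersGTP</writeOnlyCollections>
   <outputFilePath>${outputFile}.slcio</outputFilePath>
</driver>
```

Keeping EcalClustersGTP in the list preserves what is needed for the run-by-run trigger efficiency measurement.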


> 
> BTW, I'm seeing periodic errors from the EVIO converter.  Did you see these when running at JLAB?  Do you think it is anything to worry about?  I'm not sure if this is corrupted data or if we are not handling something correctly.


YES! YES! YES!  Should have put this in the list…we need to figure out what’s causing these errors!
