Thanks...very useful information!

-----Original Message-----
From: Kyle McCarty [mailto:[log in to unmask]] 
Sent: Tuesday, December 16, 2014 2:55 PM
To: McCormick, Jeremy I.
Cc: Holly Vance
Subject: Re: cleaning up the ECAL clustering code

Hello Jeremy,


The two clustering algorithms that are mine are GTPEcalClusterer and GTPOnlineEcalClusterer. Both are implementations of the most current hardware clustering algorithm. GTPEcalClusterer is the original and is used in the readout simulation to simulate the hardware clustering on Monte Carlo data. GTPOnlineEcalClusterer is a work-in-progress version designed to run on readout data instead. There are two because the clustering algorithm uses a time window to analyze hits and determine which ones fall into a cluster and which do not. For Monte Carlo, we treat each event as a 2 ns window, so the algorithm builds its time buffer of hits by storing events and treating each one as 2 ns. The readout just outputs a large number of hits that were within a certain time window, and each individual event does not represent any particular length of time. This means that each event must be considered independently, and a time buffer must be generated from the hits within the event using their time stamps instead. Since this is a fairly significant difference in a fundamental aspect of the algorithm, I felt that it was not reasonable to try to make one algorithm that worked for both. This is particularly true because the simulation clusterer has already been tested thoroughly and added to the steering files, so changing it drastically now would risk breaking the Monte Carlo simulation.
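
As a rough illustration of the two buffering strategies, here is a minimal sketch. It is not the actual hps-java code: SketchHit, TimeBufferSketch, and both method names are hypothetical, and reusing the 2 ns slot width for the readout case is an assumption made only so the two buffers are comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a calorimeter hit; not the hps-java hit class.
final class SketchHit {
    final double time;   // hit time stamp in ns
    final double energy; // hit energy in GeV
    SketchHit(double time, double energy) {
        this.time = time;
        this.energy = energy;
    }
}

final class TimeBufferSketch {
    // Width of one buffer slot. 2 ns matches the Monte Carlo event length;
    // using the same width for readout data is an assumption of this sketch.
    static final double WINDOW_NS = 2.0;

    // Monte Carlo style: every event already is one 2 ns slot,
    // so the buffer is simply the events in the order they arrived.
    static List<List<SketchHit>> bufferFromEvents(List<List<SketchHit>> events) {
        List<List<SketchHit>> buffer = new ArrayList<>();
        for (List<SketchHit> event : events) {
            buffer.add(new ArrayList<>(event));
        }
        return buffer;
    }

    // Readout style: a single event holds many hits with no fixed duration,
    // so the slots are rebuilt from each hit's own time stamp.
    static TreeMap<Long, List<SketchHit>> bufferFromTimestamps(List<SketchHit> eventHits) {
        TreeMap<Long, List<SketchHit>> buffer = new TreeMap<>();
        for (SketchHit hit : eventHits) {
            long slot = (long) Math.floor(hit.time / WINDOW_NS);
            buffer.computeIfAbsent(slot, k -> new ArrayList<>()).add(hit);
        }
        return buffer;
    }

    public static void main(String[] args) {
        List<SketchHit> hits = Arrays.asList(
                new SketchHit(0.5, 1.0), new SketchHit(1.9, 0.4), new SketchHit(4.1, 0.7));
        for (Map.Entry<Long, List<SketchHit>> e : bufferFromTimestamps(hits).entrySet()) {
            System.out.println("slot " + e.getKey() + ": " + e.getValue().size() + " hit(s)");
        }
    }
}
```

The point is only that the Monte Carlo buffer is indexed by event, while the readout buffer has to be indexed by hit time within one event.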


It might be better, when the online algorithm is finished, to rename them something like "GTPMonteCarloEcalClusterer" and "GTPReadoutEcalClusterer" since these more accurately represent their function, but I was holding off on renaming them until the online algorithm is working. Currently, it cannot be completed because it crashes when building clusters: "addHit" in HPSEcalCluster uses "getRawEnergy," and as we have been discussing on the mailing list, that is a problem. Once this issue is resolved, the algorithm will be completed and tested. At that point I will also see whether I can abstract the two drivers to cut down on repeated code. I did this already for the trigger drivers, but it is trickier for the clustering.
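
One possible shape for that abstraction, as a sketch only: a shared base class in which building the time buffer is the single step the subclasses override. All class and method names here are hypothetical, the seed-threshold check is placeholder logic rather than the hardware algorithm, and readout hit times are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: everything except time-buffer construction is shared.
abstract class GtpSketchBase<I> {
    private final double seedThreshold; // GeV; placeholder cut, not a real GTP setting

    GtpSketchBase(double seedThreshold) {
        this.seedThreshold = seedThreshold;
    }

    // The only step that differs between Monte Carlo and readout input.
    // Returns hit energies grouped into time-ordered slots.
    protected abstract List<List<Double>> buildTimeBuffer(I input);

    // Shared step. Placeholder logic: return the indices of slots that
    // contain at least one hit at or above the seed threshold.
    final List<Integer> findSeedSlots(I input) {
        List<List<Double>> buffer = buildTimeBuffer(input);
        List<Integer> seeds = new ArrayList<>();
        for (int i = 0; i < buffer.size(); i++) {
            for (double energy : buffer.get(i)) {
                if (energy >= seedThreshold) {
                    seeds.add(i);
                    break;
                }
            }
        }
        return seeds;
    }
}

// Monte Carlo flavour: the input is already one list of energies per event slot.
final class MonteCarloGtpSketch extends GtpSketchBase<List<List<Double>>> {
    MonteCarloGtpSketch(double seedThreshold) {
        super(seedThreshold);
    }

    @Override
    protected List<List<Double>> buildTimeBuffer(List<List<Double>> events) {
        return events;
    }
}

// Readout flavour: the input is {time in ns, energy in GeV} pairs from one
// event, binned into slots by time stamp. Times are assumed non-negative.
final class ReadoutGtpSketch extends GtpSketchBase<List<double[]>> {
    private final double slotNs;

    ReadoutGtpSketch(double seedThreshold, double slotNs) {
        super(seedThreshold);
        this.slotNs = slotNs;
    }

    @Override
    protected List<List<Double>> buildTimeBuffer(List<double[]> hits) {
        int slots = 0;
        for (double[] hit : hits) {
            slots = Math.max(slots, (int) Math.floor(hit[0] / slotNs) + 1);
        }
        List<List<Double>> buffer = new ArrayList<>();
        for (int i = 0; i < slots; i++) {
            buffer.add(new ArrayList<>());
        }
        for (double[] hit : hits) {
            buffer.get((int) Math.floor(hit[0] / slotNs)).add(hit[1]);
        }
        return buffer;
    }
}
```

With this layout the tested Monte Carlo path would stay untouched apart from moving shared code upward, which is the main risk to weigh before doing it.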


CTPEcalClusterer is the old clustering algorithm from the last run. I believe it is retained largely for legacy and reference purposes. I do not know if it is reasonable to keep. Perhaps it should be moved to a "test-run" package so that it doesn't clutter up the active code?

All of the "IC" clustering codes are Holly's and she would be able to explain them better than I would.


I do agree that it would be most reasonable to have one cluster object if that is possible, but I am not highly familiar with the regular HPSEcalCluster and only loosely familiar with Holly's version. Perhaps she could offer more insight into whether this is possible?


Let me know if I can help with anything,

Kyle


On Tue, Dec 16, 2014 at 3:28 PM, McCormick, Jeremy I. <[log in to unmask]> wrote:

	Hi,
	
	I was looking at cleaning up the ECAL clustering code with some changes to packages etc.  Right now it is a bit of a mess, because there is quite a lot of code duplication between algorithms, as well as Drivers that are all doing the same thing (setting basic collection arguments, setting common cuts, etc.)
	
	For more details, see this JIRA item where I have outlined a proposal to clean this up and do a heavy restructuring of the existing code.
	
	https://jira.slac.stanford.edu/browse/HPSJAVA-363
	
	I see in ecal.recon these clustering Drivers...
	
	CTPEcalClusterer
	EcalClusterIC
	EcalClusterICBasic
	GTPEcalClusterer
	GTPOnlineClusterer
	HPSEcalCluster
	
	Could we get a brief description of each clustering Driver for some basic documentation that I can work from to try and do this?  This can go on the JIRA page.
	
	I would also like some information about the different types of cuts these use, a brief description of how each algorithm works, etc.
	
	It is also not clear to me that we need or want so many different clustering engines in our recon.  Holly suggests discussing this in detail so we can identify common algorithms, and I agree with this.
	
	Then there are now two types of clusters implemented...
	
	HPSEcalCluster
	HPSEcalClusterIC
	
	I think we should be working from one cluster class, not two.  So I would propose merging them unless there is some technical reason not to do this.
	
	Long term, I'd like to move everything to the new ecal.cluster sub-package and abandon/deprecate/remove the existing Drivers.  (I also have a few cosmic clustering Drivers that I will move to ecal.cluster too.)
	
	If you need to make immediate changes (this week) to clustering code for the reconstruction to work, please just modify/fix the classes in ecal.recon for now.  I am very aware that we must not break anything with the current data taking and recon steering files, so I am not modifying any of the existing Drivers in place.  Meanwhile, I'm working on making a sub-package where things can be reimplemented in a more structured way, including pulling out the core algorithms from the actual Driver classes.  As we verify each of the clustering algorithms with tests, we can move to the re-implementation class in the sub-package and then abandon the old Driver.
	
	Please send any concerns/comments to hps-software or write them on the JIRA item.
	
	Thanks.
	
	--Jeremy
	


########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the HPS-SOFTWARE list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1