LISTSERV 16.5 - LCDET-SVN Archives

--- docs/pubs/0001-lcdd/lcdd-paper.tex	2014-07-03 18:15:42 UTC (rev 3170)
+++ docs/pubs/0001-lcdd/lcdd-paper.tex	2014-07-03 23:47:24 UTC (rev 3171)
@@ -1,3 +1,9 @@

+%% TODO List
+%% -section about the calorimeter and tracker hit classes
+%% -more details about algorithms used for segmentation
+%% -write future plans section at end including integration of DDSegmentation
+%% -section on magnetic fields needs work including more details about each type
+

 %%
 %% Copyright 2007, 2008, 2009 Elsevier Ltd
 %%

@@ -116,7 +122,7 @@

 %% \[log in to unmask]
 
 \begin{abstract}

-Geant4 is a powerful software framework for simulating the interactions of particles with matter and fields. It has become the de facto standard for detector simulation in high energy physics (HEP) and is increasingly being applied in other disciplines, such as medical physics and applications in the aerospace industry.  It is designed as a toolkit rather than a pre-packaged executable.  An application must be assembled based upon the requirements of the project, requiring a considerable amount of expertise, both in the details of configuring the framework, and in the C++ language in which it is written. Providing a flexible application based on Geant4 which can meet the needs of many different users can alleviate these technical barriers. This approach requires that the simulation parameters be defined at runtime rather than embedded into custom source code. Ideally, in an application of this type the complete detector description should be defined by a data format rather than a set o[...]

+Geant4 is a powerful software framework for simulating the interactions of particles with matter and fields. It has become the de facto standard for detector simulation in high energy physics (HEP) and is increasingly being applied in other disciplines, such as medical physics and applications in the aerospace industry.  It is designed as a toolkit rather than a pre-packaged executable.  An application must be assembled based upon the requirements of the project, requiring a considerable amount of expertise, both in the details of configuring the framework, and in the C++ language in which it is written. Providing a flexible application based on Geant4 which can meet the needs of many different users can alleviate these technical barriers. Ideally, the complete detector description is provided by a data format rather than a set of compiled classes. The Geometry Description Markup Language (GDML) provides an XML language with bindings to the core geometry classes of the Geant4 toolkit. [...]

 \end{abstract}
 
 %% \begin{keyword}

@@ -140,47 +146,48 @@

 %% still need to know the Geant4 physics, e.g physics lists, regions, step size...
 %%

-Geant4 is an application framework that has become the primary tool used in HEP for the simulation of particle interactions in matter and fields.  It is distributed as a set of source files and examples with compilation instructions.  Geant4 is an Object Oriented toolkit which is used to assemble a domain-specific application based on experimental requirements.  The most complex requirement is usually the modeling of the geometry and detectors, which for complex setups can comprise hundreds, or even thousands of lines of custom code.  Typically, the user must implement their own geometry structure and configure all the other supplementary components for their particular simulation.  This task can be daunting, as it requires a considerable level of expertise not only in the toolkit itself, but in the details of C++ syntax.

+Geant4 is a primary software framework in HEP, used to simulate the interactions of particles with matter and fields.  Distributed as a set of source files and examples, the toolkit is used to assemble a domain-specific application based on experimental requirements.  Typically, the most complex task is modeling the geometry and detectors, which, for complex setups, may take hundreds or even thousands of lines of code.  This task can be daunting, as it requires a considerable level of expertise in the Geant4 APIs and the C++ language.

-When geometry is defined in an application by coding directly against an application programming interface (API) the size of the code base tends to increase greatly over time as more detector models and variants are added.  Essentially, each detector variation tends to require its own set of classes.  This can lead to severe maintenance issues in the application, including a great amount of code duplication between detector models; the treatment of specific geometries as a ''black box'' with no real external data description; and a confusion and lack of separation between the domains of procedural code and the data upon which it operates.  Some physics simulation programs use their own custom-defined data input formats for detector descriptions to alleviate some of this complexity.  But the lack of standardization in this area has hindered data interchangeability between different tools and requires learning new formats for each application.

+When geometry is defined in code by directly programming against an API, the size of the code base tends to increase over time as more detectors are added.  Each detector variant tends to require its own set of custom classes.  The ``pure code'' approach can lead to severe maintenance issues: a great amount of code duplication between detector models; the treatment of specific geometries as a ``black box'' with no real accompanying data description; and a confusion and lack of separation between the procedural code and data.

-Providing a comprehensive and flexible solution to these problems has been the goal of the Linear Collider Detector Description (LCDD) project. This framework was first introduced to simulate detector designs and their variants for the International Linear Collider (ILC).  It is now being used successfully by several other experiments.  By providing a clear separation between code and detector description, researchers are freed from needing to know the complex details of the Geant4 APIs.  They may instead focus on defining the simulation inputs for their particular experiment such as the detector geometry.

+Providing a comprehensive and flexible solution to these problems has been the goal of the Linear Collider Detector Description (LCDD) project. This framework was first introduced to simulate detector models for International Linear Collider (ILC) design studies and is now being used successfully by several other experiments.  By providing a clear separation between code and detector description, researchers are freed from needing to know the complex details of the Geant4 APIs.  They may instead focus on defining the detector setup for their particular experiment in a structured data language.

-This paper will provide an overview of the LCDD language and framework. The LCDD extensions to GDML will be explained and described, with an example showing the full document structure.  Each primary XML element type will be explained in detail along with an example of its usage.  A solution will be given for authoring detector documents using another high-level format or ''compact description.''  Examples will be given of projects that have used LCDD to model their experiments.  Finally, future plans will be briefly discussed.

+This paper will provide an overview of the LCDD language and framework. The LCDD extensions to GDML will be explained and described, with an example showing the full document structure.  Each primary XML element type will be explained in detail along with an example of its usage.  Examples will be given of projects that have used LCDD to model their experiments.  Finally, future plans will be briefly discussed.

 
 \section{GDML}

-% topic: GDML/geometry

+The Geometry Description Markup Language (GDML) is an XML format for geometry description, allowing users to define hierarchical, geometric structures using a data language.  GDML fully describes materials, mathematical variables and definitions, solids such as boxes and tubes, and a hierarchical structure of logical and physical volumes.

-%% how does it answer question? import/export/exchange
-%% geometry is only a part of the project; still need regions, physics limits, fields, visualization, etc.

+GDML's \textit{define} block contains expressions and definitions read into the CLHEP Expression Evaluator.  These expressions may contain double precision numbers, simple arithmetic operators (* / + -), trigonometric functions, and units.  The processor predefines a number of standard units for distance, weight, etc.  Important primary constants such as the speed of light are also predefined.  Rotations and positions that will be referenced later to create physical volumes may also be included in this section.

-The Geometry Description Markup Language (GDML) is an XML format for geometry description.  This format allows users to define hierarchical, geometry structures using a data language.  GDML fully describes materials, mathematical variables and definitions, geometric solids such as boxes and tubes, and a hierarchical structure of logical and physical volumes.
-%Materials are defined as either chemical elements or combinations thereof.  Simple constants can be defined, as well as equations that use trigonometric functions and basic mathematical operators.  The support for different types of geometric solids is extensive, including simple shapes such as boxes and tubes or more complex tesselated volumes formed from an arbitrary number of surfaces.  A logical volume is defined by a solid and a material and may optionally contain a tree of ''daughter'' physical volumes.
-GDML's \textit{define} block contains expressions and definitions read into the CLHEP Expression Evaluator.  These expressions may contain double precision numbers, simple arithmetic operators (* / + -), trigonometric functions, and units.  The processor predefines a number of standard units for distance, weight, etc.  Important primary constants such as the speed of light are also predefined.  Rotations and positions that will be referenced later to create physical volumes are also included in this area.

+The \textit{materials} section has \textit{material} and \textit{element} elements that bind to the G4Material and G4Element classes for materials and atomic elements.  Materials are defined by atomic or mass composition and density.  Material parameter sheets may be attached to provide pre-computed values for $dE/dx$ calculations and optical properties.

-The \textit{materials} section has \textit{material} and \textit{element} elements that bind to the G4Material and G4Element classes for materials and atomic elements.  Materials are defined by atomic or mass composition and density.  Material parameter sheets may be attached to provide pre-computed values for dEdx calculations and similar algorithms.  The materials defined here are referenced by \textit{volume} elements in the \textit{structure} area.

+The \textit{solids} block contains a collection of shape definitions that are used in the geometry.  Constructive Solid Geometry (CSG) solids are the most common type of shapes used in Geant4 geometries.  (Boundary Represented Solids (BREPs) are also available but are not supported by the GDML system.)  GDML has bindings to a large and nearly complete subset of the CSG solids defined by the Geant4 geometry subsystem, including tubes, boxes, trapezoids, tori, twisted tubes and boxes, polyhedra, and facetted shapes.  Boolean subtraction and addition can be used with these primitives to define arbitrarily complex geometries.

-The \textit{solids} block contains shape definitions that are also referenced by the GDML volumes.  Constructive Geometry Solids (CSG) are the most common type of shapes used in Geant4 geometries.  (Boundary Represented Solids (BREPS) are also available but are not supported by the GDML system.)  GDML has bindings to a large and nearly complete subset of the CSG solids defined by the Geant4 geometry subsystem, including tubes, boxes, trapezoids, tori, twisted tubes and boxes, polyhedra, and facetted shapes.  Boolean subtraction and addition can be used with these primitives to define arbitrarily complex geometries.
-

 The \textit{structure} block contains a nested hierarchy of geometric volumes.  A volume is composed of a shape plus its material and may contain any number of sub-volumes, defined with the \textit{physvol} tag.  The volumes in the \textit{physvol} elements are called ``child'' volumes of their ``parent'' volume.  The child volumes must contain a reference to a logical volume, plus an in-lined or referenced position and rotation.  The top-level volume or ``world volume'', typically a large box containing the detector envelope volume, is defined in the \textit{setup} block using the \textit{world} element.
 
 A hierarchy of volumes thus defines the complete detector structure. Originally developed as a standalone application, GDML has become part of the Geant4 source distribution. It therefore serves as an ideal starting point for a complete detector description language.
 
 \section{LCDD}

-In addition to the geometric layout of an experiment, additional information is required to fully describe a valid detector setup at runtime.  This complete set of data is usually called ''detector description.''  Frameworks that use a data language such as GDML for geometry description have generally still required additional, auxiliary information at runtime, for example, through macro commands that define readouts and assign them to volumes.  There are inherent problems and limitations to this approach.  The supplementary information to the geometry is not easily accessible if it is embedded in relatively unstructured, procedural macro files.  Using ad hoc runtime commands can also make it difficult to determine later which detector simulation parameters were used to produce an output file or what readout parameters should be associated to a particular detector component.

+In addition to the geometric layout of an experiment, additional information is required to fully describe a valid detector setup at runtime.  This complete set of data is usually called ``detector description.''  Frameworks that use a data language such as GDML for geometry description generally still require additional information.  This might be provided by macro commands that define readouts and assign them to volumes.  There are inherent problems and limitations to this approach.  The supplementary information is not easily accessible, as it is embedded in relatively unstructured macro files.  Using runtime commands to provide detector information can make it difficult to determine later from an external environment which detector simulation parameters were used to produce an output file, or what readout parameters should be associated to a particular detector component.

-A more complete approach is required to guarantee the consistency and integrity of the detector data.  LCDD was designed to provide a complete description of complex experimental setups.  Various types of detectors, ranging from simple test beams to complex HEP detectors, can be modeled to an arbitrary level of detail using an XML file rather than detector-specific C++ code.  LCDD is built upon the GDML data format and C++ parser.  It extends GDML's data format by using built-in facilities of the XML Schema (XSD) language.  The GDML code infrastructure is reused by registering additional element handlers with GDML's flexible parser class.  The extended parser, without any alteration, can also read in plain GDML files, as well as LCDD, so a file with \textit{gdml} as the top element is considered valid within the LCDD processing framework. .

+A different approach is required to guarantee the consistency and integrity of the detector data.  LCDD was designed to provide a complete description of complex experimental setups.  Various types of detectors, ranging from simple test beams to complex HEP detectors, can be modeled to an arbitrary level of detail using an XML file rather than macros or detector-specific C++ code.  LCDD is built upon the GDML data format and C++ parser.  It extends GDML's data format by using built-in facilities of the XML Schema (XSD) language.  The GDML infrastructure is reused by registering additional element handlers with its flexible parser class.  The extended parser, without any alteration, can read in plain GDML files as well as LCDD.

-LCDD uses GDML to define the core geometric information about the experiment.  The LCDD schema formally extends GDML using the \textit{extension} element of XSD, and the \textit{gdml} root node is embedded as part of the document.  The GDML language is essentially left intact, with the main point of extension being the \textit{volume} element, which may contain optional references to additional LCDD elements defined outside of the GDML section.  The volume can be associated with detector readouts, visualization parameters, a region, physics limits, and other supplementary information, to provide a complete description of the detector to the simulation engine at runtime.

+LCDD uses GDML to provide the geometric information for the simulation.  The LCDD schema formally extends GDML using the \textit{extension} element of XSD, and the \textit{gdml} root node is embedded as part of the document.  The GDML language is essentially left intact, with the main point of extension being the \textit{volume} element, which may contain optional references to additional LCDD elements defined outside of the GDML section.  The volume can be associated with detector readouts, visualization parameters, a region, physics limits, and other supplementary information, to provide a complete description of the detector to the simulation engine.

 
 \subsection{Document Structure}

-Every LCDD document has the same basic structure.  The top-level \textit{lcdd} element has a list of sections, one of which includes an entire embedded GDML document.  Aside from the header, which has no child elements, the other sections each contain a list of elements with a specific type.  The elements can be referenced from the GDML to associate the supplementary information with specific logical volumes defined by the geometry.  This is done by making a language extension to the GDML volume element.

+The LCDD parser checks the input for correctness at runtime against an XML Schema, which is located at a standard URL and can be accessed over the internet via the \textit{http} protocol.

-Unlike some markup languages, such as HTML, where elements can be referenced regardless of order, GDML and LCDD support in-order references only.  An element must have already been defined to be referenced.  For this reason, the ordering of the top-level container elements in GDML and LCDD files is important and must conform to the order specified in their respective schemas.  For instance, a box solid must be defined before it can be used as the shape for a volume.  The primary benefit of this approach is reduction of memory consumption during the processing phase. The following snippet of pseudo-XML outlines the top-level structure of an LCDD file, including the embedded GDML element.

+\begin{verbatim}
+http://www.lcsim.org/schemas/lcdd/1.0/lcdd.xsd
+\end{verbatim}

+Every LCDD file has the same basic structure.  The top-level \textit{lcdd} element has a list of sections, one of which is an entire GDML geometry.  Most of the other sections have a list of elements with a specific type for that container.  The elements are referenced from the GDML in order to associate this additional detector description information to specific logical or physical volumes.  This is achieved by an extension to the GDML volume element within LCDD's XML schema.
+
+The following snippet of pseudo-XML outlines the top-level structure of an LCDD file, including the embedded GDML element.
+

 \begin{verbatim}
 <lcdd>
     <header/>

@@ -200,31 +207,29 @@

 </lcdd>
 \end{verbatim}

-The header has basic meta data about the document, such as who authored it.  The \textit{iddict} contains identifier dictionaries that provide encodings for information that can be written into hit objects at runtime, including layer numbers and detector IDs.  The \textit{sensitive\_detectors} element defines Geant4 ''sensitive detectors'' that are assigned via reference to volumes.  This causes hits to be accumulated for that detector by event, containing energy, position and time information.  The \textit{limits} are sets of physics limits which effect certain parameters in the Geant4 physics engine, such as the maximum step size of a track.  The \textit{display} element contains visualization information that can be used to assign colors and visibility settings to logical volumes.  The \textit{gdml} tag defines a GDML document, which must follow that format's syntax, but may include the optional LCDD extension elements.  Finally, the \textit{fields} element contains definitions of m[...]

+The header contains basic meta data about the document, such as a name that can be used as an external tag or ID of the detector.  The \textit{iddict} is a collection of identifier dictionaries that provide encodings for packed, 64-bit IDs that may contain information such as a layer number or a detector ID.  The \textit{sensitive\_detectors} element defines Geant4 ``sensitive detectors'' that are assigned via reference to volumes, flagging them as readout components.  This causes hits in that detector's volumes to be accumulated by event.  The \textit{limits} are sets of physics limits which affect certain parameters in the physics engine, such as the maximum step size of a track.  The \textit{display} element contains visualization settings that can be used to assign colors and visibility parameters to logical volumes.  The \textit{gdml} tag is an entire, embedded GDML document, which must conform to that format's XML syntax.  It may include additional elements that are defined in t[...]

-The input document is checked by the parser for correctness at runtime against an XML Schema, which is located at a standard URL and can be accessed over the internet via the \textit{http} protocol.  The parser is fault tolerant, in that minor errors, such as mis-ordering of child elements, may only result in warning messages.  Other, more severe errors within a document, such as references to non-existent element IDs, will generally result in fatal exceptions that cause the application to exit.
-

 \section{Header Element}

-Every LCDD document begins with a header that provides basic metadata about the file and the detector.

+Every LCDD document begins with a header that provides basic meta data about the file and the detector.

 
 \begin{verbatim}
 <header>
     <detector name="sidloi3"/>
     <generator name="GeomConverter" version="1.0"
         file="./detectors/sidloi3/compact.xml" checksum="2152839912"/>

-    <author name=“Jeremy McCormick” email=“[log in to unmask]">

+    <author name="Jeremy McCormick" email="[log in to unmask]"/>

     <comment>The SiD detector</comment>
 </header>
 \end{verbatim}

-The detector element provides a name that can be used as a tag of the document to uniquely identify it.  For instance, the name can be written in the run headers of an output data format to identify which detector was used to generate the data file.  An author tag gives the names of the people who created the file, as well as an optional email contact.  The generator provides information about any external program that was used to generate the file, including a source file name, if applicable, a version of that program, and a checksum that could be created with an MD5 algorithm.  Finally, there is a free-form comment block that may be used to provide a description and notes about the detector.

+The \textit{detector} XML element provides a name that can be used as an external ID.  For instance, the name can be written into the run headers of output data files to identify which detector was used to generate them.  An author tag gives the names of the people who created the file, as well as an optional email contact.  The generator provides information about any external program that was used to generate the file, including a source file name, if applicable, a version of that program, and a checksum that could be created with an MD5 algorithm.  Finally, there is a free-form comment block that may be used to provide a description and notes about the detector.

 
 \section{Sensitive Detectors}

-A ''sensitive detector'', in Geant4 terminology, is assigned to a logical volume to indicate that it is a readout component which is capable of producing hits.  When a particle deposits energy into the volume, hit objects may be created that may later be written into an output file for analysis.  (Providing these data formats is not one of LCDD features.)  The sensitive detectors typically accumulate position, time and energy measurements from particle interactions within the material of a volume.  The hits are grouped into collections by their sensitive detector and are accumulated by event.

+A ``sensitive detector'', in Geant4 terminology, is assigned to a logical volume to indicate that it is a readout component which is capable of producing hits.  When a particle deposits energy into that volume, hits are created that may later be written into an output file for analysis.  (Providing these data formats is not one of LCDD's features.)  The sensitive detectors typically accumulate position, time and energy measurements from particle interactions within the material of a volume.  The hits are grouped into collections by their sensitive detector and are accumulated by event.

-Two primary types of detectors are modeled by the framework.  Trackers store output from the simulation that corresponds closely to individual steps within a volume.  This information can be used later to reconstruct in detail the exact particle momentum at the hit location. Calorimeters are used for the accumulation of energy in cellular volumes, and typically have much less granular position information.  Calorimeters may be virtually segmented into cells using a child \textit{segmentation} element.  There is also a third type of detector called a \textit{scorer} which is essentially a simplified tracker.  It can be used to insert scoring planes to derive simple flux counts.

+Two primary types of detectors are modeled by the framework.  Trackers store output from the simulation that corresponds closely to individual steps within a volume.  This information can be used later to reconstruct in detail the exact particle momentum at the hit location. Calorimeters are used to record the accumulation of energy within cellular volumes, and typically have much less granular position information.  Their sensor volumes may be virtually segmented into cells using a child \textit{segmentation} element.  There is also a third type of detector called a \textit{scorer} which is essentially a simplified tracker.  It can be used to insert scoring planes to derive simple flux counts.

 
 These types extend a common element which defines basic detector settings.  The common settings for all \textit{sensitive\_detector} elements include the following.

@@ -238,54 +243,54 @@

 \hline
 \end{tabular}

-Each sensitive detector has a name that is used to uniquely identify it within the document.  This is used to associate logical volumes with a sensitive detector using the \textit{sdref} element.  There is a flag which indicates whether or not the detector is an end cap.  (This is primarily a concept that is relevant for HEP collider-detectors.)  An energy cut setting can be used to discard hits that do not reach a certain threshold energy.  There is a verbosity setting to control print screen output from the detector while the simulation is running.  The name of the detector is required, and the other settings are optional.

+Each sensitive detector has a name that is used to uniquely identify it within the document.  This is used to associate logical volumes with a particular sensitive detector using the \textit{sdref} element.  There is a flag which indicates whether or not the detector is an end cap.  (This is primarily a concept that is relevant for HEP collider-detectors.)  An energy cut setting can be used to discard low-energy hits that do not reach a certain threshold.  There is a verbosity setting to control print screen output from the detector while the simulation is running.  The name of the detector is required, and the other settings are optional.

-The detectors have associated hits collections that contain objects which are implementations of the virtual hit class within Geant4.  There is no output data binding provided by LCDD itself to persist this information.  It is assumed that applications which include LCDD as a dependency will translate from these hit objects into a desired output format such as LCIO.

+Each sensitive detector has an associated hits collection that contains objects which are implementations of Geant4's virtual hit class.  There is no output data binding provided by LCDD itself to persist this information.  It is assumed that applications which include LCDD as a dependency will translate from these hit objects into an output format such as LCIO \cite{lcio}.

 
 \subsection{Trackers}

-Trackers record information from each step of a simulated track as it propagates through a volume.  The stored information includes the mid-point position, direction, length, global time in nanoseconds when the step occurred, and the energy deposited along the step length.  A \textit{TrackerHit} object is created for each step and stored into a hit collection.  The Tracker is most commonly used to model high-granularity detectors, such as those with pixels or silicon strips.  Advanced algorithms for digitizing the hits within the simulation are not provided, as it is assumed this would be done later in a reconstruction environment.

+Trackers record information from each step of a simulated track as it propagates through a volume.  The stored information includes the mid-point position, direction of the track at that point, length of the step, global time, and the energy deposited.  A \textit{TrackerHit} object is created for each step and stored into a hit collection.  The Tracker is most commonly used to model high-granularity readouts, such as pixels or silicon strips.  Algorithms for digitizing the hits within the simulation are not provided, as it is assumed this would be done later in a reconstruction environment using the stored hit collections.

-This is an example XML snippet for a simple tracking detector, similar to what might be defined for an ILC full detector concept:

+This is an example XML snippet for a simple tracking detector:

 
 \begin{verbatim}

-<tracker name=“SiTrackerBarrel” hits_collection=“SiTrackerBarrelHits">

+<tracker name="SiTrackerBarrel" hits_collection="SiTrackerBarrelHits">

     <idspecref ref="SiTrackerBarrelHits"/>
 </tracker>
 \end{verbatim}

-Essentially the tracker as implemented is a simple detector that writes records of the individual steps, which can later be used to more fully simulate the detector response.

+Essentially the tracker is a relatively simple implementation of a sensitive detector that writes records of the individual steps.  This information can then be used later outside the simulation as input to more sophisticated digitization algorithms.

 
 \subsection{Scorer}

-The Scorer type is the simplest of the three sensitive detector implementations.  It records the passage of particles through a volume.  The main difference between the Tracker and the Scorer is that the latter will only record one hit for each unique G4Track that passes through it, whereas the Tracker class records all separate steps as individual hits.  The scorer provides a way to determine if a given track passed through the volume but it does not provide any information about the energy of that particle.

+The Scorer type is the simplest of the three sensitive detector implementations.  It records the passage of particles through a volume.  The main difference between the Tracker and the Scorer is that the latter will only record one hit for each unique G4Track that passes through it, whereas the Tracker class records all separate steps as individual hits.  The scorer simply provides a way to determine if a given track passed through the volume.
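The difference between the two hit-recording policies can be sketched in a few lines. This is only an illustration, assuming each step is reduced to a (track ID, energy deposit) pair; the function names are hypothetical and are not the LCDD classes.

```python
def tracker_hits(steps):
    """Tracker policy: every step becomes its own hit."""
    return [{"track_id": track_id, "edep": edep} for track_id, edep in steps]


def scorer_hits(steps):
    """Scorer policy: at most one hit per unique track ID."""
    seen = set()
    hits = []
    for track_id, _edep in steps:
        if track_id not in seen:
            seen.add(track_id)
            hits.append({"track_id": track_id})
    return hits


# Hypothetical input: track 7 takes three steps in the volume, track 9 takes one.
example_steps = [(7, 0.10), (7, 0.20), (9, 0.05), (7, 0.30)]
```

With this input the tracker policy yields four hits and the scorer policy yields two, one for each track.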

 
 \subsection{Calorimeter}

-The Calorimeter detector is used to record the energy depositions of particle showers.  Typically these showers are composed of many individual steps, the details of which are usually not of interest.  The energy depositions from all of the steps are accumulated to determine the total energy deposited in the volume.  The energy may be accumulated in an entire volume, such as a physical crystal or may be split across arrays of virtual cells.  When the volumes are artificially segmented, there is generally one hit object created per virtual cell for the entire event.

+The calorimeter is used to record the energy depositions of particle showers in homogeneous or sampling detectors.  Typically these showers are composed of many individual steps.  The energy depositions from all of these steps are accumulated to determine the total energy deposited in the volume.  The energy depositions may be summed for an entire volume, such as a physical crystal, or accumulated in an array of virtual cells, similar to entries in histogram bins.  When the volumes are artificially segmented in this way, there is generally one calorimeter hit object created per virtual cell for the entire event.  Hit contributions are saved for each

-The following XML defines a calorimeter with uniform sized cells created by a virtual segmentation class.

+The following XML defines a calorimeter with uniformly sized cells created by a virtual segmentation class.  The \textit{grid\_xyz} element will divide the detector's sensor layers into a grid of cells with size 3.5 mm x 3.5 mm.

 
 \begin{verbatim}

-<calorimeter name="EcalBarrel" hits_collection="EcalBarrelHits">

+<calorimeter name="EcalBarrel" hits_collection="EcalBarrelHits">

     <idspecref ref="EcalBarrelHits"/>

-    <grid_xyz grid_size_x="3.5*mm" grid_size_y="3.5*mm" grid_size_z="0.0"/>

+    <grid_xyz grid_size_x="3.5*mm" grid_size_y="3.5*mm" grid_size_z="0.0"/>

 </calorimeter>
 \end{verbatim}

-The \textit{grid\_xyz} element will divide the detector's sensor layers into a grid of cells with size 3.5mm x 3.5mm.

+The calorimeter hits have a list of individual energy contributions that correspond to single steps in the simulation, recording the PDG code of the particle and the positional information about that step, e.g. its endpoints.  Depending on the application's requirements, this low-level information within the calorimeter hit may or may not be saved into the output data.  It might be useful if the readout response depends on the distance of an energy deposition from the edge of the cell, etc.
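The accumulation described above can be illustrated with a small sketch. It assumes each step has already been mapped to a cell key by the segmentation, and it uses plain dictionaries in place of the real hit classes; all names here are hypothetical.

```python
def accumulate_calorimeter_hits(steps):
    """Sum step energies into one hit per cell and keep per-step contributions.

    Each step is a (cell_key, energy_deposit, pdg_code) tuple. The cell key
    stands in for whatever the segmentation returns at the step position.
    """
    hits = {}
    for cell_key, edep, pdg in steps:
        hit = hits.setdefault(cell_key, {"energy": 0.0, "contributions": []})
        hit["energy"] += edep
        # Low-level record for this step; an application could choose to drop it.
        hit["contributions"].append({"pdg": pdg, "edep": edep})
    return hits


# Hypothetical event: three steps, two of them in the same virtual cell.
example_steps = [((0, 1), 0.5, 11), ((0, 1), 0.25, 22), ((2, 3), 1.0, 11)]
example_hits = accumulate_calorimeter_hits(example_steps)
```

Here two hits are produced for three steps, and the hit for cell (0, 1) carries two contributions.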

 
 \section{Segmentation}

-Sensitive volumes in a calorimeter detector usually require virtual subdivision in order that energy depositions can be accumulated into cells.  This concept of dividing geometric volumes is modeled by specific concrete implementations of the \textit{segmentation} XML element defined in the schema.  This algorithmic approach to segmented readout has some advantages over using purely geometric information.  Modeling millions of individual cell volumes could be prohibitive in terms of memory usage. There are also cases in which describing a readout system with an algorithm rather than geometry is more simple, such as in projective towers where there are many different shapes and sizes of cells which would be complicated to model using only solids and volumes.

+%% TODO: Include images that show how each segmentation works, e.g. projective, grid, etc.  This can include the specific numbering about the origin as in grid_xyz's scheme.

-Concrete segmentation types extend a basic abstract element, which has no attributes.  The names of the parameters which define the dimensions of the cells are specific to a certain type of segmentation.  The segmentation element occurs as a child of the calorimeter element.  Each calorimeter may have only one of these associated objects.  The values of the fields from the segmentation at a certain hit position can be written into the identifiers of the hits by referencing their names.

+Sensitive volumes in a calorimeter, such as planar layers, usually require virtual subdivision.  The \textit{segmentation} XML element models this division of geometric volumes into cells.  The algorithmic approach to segmented readout has some advantages over using purely geometric constructs.  Simulating millions of individual cell volumes could be expensive in terms of memory usage.  There are also certain cases in which it is simpler to describe a readout system with an algorithm rather than pure geometry.  For instance, projective calorimeter towers have many different shapes and sizes of cells depending on the radial distance from the origin, and these would be complicated to construct using only solids and volumes.

-%% TODO: Include images that show how each segmentation works, e.g. projective, grid, etc.  This can include the specific numbering about the origin as in grid_xyz's scheme.

+Concrete segmentation types extend an abstract XML element.  The parameters which define the dimensions of the cells are specific to a certain type.  The \textit{segmentation} element occurs as a child of the calorimeter element.  Each calorimeter may have only one of these associated objects.  The values of the fields from the segmentation at a certain hit position can be written into the identifiers of the hits by referencing their names in the identifier specification.

 
 \subsection{Grid XYZ Segmentation}

-The \textit{grid\_xyz} segmentation divides a volume along its X, Y, or Z Cartesian axes, creating a regular grid of box-like cells in a planar volume.  The indices are signed int values that are numbered about the natural origin of the volume's solid at (x,y,z) = (0,0,0).  Because only the position with respect to the origin is used to obtain the index value at a particular point, no additional geometric information, such as the bounds of the current solid, is required by this segmentation.

+The \textit{grid\_xyz} segmentation divides a volume along its X, Y, or Z axes, creating a regular grid of box-like cells.  The indices are signed integer values, numbered about the natural origin of the volume's solid, which is generally its center.  Only the position with respect to the origin is used to compute the index value at a particular point in the segmentation grid, so no additional geometric information, such as the bounds of the current solid, is required by the algorithm.

 
 The following XML shows a \textit{grid\_xyz} segmentation that divides a volume along the X and Y axes.

@@ -293,11 +298,11 @@

 <grid_xyz grid_size_x="1.0*cm" grid_size_y="1.0*cm" />
 \end{verbatim}

-Any combination of X, Y and Z cell sizes may be provided.  The default value of zero results in that axis being unsegmented, so that the position is reported as zero, which translates to the center point of the volume on that axis.  This is typically used to divide a plane into a rectilinear grid of cells in two dimensions, say X and Y, with the third dimension, e.g. Z, left unsegmented.

+Any combination of X, Y and Z cell sizes may be provided.  The default value of zero results in that axis being left unsegmented, so that the position is reported as zero, which translates to the center point of the volume along that axis.  This segmentation is typically used to divide a plane into a rectilinear grid of cells in two dimensions, say X and Y, with the third dimension, e.g. Z, left unsegmented.
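A minimal sketch of this indexing follows. It assumes the index is the floor of the local position divided by the cell size, so that indices are signed and counted from the origin; the exact numbering convention used by the real implementation may differ, and the function names are hypothetical.

```python
import math


def grid_bin(position, cell_size):
    """Signed cell index along one axis; 0 when the axis is unsegmented."""
    if cell_size == 0.0:
        return 0
    return math.floor(position / cell_size)


def grid_xyz_bins(local_position, grid_size_x=0.0, grid_size_y=0.0, grid_size_z=0.0):
    """Return (ix, iy, iz) for a position given relative to the solid's origin."""
    sizes = (grid_size_x, grid_size_y, grid_size_z)
    return tuple(grid_bin(p, s) for p, s in zip(local_position, sizes))


# 10 mm cells in X and Y, with Z left unsegmented (all lengths in mm).
example_bins = grid_xyz_bins((12.0, -0.5, 40.0), grid_size_x=10.0, grid_size_y=10.0)
```

Note that only the local position and the cell sizes enter the calculation, which is why the bounds of the solid are not needed.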

 
 \subsection{Projective Cylinder Segmentation}

-The \textit{projective\_cylinder} segmentation divides cylinders into projective towers.  Unlike most other types of segmentations, this does not result in cells with uniform sizes.  The sizes of a given cell in a projective segmentation depends on its distance from the origin.  The \textit{nphi} parameters determines how many phi bins are created within the full $360\degree$ in azimuth.  Similarly, \textit{ntheta} specifies the number of theta bins, covering the $180\degree$ in polar angle.

+The \textit{projective\_cylinder} segmentation divides cylinders into projective towers.  Unlike most other types of segmentation, this does not result in cells with uniform sizes.  The size of a given cell in a projective segmentation depends on its distance from the origin.  The \textit{nphi} parameter determines how many phi bins are created within the full $360\degree$ in azimuth.  Similarly, \textit{ntheta} specifies the number of theta bins, covering the $180\degree$ in polar angle.

 
 This is an example of a projective cylinder segmentation that divides the theta and phi regions into 1000 and 2000 bins, respectively.

@@ -305,7 +310,7 @@

 <projective_cylinder ntheta="1000" nphi="2000" />
 \end{verbatim}

-This segmentation is typically only used in simplified geometries where the calorimeter barrel is modeled using a series of nested tubes, rather than more realistic modules that contain planar layers.

+This segmentation is typically used in geometries where the calorimeter barrel is modeled using a series of nested tubes, rather than more realistic modules that contain planar layers.
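The angular binning can be sketched as follows, assuming equal-width bins in theta over 180 degrees and in phi over 360 degrees, both counted from zero. The function names and the bin-numbering convention are assumptions made for illustration, not the actual LCDD code.

```python
import math


def projective_bins(x, y, z, ntheta, nphi):
    """Return (theta_bin, phi_bin) for a global position."""
    r_xy = math.hypot(x, y)
    theta = math.atan2(r_xy, z)              # polar angle in [0, pi]
    phi = math.atan2(y, x) % (2.0 * math.pi)  # azimuth in [0, 2*pi)
    theta_bin = min(int(theta / (math.pi / ntheta)), ntheta - 1)
    phi_bin = min(int(phi / (2.0 * math.pi / nphi)), nphi - 1)
    return theta_bin, phi_bin


def phi_cell_width(radius, nphi):
    """Arc length of one phi bin at a given cylindrical radius."""
    return 2.0 * math.pi * radius / nphi


# Coarse binning chosen only to keep the numbers easy to follow.
example_bins = projective_bins(1.0, 0.5, 1.0, ntheta=4, nphi=8)
```

The second helper makes the non-uniform cell size explicit: for a fixed number of phi bins, the width of a cell grows linearly with its radius.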

 
 \subsection{Non-projective Cylinder Segmentation}

@@ -319,7 +324,7 @@

 
 \subsection{Projective ZPlane Segmentation}

-The \textit{projective\_zplane} segmentation divides an endcap zplane into projective segments.

+The \textit{projective\_zplane} segmentation divides an endcap zplane into projective segments, much as a \textit{projective\_cylinder} is used for a barrel.

 
 \begin{verbatim}
 <projective_zplane ntheta="500" nphi="500" />

@@ -337,15 +342,15 @@

 
 \section{Hits Processors}

-Each detector has one or more \textit{hits\_processor} objects that process steps in the simulation and turn them into hits.  This allows a detector to handle different types of particles and physics processes differently.  For instance, an optical calorimeter could write separate hit collections for the scintillation and Cherenkov energy depositions by using two different hits processors.  The tracker and calorimeter detectors each have a default hits processor that uses the segmentation, sensitive detector, and identifier classes to construct hits.  It is anticipated that the hits processor could provide an extension point for future development of flexible algorithms that have more complex requirements.

+Each sensitive detector may have one or more \textit{hits\_processor} objects to process step information into hit objects, allowing flexibility in how different types of particles and physics are handled.  For instance, an optical calorimeter can write separate hit collections for the scintillation and Cherenkov energy depositions by using two different hits processors.  The tracker and calorimeter detectors each have a default hits processor that uses the segmentation, sensitive detector, and identifier classes to construct hits.  It is anticipated that the hits processor could provide an extension point for future development of flexible algorithms that have more complex requirements.

 
 \section{Identifiers}

-Identifiers define the format for 64-bit packed long numbers that are used to associate hits with their encoded geometric and detector information.  The values of the physical volume IDs may be written into these identifiers, and the values from the segmentation objects may also be used.  Each sensitive detector may have one of these identifier specifications associated with it.  It is used to construct a unique 64-bit ID from physical volume numbers, such as layer number, and segmentation values, like X and Y cell indices.  The user is ultimately responsible for making sure this combination of field values results in globally unique values.

+Identifiers define the formats for bit-packed numbers that may be used outside the framework to associate hits with their geometric and detector information.  The values from physical volume IDs or segmentation bins may be written into these IDs.  Each sensitive detector may have one identifier specification, which is used to create a 64-bit integer.  The user is ultimately responsible for making sure that the given combination of field values in this specification results in globally unique values for each hit.

-All of the identifier specifications are contained in an ID dictionary called the \textit{iddict}.  Each specification has a corresponding element called the \textit{idspec}.  The \textit{idspec} elements contain \textit{idfield} tags that define a single field within the identifier, such as a layer number or a segmentation field.  These fields can be from 1 to 32 bits and may be signed or unsigned.

+All of the identifier specifications are contained in a dictionary called the \textit{iddict}.  Each specification has a corresponding element called the \textit{idspec}.  The \textit{idspec} contains a list of \textit{idfield} tags, each of which defines a single field of the identifier, such as a layer number or a single field from the segmentation.  The individual fields can be from 1 to 32 bits and may be signed or unsigned.

-Below is an example of an identifier for an ILC ECal detector.

+Below is an example of an identifier definition for an ILC calorimeter.

 
 \begin{verbatim}
 <idspec name="EcalBarrelHits" length="64">

@@ -360,11 +365,11 @@

 </idspec>
 \end{verbatim}

-The first five fields of this identifier derive from volume identifier numbers.  The "x" and "y" fields are read from the segmentation values calculated at the hit position.  Together, these values identify a unique cell in the ECAL and can be subsequently decoded from the identifier's 64-bit int value during reconstruction and analysis.

+The first five fields of the above identifier derive from \textit{physvolid} values.  The ``x'' and ``y'' fields are read from the segmentation bins at the hit position.  Together, these values identify a unique cell and can be subsequently decoded within an external framework.
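The packing and decoding of such an identifier can be sketched as below. The field layout is hypothetical, since the field list of the example is not shown in this diff, and the two's-complement encoding of signed fields is likewise an assumption.

```python
# Hypothetical layout: (label, start bit, length in bits, signed).
EXAMPLE_FIELDS = [
    ("system", 0, 6, False),
    ("barrel", 6, 3, False),
    ("module", 9, 4, False),
    ("layer", 13, 6, False),
    ("slice", 19, 5, False),
    ("x", 32, 16, True),
    ("y", 48, 16, True),
]


def pack_id(fields, values):
    """Pack named field values into a single integer, field by field."""
    packed = 0
    for label, start, length, signed in fields:
        value = values[label]
        if signed:
            low, high = -(1 << (length - 1)), (1 << (length - 1)) - 1
        else:
            low, high = 0, (1 << length) - 1
        if not low <= value <= high:
            raise ValueError(f"{label}={value} does not fit in {length} bits")
        # Masking stores negative values in two's-complement form.
        packed |= (value & ((1 << length) - 1)) << start
    return packed


def unpack_id(fields, packed):
    """Recover the named field values from a packed identifier."""
    values = {}
    for label, start, length, signed in fields:
        raw = (packed >> start) & ((1 << length) - 1)
        if signed and raw >= (1 << (length - 1)):
            raw -= 1 << length
        values[label] = raw
    return values


example_values = {"system": 2, "barrel": 0, "module": 5, "layer": 17,
                  "slice": 3, "x": -3, "y": 42}
example_id = pack_id(EXAMPLE_FIELDS, example_values)
```

Decoding the packed value with the same field list returns the original values, which is what an external framework would do when reading the hits.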

 
 \section{Physics Limits}

-Physics limits can be assigned to volumes in order to control the low-level behavior of the simulation.  For example, the range cut which determines which secondary particles are produced can be increased to control the simulation time of electromagnetic showers by limiting the creation of many low-energy secondary particles.

+Physics limits can be assigned to volumes in order to tune certain parameters within the simulation engine.  For example, the range cut which determines which secondary particles are produced can be increased to limit the production of many low-energy secondary particles in divergent electromagnetic processes.

 
 The following example will restrict the step lengths to 5 mm.

@@ -380,7 +385,7 @@

 
 \section{Regions}

-Regions are assigned to geometric volumes using the \textit{region} element.  These are collections of volumes that share similar characteristics.  A flag specifies whether or not secondary particles produced in the simulation are stored into the output particle collections.  The following example defines a tracking region in which all secondary particles will be stored into the output.

+Regions are assigned to geometric volumes using the \textit{region} element.  These are collections of volumes that share similar characteristics.  A flag specifies whether or not secondary particles produced in the simulation are stored into the output particle collections.  The following example defines a region in which all secondary particles will be stored into the output.  This is typically called a ``tracking region.''

 
 \begin{verbatim}
 <regions>

@@ -393,11 +398,8 @@

 
 \section{Magnetic Field}

-% FIXME: Section needs work.
-% TODO: Add sections for other types of fields here.

+Realistic simulation of magnetic fields is typically an important part of the detector simulation.  There are currently four types of fields available.  When the field regions overlap, the B-field components are added to each other as an overlay.  The solenoid element has an inner and outer field value.  The following is an example of a solenoid with a 5 Tesla magnetic field oriented along the z axis.

-Realistic simulation of magnetic fields is typically an important part of the detector simulation.  There are currently three types of fields available.  When the field regions overlap, the B-field components are added to each other as an overlay.  The solenoid element has an inner and outer field value.  The following is an example of a solenoid with a 5 Tesla magnetic field oriented along the z axis.
-

 \begin{verbatim}
 <fields>
     <solenoid name="GlobalSolenoid" lunit="mm" funit="tesla"

@@ -436,7 +438,7 @@

 </volume>
 \end{verbatim}

-The LCDD objects named in this volume description are actually references to previously defined elements.  For example, the "EcalBarrel" sensitive detector is defined prior to the volume definition, and the parser will retrieve its definition from an in-memory data structure and assign the sensitive detector to the named volume.  A similar strategy is used for the other objects referenced by the extended volume element.

+The LCDD objects named in this volume description are actually references to previously defined elements.  For example, the ``EcalBarrel'' sensitive detector is defined prior to the volume definition, and the parser will retrieve its definition from an in-memory data structure and assign the sensitive detector to the named volume.  A similar strategy is used for the other objects referenced by the extended volume element.

 
 \section{Physical Volume IDs}

@@ -545,6 +547,8 @@

 
 \bibitem{geant4} Geant4 - A Simulation Toolkit, S. Agostinelli et al., Nuclear Instruments and Methods A 506 (2003) 250-303

+\bibitem{lcio} LCIO - A persistency framework for linear collider simulation studies, Computing in High Energy and Nuclear Physics, 24-28 March 200
+

 \end{thebibliography}
 
 \end{document}

[Note: Some over-long lines of diff output only partially shown]

Commit in `docs/pubs/0001-lcdd` on MAIN
`lcdd-paper.tex`	+63	-59	3170 -> 3171