docs/pubs/0001-lcdd

lcdd-paper.tex 3162 -> 3163

--- docs/pubs/0001-lcdd/lcdd-paper.tex	2014-07-01 01:07:35 UTC (rev 3162)
+++ docs/pubs/0001-lcdd/lcdd-paper.tex	2014-07-01 07:08:10 UTC (rev 3163)
@@ -116,7 +116,7 @@

 %% \[log in to unmask]
 
 \begin{abstract}

-Geant4 is a powerful software framework for simulating the interactions of particles with matter and fields. It has become the de facto standard for detector response simulations in high energy physics (HEP) and is increasingly being applied in other disciplines such as medical physics and applications in the aerospace industry. However, it is designed as a toolkit rather than a pre-packaged executable and is also large and complex.  End users must assemble an application based upon their individual requirements, requiring a considerable amount of expertise both in the details of configuring the framework and in the C++ language. Providing a flexible application based on Geant4 which can meet the needs of many different users can alleviate this technical hurdle and perhaps speed its adoption in other application domains. This approach requires that the simulation parameters be defined at runtime rather than embedded into custom source code. Ideally, in an application of this type the c[...]

+Geant4 is a powerful software framework for simulating the interactions of particles with matter and fields. It has become the de facto standard for detector simulation in high energy physics (HEP) and is increasingly being applied in other disciplines, such as medical physics and applications in the aerospace industry.  It is designed as a toolkit rather than a pre-packaged executable.  An application must be assembled based upon the requirements of the project, requiring a considerable amount of expertise, both in the details of configuring the framework, and in the C++ language in which it is written. Providing a flexible application based on Geant4 which can meet the needs of many different users can alleviate these technical barriers. This approach requires that the simulation parameters be defined at runtime rather than embedded into custom source code. Ideally, in an application of this type the complete detector description should be defined by a data format rather than a set o[...]

 \end{abstract}
 
 %% \begin{keyword}

@@ -140,13 +140,13 @@

 %% still need to know the Geant4 physics, e.g physics lists, regions, step size...
 %%

-Geant4 is an application framework that has become the primary tool used in HEP for the simulation of particle interactions in matter and fields.  It is distributed as a set of source files and examples with compilation instructions.  Geant4 is an Object Oriented toolkit which is used to assemble a domain-specific application based on experimental requirements.  The most complex and lengthy requirement is usually modeling the geometry and detectors, which for complex detector setups can comprise hundreds, or even thousands of lines of custom code.  Typically, the user must implement their own geometry structure and configure all the other necessary detector components for their particular simulation.  This task can be daunting, as it requires a considerable level of expertise not only in the toolkit itself, but in the details of C++ syntax and implementations.

+Geant4 is an application framework that has become the primary tool used in HEP for the simulation of particle interactions in matter and fields.  It is distributed as a set of source files and examples with compilation instructions.  Geant4 is an Object Oriented toolkit which is used to assemble a domain-specific application based on experimental requirements.  The most complex requirement is usually the modeling of the geometry and detectors, which for complex setups can comprise hundreds, or even thousands of lines of custom code.  Typically, the user must implement their own geometry structure and configure all the other supplementary components for their particular simulation.  This task can be daunting, as it requires a considerable level of expertise not only in the toolkit itself, but in the details of C++ syntax.

-Some physics simulation programs use their own custom-defined data input formats for detector descriptions to alleviate some of this complexity.  The lack of standardization in this area of detector description has hindered data interchangeability between different tools.  When geometry is defined in an application by coding directly against an application programming interface (API) the overall size of the simulation code base tends to increase greatly over time as more and more detector models and variants are added.  This can lead to severe maintenance issues in the application, including a great amount of code duplication between detector models; the treatment of specific geometries as a ''black box'' with no real external data description; and a confusion and lack of separation between the domains of procedural code and the data upon which it operates.

+When geometry is defined in an application by coding directly against an application programming interface (API) the size of the code base tends to increase greatly over time as more detector models and variants are added.  Essentially, each detector variation tends to require its own set of classes.  This can lead to severe maintenance issues in the application, including a great amount of code duplication between detector models; the treatment of specific geometries as a ``black box'' with no real external data description; and a confusion and lack of separation between the domains of procedural code and the data upon which it operates.  Some physics simulation programs use their own custom-defined data input formats for detector descriptions to alleviate some of this complexity.  But the lack of standardization in this area has hindered data interchangeability between different tools and requires learning new formats for each application.

-Providing a comprehensive solution to these problems has been the goal of the Linear Collider Detector Description (LCDD) project. This framework was first introduced as a solution for defining and simulating a myriad of detectors and their variants for the International Linear Collider (ILC) project, and it is now being used successfully by several other experiments.  By providing a clear and absolute separation between code and detector input data, researchers are freed from needing to know the complex details of the Geant4 APIs.  They may instead focus on defining the simulation inputs for their particular experiment.

+Providing a comprehensive and flexible solution to these problems has been the goal of the Linear Collider Detector Description (LCDD) project. This framework was first introduced to simulate detector designs and their variants for the International Linear Collider (ILC).  It is now being used successfully by several other experiments.  By providing a clear separation between code and detector description, researchers are freed from needing to know the complex details of the Geant4 APIs.  They may instead focus on defining the simulation inputs for their particular experiment, such as the detector geometry.

-This paper will provide an overview of the LCDD language and framework. The LCDD extensions to GDML will be explained and described, with an example showing the full document structure.  Each primary XML element type will be explained in detail, along with an example of its usage.  A solution will be given for authoring detector documents using another high-level format or ''compact description.''  Examples will be given of projects that have used LCDD to model their experiments.  Finally, future plans will be briefly discussed.

+This paper will provide an overview of the LCDD language and framework. The LCDD extensions to GDML will be explained and described, with an example showing the full document structure.  Each primary XML element type will be explained in detail along with an example of its usage.  A solution will be given for authoring detector documents using another high-level format or ''compact description.''  Examples will be given of projects that have used LCDD to model their experiments.  Finally, future plans will be briefly discussed.

 
 \section{GDML}

@@ -155,15 +155,15 @@

 %% how does it answer question? import/export/exchange
 %% geometry is only a part of the project; still need regions, physics limits, fields, visualization, etc.

-The Geometry Description Markup Language (GDML) is an XML language for geometry description.  This format allows users to define hierarchical geometry structures using a data language rather than C++ source code that is customized for every detector variation.  GDML fully describes materials, mathematical variables and definitions, geometric solids such as boxes and tubes, and a hierarchical structure of logical and physical volumes.

+The Geometry Description Markup Language (GDML) is an XML format for geometry description.  This format allows users to define hierarchical geometry structures using a data language.  GDML fully describes materials, mathematical variables and definitions, geometric solids such as boxes and tubes, and a hierarchical structure of logical and physical volumes.

 %Materials are defined as either chemical elements or combinations thereof.  Simple constants can be defined, as well as equations that use trigonometric functions and basic mathematical operators.  The support for different types of geometric solids is extensive, including simple shapes such as boxes and tubes or more complex tesselated volumes formed from an arbitrary number of surfaces.  A logical volume is defined by a solid and a material and may optionally contain a tree of ''daughter'' physical volumes.

-GDML’s \textit{define} block contains expressions and definitions read into the CLHEP Expression Evaluator.  These expressions may contain double precision numbers, simple arithmetic operators (* / + -), trigonometric functions, and units.  The processor predefines a number of standard units for distance, weight, etc.  Important primary constants such as the speed of light are also predefined.  Rotations and positions that will be referenced later to create physical volumes are also included in this area.

+GDML's \textit{define} block contains expressions and definitions read into the CLHEP Expression Evaluator.  These expressions may contain double precision numbers, simple arithmetic operators (* / + -), trigonometric functions, and units.  The processor predefines a number of standard units for distance, weight, etc.  Important primary constants such as the speed of light are also predefined.  Rotations and positions that will be referenced later to create physical volumes are also included in this area.

 
 The \textit{materials} section has \textit{material} and \textit{element} elements that bind to the G4Material and G4Element classes for materials and atomic elements.  Materials are defined by atomic or mass composition and density.  Material parameter sheets may be attached to provide pre-computed values for dEdx calculations and similar algorithms.  The materials defined here are referenced by \textit{volume} elements in the \textit{structure} area.

-The \textit{solids} block contains shape definitions that are also referenced by the GDML volumes.  Constructive Geometry Solids (CSG) are the most common type of shapes used in GEANT4 geometries.  (Boundary Represented Solids (BREPS) are also available but are not supported by the GDML system.)  GDML has bindings to a large and nearly complete subset of the CSG solids defined by the GEANT4 geometry subsystem, including tubes, boxes, trapezoids, tori, twisted tubes and boxes, polyhedra, and facetted shapes.  Boolean subtraction and addition can be used with these primitives to define arbitrarily complex geometries.

+The \textit{solids} block contains shape definitions that are also referenced by the GDML volumes.  Constructive Geometry Solids (CSG) are the most common type of shapes used in Geant4 geometries.  (Boundary Represented Solids (BREPS) are also available but are not supported by the GDML system.)  GDML has bindings to a large and nearly complete subset of the CSG solids defined by the Geant4 geometry subsystem, including tubes, boxes, trapezoids, tori, twisted tubes and boxes, polyhedra, and facetted shapes.  Boolean subtraction and addition can be used with these primitives to define arbitrarily complex geometries.

-The \textit{structure} block contains a nested hierarchy of geometric volumes.  A volume is composed of a shape plus its material and may contain any number of sub-volumes, defined with the \textit{physvol} tag.  The volumes in the \textit{physvol} elements are called “child” volumes of their “parent” volume.  The child volumes must contain a reference to a logical volume, plus an in-lined or referenced position and rotation.  The top-level volume or “world volume”, typically a large box containing the detector envelope volume, is defined in the \textit{setup} block using the \textit{world} element.

+The \textit{structure} block contains a nested hierarchy of geometric volumes.  A volume is composed of a shape plus its material and may contain any number of sub-volumes, defined with the \textit{physvol} tag.  The volumes in the \textit{physvol} elements are called ``child'' volumes of their ``parent'' volume.  The child volumes must contain a reference to a logical volume, plus an in-lined or referenced position and rotation.  The top-level volume or ``world volume'', typically a large box containing the detector envelope volume, is defined in the \textit{setup} block using the \textit{world} element.

 
 A hierarchy of volumes thus defines the complete detector structure. Originally developed as a standalone application, GDML has become part of the Geant4 source distribution. It therefore serves as an ideal starting point for a complete detector description language.

@@ -171,14 +171,16 @@

 
 In addition to the geometric layout of an experiment, additional information is required to fully describe a valid detector setup at runtime.  This complete set of data is usually called ''detector description.''  Frameworks that use a data language such as GDML for geometry description have generally still required additional, auxiliary information at runtime, for example, through macro commands that define readouts and assign them to volumes.  There are inherent problems and limitations to this approach.  The supplementary information to the geometry is not easily accessible if it is embedded in relatively unstructured, procedural macro files.  Using ad hoc runtime commands can also make it difficult to determine later which detector simulation parameters were used to produce an output file or what readout parameters should be associated to a particular detector component.

-A more complete approach is required to guarantee the consistency and integrity of the detector data.  LCDD was designed to provide a complete description of complex experimental setups.  Various types of detectors, ranging from simple test beams to complex HEP detectors, can be modeled to an arbitrary level of detail using an XML file rather than detector-specific C++ code.  LCDD is built upon the GDML data format and C++ parser.  It extends GDML’s data format by using built-in facilities of the XML Schema (XSD) language.  The GDML code infrastructure is reused by registering additional element handlers with GDML’s flexible parser class.  The extended parser, without any alteration, can also read in plain GDML files, as well as LCDD, so a file with \textit{gdml} as the top element is considered valid within the LCDD processing framework. .

+A more complete approach is required to guarantee the consistency and integrity of the detector data.  LCDD was designed to provide a complete description of complex experimental setups.  Various types of detectors, ranging from simple test beams to complex HEP detectors, can be modeled to an arbitrary level of detail using an XML file rather than detector-specific C++ code.  LCDD is built upon the GDML data format and C++ parser.  It extends GDML's data format by using built-in facilities of the XML Schema (XSD) language.  The GDML code infrastructure is reused by registering additional element handlers with GDML's flexible parser class.  The extended parser, without any alteration, can also read in plain GDML files, as well as LCDD, so a file with \textit{gdml} as the top element is considered valid within the LCDD processing framework.

-LCDD uses GDML to define the core geometric information about the experiment.  The LCDD schema formally extends GDML using the \textit{extension} element, so the \textit{gdml} root node is embedded as part of the document.  The GDML language is essentially left intact, with a single point of extension added to the \textit{volume} element, which may then contain optional references to additional LCDD elements.  The volume can be associated with detector readouts, visualization parameters, a region, physics limits, and other supplementary information, to provide a complete description of the detector to the simulation engine at runtime.

+LCDD uses GDML to define the core geometric information about the experiment.  The LCDD schema formally extends GDML using the \textit{extension} element of XSD, and the \textit{gdml} root node is embedded as part of the document.  The GDML language is essentially left intact, with the main point of extension being the \textit{volume} element, which may contain optional references to additional LCDD elements defined outside of the GDML section.  The volume can be associated with detector readouts, visualization parameters, a region, physics limits, and other supplementary information, to provide a complete description of the detector to the simulation engine at runtime.

 
 \subsection{Document Structure}
 
 Every LCDD document has the same basic structure.  The top-level \textit{lcdd} element has a list of sections, one of which includes an entire embedded GDML document.  Aside from the header, which has no child elements, the other sections each contain a list of elements with a specific type.  The elements can be referenced from the GDML to associate the supplementary information with specific logical volumes defined by the geometry.  This is done by making a language extension to the GDML volume element.

 Unlike some markup languages, such as HTML, where elements can be referenced regardless of order, GDML and LCDD support in-order references only.  An element must have already been defined to be referenced.  For this reason, the ordering of the top-level container elements in GDML and LCDD files is important and must conform to the order specified in their respective schemas.  For instance, a box solid must be defined before it can be used as the shape for a volume.  The primary benefit of this approach is reduction of memory consumption during the processing phase. The following snippet of pseudo-XML outlines the top-level structure of an LCDD file, including the embedded GDML element.

 \begin{verbatim}
 <lcdd>
     <header/>

@@ -198,9 +200,9 @@

 </lcdd>
 \end{verbatim}
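
The in-order reference rule described above can be illustrated with a small single-pass check.  This is only a minimal sketch in Python and is not part of LCDD or GDML: the miniature document, the \textit{late} container, and the helper name \texttt{forward\_references} are assumptions made for illustration, and the check simply treats any \textit{name} attribute as a definition and any \textit{ref} attribute as a reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document; element and attribute names are
# illustrative only and do not reproduce the real LCDD schema.
DOC = """
<lcdd>
  <solids>
    <box name="WorldBox"/>
  </solids>
  <structure>
    <volume name="World">
      <solidref ref="WorldBox"/>
    </volume>
    <volume name="Broken">
      <solidref ref="LaterBox"/>
    </volume>
  </structure>
  <late>
    <box name="LaterBox"/>
  </late>
</lcdd>
"""

def forward_references(xml_text):
    """Return the refs that are used before (or without) being defined."""
    defined = set()
    problems = []
    # iter() visits elements in document order, like a single-pass parser.
    for elem in ET.fromstring(xml_text).iter():
        ref = elem.get("ref")
        if ref is not None and ref not in defined:
            problems.append(ref)
        name = elem.get("name")
        if name is not None:
            defined.add(name)
    return problems

print(forward_references(DOC))
```

Because each element is visited once and only the set of names seen so far is kept, a check of this kind never needs to look ahead in the document, which is the property the in-order rule relies on.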

-The header has basic meta data about the document, such as who authored it.  The \textit{iddict} contains identifier dictionaries that provide encodings for information that can be written into hit objects at runtime, including layer numbers and detector IDs.  The \textit{sensitive\_detectors} element defines Geant4 ''sensitive detectors'' that are assigned via reference to volumes.  This causes hits to be accumulated for that detector by event, containing energy, position and time information.  The \textit{limits} are sets of physics limits that determine some behavior in the Geant4 physics engine, such as the maximum step size of a track.  The \textit{display} element contains visualization information that can be used to assign colors and visibility settings to logical volumes.  The \textit{gdml} tag defines a GDML document, which must follow that format's syntax.  The LCDD schema extends GDML so that the logical and physical volumes can be assigned additional information.  Finally,[...]

+The header has basic metadata about the document, such as who authored it.  The \textit{iddict} contains identifier dictionaries that provide encodings for information that can be written into hit objects at runtime, including layer numbers and detector IDs.  The \textit{sensitive\_detectors} element defines Geant4 ``sensitive detectors'' that are assigned via reference to volumes.  This causes hits to be accumulated for that detector by event, containing energy, position and time information.  The \textit{limits} are sets of physics limits which affect certain parameters in the Geant4 physics engine, such as the maximum step size of a track.  The \textit{display} element contains visualization information that can be used to assign colors and visibility settings to logical volumes.  The \textit{gdml} tag defines a GDML document, which must follow that format's syntax, but may include the optional LCDD extension elements.  Finally, the \textit{fields} element contains definitions of m[...]

-The input document is checked by the parser for correctness at runtime against an XML Schema, which is located at a standard URL and can be accessed over the internet via the \textit{http} protocol.  The parser is fault tolerant, in that minor errors, such as mis-ordering of child elements, may only result in warning messages.  Other, more severe errors within a document, such as references to non-existent element IDs, will result in fatal exceptions that cause the application to exit.

+The input document is checked by the parser for correctness at runtime against an XML Schema, which is located at a standard URL and can be accessed over the internet via the \textit{http} protocol.  The parser is fault tolerant, in that minor errors, such as mis-ordering of child elements, may only result in warning messages.  Other, more severe errors within a document, such as references to non-existent element IDs, will generally result in fatal exceptions that cause the application to exit.

 
 \section{Header Element}

@@ -208,64 +210,66 @@

 
 \begin{verbatim}
 <header>

-    <detector name=”sidloi3”/>
-    <generator name=”GeomConverter” version=”1.0”
-        file=”./detectors/sidloi3/compact.xml” checksum=”2152839912”/>
-    <author name=”Jeremy McCormick” email=”[log in to unmask]”>

+    <detector name="sidloi3"/>
+    <generator name="GeomConverter" version="1.0"
+        file="./detectors/sidloi3/compact.xml" checksum="2152839912"/>
+    <author name="Jeremy McCormick" email="[log in to unmask]"/>

     <comment>The SiD detector</comment>
 </header>
 \end{verbatim}

-An author tag gives the names of the people who created the file, as well as an optional email contact.  The detector element provides a name that can be used as a ''tag'' of the document to uniquely identify it.  The generator provides information about any external program that was used to generate the file, including a source file name, if applicable, a version of that program, and a checksum that could be created with an MD5 algorithm.  There is a free-form comment block that can contain a description and notes about the detector.

+The detector element provides a name that can be used as a tag of the document to uniquely identify it.  For instance, the name can be written in the run headers of an output data format to identify which detector was used to generate the data file.  An author tag gives the names of the people who created the file, as well as an optional email contact.  The generator provides information about any external program that was used to generate the file, including a source file name, if applicable, a version of that program, and a checksum that could be created with an MD5 algorithm.  Finally, there is a free-form comment block that may be used to provide a description and notes about the detector.
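
As an illustration of how an application might pick up the detector tag for its run header, the following minimal Python sketch reads the header fields with the standard library XML parser.  The function name \texttt{read\_header} and the returned dictionary layout are assumptions made for this example and are not part of LCDD; the author email is omitted because it is masked in the source.

```python
import xml.etree.ElementTree as ET

# Header block based on the example above; the email attribute is left out.
HEADER = """
<header>
    <detector name="sidloi3"/>
    <generator name="GeomConverter" version="1.0"
        file="./detectors/sidloi3/compact.xml" checksum="2152839912"/>
    <author name="Jeremy McCormick"/>
    <comment>The SiD detector</comment>
</header>
"""

def read_header(xml_text):
    """Collect the header fields into a plain dictionary (hypothetical layout)."""
    header = ET.fromstring(xml_text)
    generator = header.find("generator")
    return {
        "detector": header.find("detector").get("name"),
        "generator": generator.get("name"),
        "generator_version": generator.get("version"),
        "checksum": generator.get("checksum"),
        "comment": header.findtext("comment", default=""),
    }

info = read_header(HEADER)
print(info["detector"])
```

An output writer could then copy \texttt{info["detector"]} into its run header so that a data file can be traced back to the detector description that produced it.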

 
 \section{Sensitive Detectors}

-A ''sensitive detector'' is assigned to a logical volume to indicate that it is a readout component which is capable of producing hits.  When a particle deposits energy into the volume, hit objects may be created and can later be written into an output file for analysis.  The sensitive detectors typically accumulate position, time and energy measurements from particle interactions within the material of a volume.  Usually these hits are grouped into collections by event.

+A ``sensitive detector'', in Geant4 terminology, is assigned to a logical volume to indicate that it is a readout component which is capable of producing hits.  When a particle deposits energy into the volume, hit objects may be created that may later be written into an output file for analysis.  (Providing these data formats is not one of LCDD's features.)  The sensitive detectors typically accumulate position, time and energy measurements from particle interactions within the material of a volume.  The hits are grouped into collections by their sensitive detector and are accumulated by event.

-Two primary types of detectors are modeled by the framework.  Trackers typically store output from the simulation that corresponds closely to individual steps within a volume.  This information can be used later to reconstruct in detail the exact particle momentum at the hit location. Calorimeters are used for the accumulation of energy in cellular volumes, and typically have much less granular position information.  Calorimeters may be virtually segmented into cells using a child \textit{segmentation} element.  There is also a third type of detector called a \textit{scorer} which is essentially a simplified tracker.  It can be used to insert scoring planes to derive simple flux counts.

+Two primary types of detectors are modeled by the framework.  Trackers store output from the simulation that corresponds closely to individual steps within a volume.  This information can be used later to reconstruct in detail the exact particle momentum at the hit location. Calorimeters are used for the accumulation of energy in cellular volumes, and typically have much less granular position information.  Calorimeters may be virtually segmented into cells using a child \textit{segmentation} element.  There is also a third type of detector called a \textit{scorer} which is essentially a simplified tracker.  It can be used to insert scoring planes to derive simple flux counts.

 
 These types extend a common element which defines basic detector settings.  The common settings for all \textit{sensitive\_detector} elements include the following.
 
 \begin{tabular}{ | l | l | }

-  \hline
-  name & unique string identifying the sub-detector \\ \hline
-  endcap\_flag & indicates if volume is a barrel or endcap \\ \hline
-  ecut & a minimum energy cut for individual hits \\ \hline
-  eunit & energy unit for cut \\ \hline
-  verbose & verbosity setting \\
-  \hline

+\hline
+name & unique string identifying the sub-detector \\ \hline
+endcap\_flag & indicates if volume is a barrel or endcap \\ \hline
+ecut & a minimum energy cut for individual hits \\ \hline
+eunit & energy unit for cut \\ \hline
+verbose & verbosity setting \\
+\hline

 \end{tabular}

-Each detector has a name that is used to uniquely identify it within the document.  This is used to associate logical volumes with a sensitive detector using the \textit{sdref} element.  There is a flag which indicates whether or not the detector is an end cap.  (This is primarily a concept that is relevant for HEP collider-detectors.)  An energy cut setting can be used to discard hits that do not reach a certain threshold energy.  There is a verbosity setting to control print screen output from the detector while the simulation is running.  The name of the detector is required, and the rest of the settings are optional.

+Each sensitive detector has a name that is used to uniquely identify it within the document.  This is used to associate logical volumes with a sensitive detector using the \textit{sdref} element.  There is a flag which indicates whether or not the detector is an end cap.  (This is primarily a concept that is relevant for HEP collider-detectors.)  An energy cut setting can be used to discard hits that do not reach a certain threshold energy.  There is a verbosity setting to control print screen output from the detector while the simulation is running.  The name of the detector is required, and the other settings are optional.

 
 The detectors have associated hits collections that contain objects which are implementations of the virtual hit class within Geant4.  There is no output data binding provided by LCDD itself to persist this information.  It is assumed that applications which include LCDD as a dependency will translate from these hit objects into a desired output format such as LCIO.
 
 \subsection{Trackers}

-Trackers record information from each step of a simulated track as it propagates through a sensitive volume.  The stored information includes the mid-point position, direction, length, global time in nanoseconds when the step occurred in the simulation, and the energy deposited along the step length.  A \textit{TrackerHit} object is created for each step and stored into a hit collection.  The Tracker is most commonly used to model high-granularity detectors, such as those with pixels or silicon strips.  Advanced algorithms for digitizing the hits within the simulation are not provided, as it is assumed this would be done later in a reconstruction environment.

+Trackers record information from each step of a simulated track as it propagates through a volume.  The stored information includes the mid-point position, direction, length, global time in nanoseconds when the step occurred, and the energy deposited along the step length.  A \textit{TrackerHit} object is created for each step and stored into a hit collection.  The Tracker is most commonly used to model high-granularity detectors, such as those with pixels or silicon strips.  Advanced algorithms for digitizing the hits within the simulation are not provided, as it is assumed this would be done later in a reconstruction environment.

 
 This is an example XML snippet for a simple tracking detector, similar to what might be defined for an ILC full detector concept:
 
 \begin{verbatim}

-<tracker name=”SiTrackerBarrel” hits_collection=”SiTrackerBarrelHits”>
-    <idspecref ref=”SiTrackerBarrelHits/>

+<tracker name="SiTrackerBarrel" hits_collection="SiTrackerBarrelHits">
+    <idspecref ref="SiTrackerBarrelHits"/>

 </tracker>
 \end{verbatim}

+Essentially the tracker as implemented is a simple detector that writes records of the individual steps, which can later be used to more fully simulate the detector response.
+

 \subsection{Scorer}

-The Scorer type is the simplest of the three sensitive detector implementations.  It records the passage of particles through a volume.  The main difference between the Tracker and the Scorer is that the latter will only record one hit for each unique G4Track that passes through it, whereas the Tracker class records all separate steps as individual hits.

+The Scorer type is the simplest of the three sensitive detector implementations.  It records the passage of particles through a volume.  The main difference between the Tracker and the Scorer is that the latter will only record one hit for each unique G4Track that passes through it, whereas the Tracker class records all separate steps as individual hits.  The scorer provides a way to determine if a given track passed through the volume but it does not provide any information about the energy of that particle.
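
The difference between the two recording policies can be sketched in a few lines of Python.  This is an assumption-laden illustration rather than the LCDD implementation: the \texttt{Step} record and both function names are hypothetical, and a real sensitive detector would receive Geant4 step objects instead of tuples.

```python
from collections import namedtuple

# Hypothetical step record; the field names are illustrative and are not
# taken from the Geant4 or LCDD APIs.
Step = namedtuple("Step", ["track_id", "edep"])

def tracker_hits(steps):
    """Tracker-like policy: one hit per step, keeping the deposited energy."""
    return [(s.track_id, s.edep) for s in steps]

def scorer_hits(steps):
    """Scorer-like policy: one hit per unique track, with no energy stored."""
    seen = set()
    hits = []
    for s in steps:
        if s.track_id not in seen:
            seen.add(s.track_id)
            hits.append(s.track_id)
    return hits

steps = [Step(1, 0.02), Step(1, 0.03), Step(2, 0.01), Step(1, 0.05)]
print(len(tracker_hits(steps)), scorer_hits(steps))
```

For the four example steps, the tracker-like policy keeps four hits while the scorer-like policy keeps one entry for each of the two tracks, which is enough for a simple flux count.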

 
 \subsection{Calorimeter}

-The Calorimeter detector is used to record the energy deposition of showering particles in a volume.  Since the details of each step of individual secondary particles generated in such a shower are usually not of interest, energy depositions are accumulated from multiple steps to determine the total energy deposited in the volume for the event.  These energy depositions may be accumulated in an entire volume, such as a physical crystal, or the energy may be split across arrays of cells that are created through a virtual segmentation of the volume.  In general, when the volumes are segmented, there is one hit object created per cell for the entire event. Each calorimeter may have an associated \textit{segmentation} object that bins the energy depositions by position during the simulation.

+The Calorimeter detector is used to record the energy depositions of particle showers.  Typically these showers are composed of many individual steps, the details of which are usually not of interest.  The energy depositions from all of the steps are accumulated to determine the total energy deposited in the volume.  The energy may be accumulated in an entire volume, such as a physical crystal, or may be split across arrays of virtual cells.  When the volumes are artificially segmented, there is generally one hit object created per virtual cell for the entire event.

 
 The following XML defines a calorimeter with uniform sized cells created by a virtual segmentation class.
 
 \begin{verbatim}

-<calorimeter name=”EcalBarrel” hits_collection=”EcalBarrelHits”>
-    <idspecref ref=”EcalBarrelHits/>
-    <grid_xyz grid_size_x=”3.5” grid_size_y=”3.5” grid_size_z=”0.0”/>

+<calorimeter name="EcalBarrel" hits_collection="EcalBarrelHits">
+    <idspecref ref="EcalBarrelHits"/>
+    <grid_xyz grid_size_x="3.5*mm" grid_size_y="3.5*mm" grid_size_z="0.0"/>

 </calorimeter>
 \end{verbatim}
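The accumulation of step energies into one hit per virtual cell can be sketched as follows.  This is only an illustration under stated assumptions: the function name is hypothetical, steps are reduced to (x, y, energy) tuples, and the cell index is assumed to be the floor of the position divided by the cell size.

```python
import math
from collections import defaultdict


def accumulate_cell_energies(steps, grid_size_x, grid_size_y):
    """Sum the energy deposits of all steps that fall into the same virtual cell.

    steps is an iterable of (x, y, energy) tuples; positions and grid sizes
    share one length unit.  The floor-based binning is an assumption.
    """
    cells = defaultdict(float)
    for x, y, energy in steps:
        cell = (math.floor(x / grid_size_x), math.floor(y / grid_size_y))
        cells[cell] += energy
    return dict(cells)


# Four steps of one event in a volume with 3.5 x 3.5 cells.
event_steps = [(1.0, 1.0, 0.5), (2.0, 3.0, 0.25), (4.0, 1.0, 1.0), (-0.5, 0.2, 2.0)]
cell_energies = accumulate_cell_energies(event_steps, 3.5, 3.5)
```

The four steps collapse into three cells, so three hits would be written for the event instead of four.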

@@ -273,15 +277,15 @@

 
 \section{Segmentation}

-Sensitive volumes in a calorimeter detector usually require virtual subdivision in order that energy depositions can be accumulated into cells.  This concept of dividing geometric volumes is modeled by specific concrete implementations of the \textit{segmentation} element.  This algorithmic, rather than geometric, approach to segmented readout has other advantages.  Modeling millions of individual cell volumes could be prohibitive in terms of memory usage. There are also cases in which modeling a readout system with an algorithm rather than geometry is more simple, such as in projective towers where there are many different shapes and sizes of cells which would be complicated to model using only solids and volumes.

+Sensitive volumes in a calorimeter detector usually require virtual subdivision in order that energy depositions can be accumulated into cells.  This concept of dividing geometric volumes is modeled by specific concrete implementations of the \textit{segmentation} XML element defined in the schema.  This algorithmic approach to segmented readout has some advantages over using purely geometric information.  Modeling millions of individual cell volumes could be prohibitive in terms of memory usage. There are also cases in which describing a readout system with an algorithm rather than geometry is simpler, such as in projective towers where there are many different shapes and sizes of cells which would be complicated to model using only solids and volumes.

-Concrete segmentation types extend a basic abstract element, which has no attributes.  The names of the parameters which define the dimensions of the cells are specific to a certain type of segmentation.  The segmentation element occurs as a child of the calorimeter sensitive detector.  Each calorimeter may have one of these associated objects.  The values of the fields from the segmentation at a certain hit position can be written in the identifiers of the hits by referencing their names.

+Concrete segmentation types extend a basic abstract element, which has no attributes.  The names of the parameters which define the dimensions of the cells are specific to a certain type of segmentation.  The segmentation element occurs as a child of the calorimeter element.  Each calorimeter may have only one of these associated objects.  The values of the fields from the segmentation at a certain hit position can be written into the identifiers of the hits by referencing their names.

-%% TODO: Include images that show how each segmentation works, e.g. projective, grid, etc.  This can include numbering about the origin as in grid_xyz.

+%% TODO: Include images that show how each segmentation works, e.g. projective, grid, etc.  This can include the specific numbering about the origin as in grid_xyz's scheme.

 
 \subsection{Grid XYZ Segmentation}

-The \textit{grid\_xyz} segmentation divides a volume along its X, Y, or Z Cartesian axes.  It can be used to create a regular grid of box-like cells in a planar volume.  The values of the cell indices are available as the fields “x”, “y”, and “z” in an identifier.  The indices are numbered from –N to N about an origin at (x,y,z) = (0,0,0), so that no information about the boundaries of the volume being segmented is required by the algorithm.

+The \textit{grid\_xyz} segmentation divides a volume along its X, Y, or Z Cartesian axes, creating a regular grid of box-like cells in a planar volume.  The indices are signed int values that are numbered about the natural origin of the volume's solid at (x,y,z) = (0,0,0).  Because only the position with respect to the origin is used to obtain the index value at a particular point, no additional geometric information, such as the bounds of the current solid, is required by this segmentation.

 
 The following XML shows a \textit{grid\_xyz} segmentation that divides a volume along the X and Y axes.

@@ -289,6 +293,8 @@

 <grid_xyz grid_size_x="1.0*cm" grid_size_y="1.0*cm" />
 \end{verbatim}

+Any combination of X, Y and Z cell sizes may be provided.  The default value of zero results in that axis being unsegmented, so that the position is reported as zero, which translates to the center point of the volume on that axis.  This is typically used to divide a plane into a rectilinear grid of cells in two dimensions, say X and Y, with the third dimension, e.g. Z, left unsegmented.
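A minimal sketch of this indexing scheme is shown below.  The function names are hypothetical and the floor-based binning is an assumption; the only behavior taken from the text is that indices are signed, are counted about the origin, and are zero on any axis whose cell size is zero.

```python
import math


def grid_index(position, grid_size):
    """Signed cell index along one axis, counted about the origin.

    A grid size of zero leaves the axis unsegmented, so the index is 0.
    """
    if grid_size == 0.0:
        return 0
    return math.floor(position / grid_size)


def grid_xyz_indices(point, grid_size_x=0.0, grid_size_y=0.0, grid_size_z=0.0):
    """Return the (x, y, z) cell indices of a point given in local coordinates."""
    x, y, z = point
    return (
        grid_index(x, grid_size_x),
        grid_index(y, grid_size_y),
        grid_index(z, grid_size_z),
    )
```

With 10 mm cells in X and Y and Z left unsegmented, the local point (12, -3, 40) mm maps to the indices (1, -1, 0).  No knowledge of the bounds of the solid is needed to compute them.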
+

 \subsection{Projective Cylinder Segmentation}
 
 The \textit{projective\_cylinder} segmentation divides cylinders into projective towers.  Unlike most other types of segmentations, this does not result in cells with uniform sizes.  The size of a given cell in a projective segmentation depends on its distance from the origin.  The \textit{nphi} parameter determines how many phi bins are created within the full $360\degree$ in azimuth.  Similarly, \textit{ntheta} specifies the number of theta bins, covering the $180\degree$ in polar angle.

@@ -296,15 +302,17 @@

 This is an example of a projective cylinder segmentation that divides the theta and phi regions into 1000 and 2000 bins, respectively.
 
 \begin{verbatim}

-<projective_cylinder ntheta=”1000” nphi=”2000” />

+<projective_cylinder ntheta="1000" nphi="2000" />

 \end{verbatim}

+This segmentation is typically only used in simplified geometries where the calorimeter barrel is modeled using a series of nested tubes, rather than more realistic modules that contain planar layers.
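The projective binning can be sketched as a mapping from a position to a pair of angular bin numbers.  The function name and the bin numbering convention (bins counted from zero, starting at theta = 0 and phi = 0) are assumptions made for illustration and may differ from the LCDD implementation.

```python
import math


def projective_bins(point, ntheta, nphi):
    """Map a point to (theta_bin, phi_bin) for equal-width angular bins.

    theta is the polar angle in [0, pi] and phi the azimuth in [0, 2*pi),
    both measured about the origin.
    """
    x, y, z = point
    theta = math.atan2(math.hypot(x, y), z)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    # Clamp so that theta == pi (or rounding at phi ~ 2*pi) stays in the last bin.
    theta_bin = min(int(theta / math.pi * ntheta), ntheta - 1)
    phi_bin = min(int(phi / (2.0 * math.pi) * nphi), nphi - 1)
    return theta_bin, phi_bin
```

Because only the angles enter the calculation, two points on the same ray from the origin always land in the same bin, which is why the physical cell size grows with distance from the origin.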
+

 \subsection{Non-projective Cylinder Segmentation}
 
 A \textit{nonprojective\_cylinder} segmentation element will divide the surface of a cylinder into cells of equal size along its length.
 %% TODO is the segmentation in phi really in r*phi?
 \begin{verbatim}

-<nonprojective_cylinder grid_size_phi=”10.0” grid_size_z=”10.0” />

+<nonprojective_cylinder grid_size_phi="10.0*mm" grid_size_z="10.0*mm" />

 \end{verbatim}
 
 The above segmentation will divide the surface of a cylinder into 10 x 10 mm cells.

@@ -314,7 +322,7 @@

 The \textit{projective\_zplane} segmentation divides an endcap zplane into projective segments.
 
 \begin{verbatim}

-<projective_zplane ntheta=”500” nphi=”500” />

+<projective_zplane ntheta="500" nphi="500" />

 \end{verbatim}
 
 \subsection{Global Grid XYZ Segmentation}

@@ -322,18 +330,20 @@

 The \textit{global\_grid\_xyz} segmentation divides a global space into regular sized rectilinear cells.
 
 \begin{verbatim}

-<global_grid_xyz grid_size_x=”50.0” grid_size_y=”50.0” />

+<global_grid_xyz grid_size_x="50.0*mm" grid_size_y="50.0*mm" />

 \end{verbatim}

+Unlike most other segmentations, this algorithm uses the origin of the world volume rather than the volume's center.
+

 \section{Hits Processors}

-Each detector has one or more \textit{hits\_processor} objects that process steps in the simulation and turn them into hits.  This allows a detector to handle different types of particles and physics processes differently.  For instance, an optical calorimeter could write separate hit collections for the scintillation and Cherenkov energy depositions by using two different hits processors.

+Each detector has one or more \textit{hits\_processor} objects that process steps in the simulation and turn them into hits.  This allows a detector to handle different types of particles and physics processes differently.  For instance, an optical calorimeter could write separate hit collections for the scintillation and Cherenkov energy depositions by using two different hits processors.  The tracker and calorimeter detectors each have a default hits processor that uses the segmentation, sensitive detector, and identifier classes to construct hits.  It is anticipated that the hits processor could provide an extension point for future development of flexible algorithms that have more complex requirements.

 
 \section{Identifiers}

-Identifiers associate hits from sensitive detectors to their geometric components, as well as cell indices from the segmentation grid in the case of calorimeters.  Each sensitive detector may have an identifier specification associated with it.  This is used to construct a unique 64-bit ID from physical volume numbers, such as layer number, and segmentation values, like X and Y cell indices.  The user is ultimately responsible for making sure this combination of values uniquely identifies a hit.

+Identifiers define the format for 64-bit packed long numbers that are used to associate hits with their encoded geometric and detector information.  The values of the physical volume IDs may be written into these identifiers, and the values from the segmentation objects may also be used.  Each sensitive detector may have one of these identifier specifications associated with it.  It is used to construct a unique 64-bit ID from physical volume numbers, such as layer number, and segmentation values, like X and Y cell indices.  The user is ultimately responsible for making sure this combination of field values results in globally unique values.

-All of the identifier specifications are contained in an ID dictionary called the \textit{iddict}.  Each specification has a corresponding element called the \textit{idspec}.  The \textit{idspec} elements contain \textit{idfield} tags that define a single field within the identifier.  These fields can be from 1 to 32 bits and may be signed or unsigned.

+All of the identifier specifications are contained in an ID dictionary called the \textit{iddict}.  Each specification has a corresponding element called the \textit{idspec}.  The \textit{idspec} elements contain \textit{idfield} tags that define a single field within the identifier, such as a layer number or a segmentation field.  These fields can be from 1 to 32 bits and may be signed or unsigned.

 
 Below is an example of an identifier for an ILC ECal detector.

@@ -350,11 +360,11 @@

 </idspec>
 \end{verbatim}

-The first five fields of this identifier derive from volume identifier numbers.  The “x” and “y” fields are taken from the segmentation values calculated at the hit’s step position during the simulation.  Together, these values identify a unique cell in the ECal and can be used to recalculate the cell’s position during reconstruction and analysis.

+The first five fields of this identifier derive from volume identifier numbers.  The "x" and "y" fields are read from the segmentation values calculated at the hit position.  Together, these values identify a unique cell in the ECAL and can be subsequently decoded from the identifier's 64-bit int value during reconstruction and analysis.
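Packing and decoding such an identifier can be sketched with plain bit operations.  The field layout below is hypothetical and is not the ECal specification from the paper; the class and function names are likewise invented for illustration.  Signed fields are assumed to be stored in two's complement.

```python
from typing import NamedTuple


class IdField(NamedTuple):
    """One field of an identifier: a label, start bit, bit length and signedness."""
    label: str
    start: int
    length: int
    signed: bool = False


def pack_id(fields, values):
    """Pack the field values into a single 64-bit integer."""
    packed = 0
    for f in fields:
        if f.start + f.length > 64:
            raise ValueError(f"field {f.label} does not fit into 64 bits")
        mask = (1 << f.length) - 1
        low = -(1 << (f.length - 1)) if f.signed else 0
        high = (1 << (f.length - 1)) - 1 if f.signed else mask
        value = values[f.label]
        if not low <= value <= high:
            raise ValueError(f"value {value} out of range for field {f.label}")
        packed |= (value & mask) << f.start  # two's complement for negative values
    return packed


def unpack_id(fields, packed):
    """Decode the field values from a packed identifier."""
    values = {}
    for f in fields:
        raw = (packed >> f.start) & ((1 << f.length) - 1)
        if f.signed and raw & (1 << (f.length - 1)):
            raw -= 1 << f.length  # restore the sign
        values[f.label] = raw
    return values


# Hypothetical layout: three volume fields followed by two signed cell indices.
EXAMPLE_FIELDS = [
    IdField("system", 0, 6),
    IdField("barrel", 6, 3),
    IdField("layer", 9, 6),
    IdField("x", 32, 16, signed=True),
    IdField("y", 48, 16, signed=True),
]
example_values = {"system": 2, "barrel": 0, "layer": 5, "x": -3, "y": 7}
example_id = pack_id(EXAMPLE_FIELDS, example_values)
```

Decoding \textit{example\_id} with the same field list returns the original values, including the negative cell index.  The sketch does not check for overlapping fields, which mirrors the statement above that the user is responsible for a consistent specification.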

 
 \section{Physics Limits}

-Physics limits can be assigned to volumes in order to control the low-level behavior of the simulation.  For instance, for performance purposes, the range cut can be increased to control the simulation time of electromagnetic showers.

+Physics limits can be assigned to volumes in order to control the low-level behavior of the simulation.  For example, the range cut, which determines which secondary particles are produced, can be increased to reduce the simulation time of electromagnetic showers by limiting the creation of many low-energy secondary particles.

 
 The following example will restrict the step lengths to 5 mm.

@@ -366,7 +376,7 @@

 </limits>
 \end{verbatim}

-The limits that can be set include the maximum step length, the maximum track length, the maximum particle lifetime, the minimum particle kinetic energy, and the minimum range (or “range cut” in Geant4 terminology).

+The limits that can be set include the maximum step length, the maximum track length, the maximum particle lifetime, the minimum particle kinetic energy, and the minimum range (or "range cut" in Geant4 terminology).

 
 \section{Regions}

@@ -383,6 +393,9 @@

 
 \section{Magnetic Field}

+% FIXME: Section needs work.
+% TODO: Add sections for other types of fields here.
+

 Realistic simulation of magnetic fields is typically an important part of the detector simulation.  There are currently three types of fields available.  When the field regions overlap, the B-field components are added to each other as an overlay.  The solenoid element has an inner and outer field value.  The following is an example of a solenoid with a 5 Tesla magnetic field oriented along the z axis.
 
 \begin{verbatim}

@@ -401,9 +414,9 @@

 
 \begin{verbatim}
 <display>

-    <vis name=”CalVis” line_style=”unbroken” drawing_style=”wireframe”
-        show_daughters=”true” visible=”true”>
-        <color R=”1.0” G=”0.0” B=”0.0” alpha=”1.0”/>

+    <vis name="CalVis" line_style="unbroken" drawing_style="wireframe"
+        show_daughters="true" visible="true">
+        <color R="1.0" G="0.0" B="0.0" alpha="1.0"/>

     </vis>
 </display>
 \end{verbatim}

@@ -413,50 +426,42 @@

 LCDD extends GDML by adding optional elements to the volume element.  A proper GDML parser will simply ignore these unknown tags when processing the file.  There are no other alterations to standard GDML made by LCDD, so the extension is relatively clean.  Deriving a valid GDML file from LCDD is therefore quite straightforward.  The following example assigns to an example volume a sensitive detector, a set of physics limits, a set of visualization attributes, and a region.
 
 \begin{verbatim}

-<volume name=”EcalBarrel_layer0”>
-    <materialref ref=”Silicon”/>
-    <solidref ref=”EcalBarrel_layer0_box”/>
-    <sdref ref=”EcalBarrel”/>
-    <limitsetref ref=”CalLimits”/>
-    <visref ref=”CalVis”/>
-    <regionref ref=”CalRegion”/>

+<volume name="EcalBarrel_layer0">
+    <materialref ref="Silicon"/>
+    <solidref ref="EcalBarrel_layer0_box"/>
+    <sdref ref="EcalBarrel"/>
+    <limitsetref ref="CalLimits"/>
+    <visref ref="CalVis"/>
+    <regionref ref="CalRegion"/>

 </volume>
 \end{verbatim}

-The LCDD objects named in this volume description are actually references to previously defined elements.  For example, the “EcalBarrel” sensitive detector is defined prior to the volume definition, and the parser will retrieve its definition from an in-memory data structure and assign the sensitive detector to the named volume.  A similar strategy is used for the other objects referenced by the extended volume element.

+The LCDD objects named in this volume description are actually references to previously defined elements.  For example, the "EcalBarrel" sensitive detector is defined prior to the volume definition, and the parser will retrieve its definition from an in-memory data structure and assign the sensitive detector to the named volume.  A similar strategy is used for the other objects referenced by the extended volume element.

-\section{Command Interface}

+\section{Using LCDD in an Application}

-%% TODO: example of loading an LCDD file using Geant4 macro commands

+LCDD provides an implementation of the Geant4 class G4VUserDetectorConstruction, so using it in an application only requires registering the LCDD detector construction class with the Geant4 run manager.

-\section{Compact Detector Description}
-%% TODO Flesh this out more
-Though LCDD solves a certain problem, certain complexities are introduced.  The format, especially the embedded GDML document, is highly verbose, and complex structures can be tedious to hand-code.  For this reason, an intermediate format that translates from high level concepts and parameters to the low-level representation of LCDD can be helpful and time-saving.

+\begin{verbatim}
+theRunManager->SetUserInitialization(new LCDDDetectorConstruction());
+\end{verbatim}

-A compact detector description has been used for ILC work and in the HPS experiment to represent many different detector variations.

+This allows the detector document to be specified with a macro command.  The following command loads a detector from a URL.

 
 \begin{verbatim}

-<detector id="7" name="HcalBarrel"
-          type="PolyhedraBarrelCalorimeter2" readout="HcalBarrelHits"
-          vis="HcalBarrelVis" calorimeterType="HAD_BARREL">
-    <dimensions numsides="12" rmin="1419.0" z="3018.0 * 2"/>
-    <layer repeat="40">
-        <slice material = "Steel235" thickness = "1.89*cm" />
-        <slice material = "PyrexGlass" thickness = "0.11*cm" />
-        <slice material = "RPCGasDefault" thickness = "0.12*cm" sensitive = "yes" limits="cal_limits" />
-        <slice material = "PyrexGlass" thickness = "0.11*cm" />
-        <slice material = "G10" thickness = "0.3*cm" />
-        <slice material = "Air" thickness = "0.16*cm" />
-    </layer>
-</detector>

+/lcdd/url http://www.lcsim.org/detectors/sidloi3/sidloi3.lcdd

 \end{verbatim}

-A Java library of converters is able to translate from the terse description into the hierarchical volume structure defined by LCDD.

+Local files can also be read by using the file protocol in the URL.

+\begin{verbatim}
+/lcdd/url file:///local/path/to/sidloi3.lcdd
+\end{verbatim}
+
+The detector document must be set in the pre-initialization phase; the macro would then typically execute \textit{/control/initialize} to initialize the application and set up the detector geometry.
+

 \section{Examples}

-%% TODO: reference DD4Hep's N04.lcdd as implementation matching Geant4's N04 example
-

 \subsection{Linear Collider}
 
 Linear Collider detector research programs have simulated in detail the response of a number of different detector designs and subdetector technologies.  The Silicon Detector (SiD) collaboration has optimized the design of its full detector concept through many different iterations.  This required the simulation of widely varying geometric layouts and readout schemes and the development of software to support this flexibility.  The current design for its Detector Baseline Document (DBD) is the sidloi3 detector, which is composed of vertex, tracking, and calorimeter sub-systems, as well as support, masks and dead material.  LCDD was used to model and simulate these sub-detectors in a variety of physics scenarios.  This includes an ECAL with several million readout channels as well as a Silicon Vertex Tracker with thousands of tracking modules per sub-detector.

@@ -551,10 +556,37 @@

 %</xs:extension>
 %\end{verbatim}
 %

-%Aside from this addition, the GDML XML format is unchanged and is simply in-lined within its LCDD container.  The LCDD extension classes handle these references.  The volume elements can also be read as plain GDML by parser’s such as the one in ROOT, as long as it skips over these extension elements.

+%Aside from this addition, the GDML XML format is unchanged and is simply in-lined within its LCDD container.  The LCDD extension classes handle these references.  The volume elements can also be read as plain GDML by parser's such as the one in ROOT, as long as it skips over these extension elements.

 
 %% ===============================================================

+%\section{Compact Detector Description}
+
+%% TODO Flesh this out more
+%Though LCDD solves a certain problem, certain complexities are introduced.  The format, especially the embedded GDML document, is highly verbose, and complex structures can be tedious to hand-code.  For this reason, an intermediate format that translates from high level concepts and parameters to the low-level representation of LCDD can be helpful and time-saving.
+
+%A compact detector description has been used for ILC work and in the HPS experiment to represent many different detector variations.
+
+%\begin{verbatim}
+%<detector id="7" name="HcalBarrel"
+%          type="PolyhedraBarrelCalorimeter2" readout="HcalBarrelHits"
+%          vis="HcalBarrelVis" calorimeterType="HAD_BARREL">
+%    <dimensions numsides="12" rmin="1419.0" z="3018.0 * 2"/>
+%   <layer repeat="40">
+%        <slice material = "Steel235" thickness = "1.89*cm" />
+%        <slice material = "PyrexGlass" thickness = "0.11*cm" />
+%        <slice material = "RPCGasDefault" thickness = "0.12*cm" sensitive = "yes" limits="cal_limits" />
+%        <slice material = "PyrexGlass" thickness = "0.11*cm" />
+%        <slice material = "G10" thickness = "0.3*cm" />
+%        <slice material = "Air" thickness = "0.16*cm" />
+%    </layer>
+%</detector>
+%\end{verbatim}
+
+%A Java library of converters is able to translate from the terse description into the hierarchical volume structure defined by LCDD.
+
+%% ===============================================================
+

 %% table examples
 %% http://en.wikibooks.org/wiki/LaTeX/Tables#Basic_examples

[Note: Some over-long lines of diff output only partially shown]

Commit in `docs/pubs/0001-lcdd` on MAIN
`lcdd-paper.tex`	+116	-84	3162 -> 3163