The only other thing that changed was the version of Geant4 (9.6.p01 -> 10.01.p02), which was updated in slic/HEAD; slic/HEAD itself is sourced the same way as before. The mixing is also the same, using the procedure in
/u/group/hps/production/mc/EngRun2015Scripts/slic/beam-tri_100.xml

However, some good news: I made some files with the HPS-EngRun2015-Nominal-v3-1-fieldmap detector today, and the output looks correct from the logs. Please verify, of course:
output:
/mss/hallb/hps/production/pass3/slic/beam-tri/1pt05/egsv3-triv2-g4v1_HPS-EngRun2015-Nominal-v3-1-fieldmap_*.slcio
logs:
/work/hallb/hps/mc_production/pass3/logs/slic/beam-tri/1pt05/egsv3-triv2-g4v1_HPS-EngRun2015-Nominal-v3-1-fieldmap_*.out

If this is correct, then the changes between detectors may be a clue, but we could just start using Nominal-v3-1-fieldmap in the meantime since it doesn't make beam-tri crash.

________________________________________
From: [log in to unmask] <[log in to unmask]> on behalf of McCormick, Jeremy I. <[log in to unmask]>
Sent: Thursday, October 15, 2015 6:15 PM
To: Bradley T Yale
Cc: hps-software
Subject: RE: new stdhep error

Hi,

I wrote a bit of Java code to check StdHep files and indeed something seems amiss with the current beam-tri samples:

[1078 $] java -cp ./distribution/target/hps-distribution-3.4.2-SNAPSHOT-bin.jar org.hps.users.jeremym.StdHepChecker /work/hps/data/stdhep/beam-tri_14.stdhep
Exception in thread "main" hep.io.mcfio.MCFIOException: Block error, expected 4 got 1070746256
        at hep.io.mcfio.MCFIOBlock.read(MCFIOBlock.java:31)
        at hep.io.mcfio.MCFIOReader$EventHeader.read(MCFIOReader.java:271)

So we need to look at how these files are being created.  The current crash doesn't appear to be an issue with SLIC.

Was this procedure changed recently?

--Jeremy

-----Original Message-----
From: McCormick, Jeremy I.
Sent: Thursday, October 15, 2015 2:52 PM
To: McCormick, Jeremy I.; Bradley T Yale
Cc: hps-software
Subject: RE: new stdhep error

Hi, Bradley.

I've been looking into the beam-tri errors more closely.  I'm pretty sure that some of these StdHep files have bad event header data in them that is crashing SLIC.

I added some debug prints to the StdHep reader and for normal events it prints:

stdhep blockid <4>
stdhep ntot <80>
stdhep version <2.00>

But then once in a while it reads in garbage like this for the events that crash:

stdhep blockid <1070746256>
stdhep ntot <-773091366>
stdhep version < ??? >

The stdhep version doesn't even print at all, as it isn't read in as a char* successfully.
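The kind of sanity check that would catch such a header before it is used can be sketched as follows. This is only a minimal illustration, not the actual LCIO reader code: the class name StdHepHeaderCheck and the method isPlausible are hypothetical, the expected block id of 4 is taken from the debug output for normal events above, and the remaining criteria (non-negative ntot, a version string starting with a digit) are assumptions about what a well-formed header looks like.

```java
/** Hypothetical helper for illustration; not part of LCIO or hps-java. */
final class StdHepHeaderCheck {

    // Block id printed for normal events in the debug output above.
    static final int EXPECTED_BLOCK_ID = 4;

    /**
     * Returns true if the three header fields look like a normal event.
     * The criteria are assumptions based on the values seen in this thread.
     */
    static boolean isPlausible(int blockId, int ntot, String version) {
        if (blockId != EXPECTED_BLOCK_ID) {
            return false; // e.g. 1070746256 in the crashing events
        }
        if (ntot < 0) {
            return false; // e.g. -773091366 in the crashing events
        }
        // The version must have been read at all and start with a digit ("2.00").
        return version != null
                && !version.isEmpty()
                && Character.isDigit(version.charAt(0));
    }

    public static void main(String[] args) {
        // Header values of a normal event from the debug prints.
        System.out.println(isPlausible(4, 80, "2.00"));
        // Header values of a crashing event; the version was never read.
        System.out.println(isPlausible(1070746256, -773091366, null));
    }
}
```

A reader that rejected implausible headers in this way could report the offending event number instead of dereferencing an unread version string.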

Specifically, I'm seeing this bad data in the following job using the StdHep files you saved for me on the work disk.

slic -g ./detector.lcdd -i ./stdhep/beam-tri_14.stdhep -r 2 -o beam-tri_scratch -s 76843 -x

This occurs in my test setup here at SLAC and also when I run the same job at JLAB, so it isn't an issue with the data becoming corrupted when I ftp it from JLAB to SLAC.

So I think we need to check these StdHep files carefully to make sure that somehow they are not corrupted with bad header blocks.  (Maybe there is a Java utility for reading through StdHep events that could do this?)

The other cause could be some kind of subtle bug in the StdHep reader utility we use from LCIO, but I'm not sure this is very likely.  For instance, the input stream could get "off" and start reading data at the wrong place, which would cause problems like this one.  But I think it is more likely that occasionally the header data itself is bad in these files.

--Jeremy

-----Original Message-----
From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of McCormick, Jeremy I.
Sent: Thursday, October 15, 2015 2:17 PM
To: Bradley T Yale
Cc: hps-software
Subject: new stdhep error

Hi, Bradley.

I am now able to process the majority of the beam-tri StdHep events, but occasionally I'm seeing a segmentation fault (not in every file, though, and it seems pretty rare).

When I skipped to the specific events with the error and did a gdb backtrace I see this for one of them:

0x00007ffff0575e7e in UTIL::lStdHep::Event::read (this=0x45499d8, ls=...) at /work/ilcsoft/slic/HEAD/lcio/HEAD/src/cpp/src/UTIL/lStdHep.cc:420
420        if (*version == '2') {

This is a crash in an LCIO utility class for reading StdHep files, so it is not technically in the SLIC code.

I'm not quite sure of the exact cause, because reading of the version is done for every event.

Is it possible that every so often the event generator code is not assigning a version correctly or possibly the entire event is corrupted?

I'll continue to investigate and let you know what I find out here.

--Jeremy

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the HPS-SOFTWARE list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1
