​Hello Maurik,


Have you been able to run the profiler under equivalent conditions with the two jar files yet?


Norman



From: [log in to unmask] <[log in to unmask]> on behalf of Maurik Holtrop <[log in to unmask]>
Sent: Sunday, May 21, 2017 7:04 AM
To: Rafayel Paremuzyan
Cc: Alessandra Filippi; hps-software; [log in to unmask]
Subject: Re: Test pass1 Status
 
Hello Rafo,

One thing that probably is different between the last time we ran with the 3.8 jar and now is a different version of the Java VM. It could well be that the newer version of Java is not faster. Also, it is tricky to compare Endeavour with the Jlab farm computers. They are probably not equivalent in speed. At UNH, Pumpkin has the more modern processors, whereas Endeavour is now ~5 years old.

Best,
Maurik
 
On May 21, 2017, at 6:54 AM, Rafayel Paremuzyan <[log in to unmask]> wrote:

Hi Alessandra, Norman, all

Thank you for your reply and your tests.

I tested both 2015 and 2016 data using the v4-4 detector on UNH computers.
I used the 3.8 JAR (the jar for 2015 pass6), the 3.9 JAR (the jar for 2016 pass0 recon), and the new jar v051717 (the newest jar tag).

OK, I also noticed that recon of 2015 data is faster than recon of 2016 data.
It also seems the new jar is 20% slower than the 3.9 jar for 2016 data, and about 60% slower for 2015 data.
The recon speed is now about 2.55 EV/S for 2015 data. This is too slow:
it takes more than 40h for a single file.

This is a summary of code speeds with the different jar files.

V4-4 detector, UNH (Endeavour), 5K events reconstructed. All values are events per second.

Data                        | 3.8 JAR (2015 recon jar) | 3.9 JAR (2016 pass0 recon jar) | v051717 JAR (jar for test pass1)
2015 data, 5772, file 20    | 5.07                     | 5.19                           | 3.157
2016 data, file 25          | -                        | 3.11                           | 2.53
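
As a cross-check of the percentages quoted above, the relative slowdown can be computed directly from the events-per-second values in the table. This is only a sketch: the helper name `slowdown_percent` is made up for illustration, and the rates are copied from the table as listed (3.9 jar versus v051717 jar). Taken at face value, they give a drop of roughly 19% for 2016 data and roughly 39% for 2015 data, i.e. the new jar runs at about 61% of the 3.9 speed on the 2015 file.

```python
def slowdown_percent(old_rate, new_rate):
    """Percent drop in events/second when going from old_rate to new_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return 100.0 * (old_rate - new_rate) / old_rate


# Events per second copied from the table (3.9 jar, v051717 jar).
rates = {
    "2015 data": (5.19, 3.157),
    "2016 data": (3.11, 2.53),
}

for label, (old, new) in rates.items():
    print(f"{label}: {slowdown_percent(old, new):.1f}% slower with v051717")
```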


*However*, I looked into the job wall times for the pass0 recon.
The recon speed there is more than 7.4 events/sec, which is about x3 faster than with the new JAR.

I checked again with the *same 3.9 jar*, and it is slower now as well.
I don't know why the code speed is so low now!


Norman, I have tried the "-DdisableSvtAlignmentConstants", but it didn't work

=================The command===============
java -XX:+UseSerialGC -cp hps-distribution-3.11-v051717-bin.jar org.hps.evio.EvioToLcio -x /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim -r -d HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign -R 7796 -DoutputFile=out_7796_0 -DdisableSvtAlignmentConstants hps_007796.evio.25 -n 10000

============The error backtrace============
2017-05-21 00:45:39 [CONFIG] org.hps.evio.EvioToLcio parse :: using steering resource /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim
2017-05-21 00:45:39 [CONFIG] org.hps.evio.EvioToLcio parse :: set max events to 10000
2017-05-21 00:45:48 [INFO] org.hps.rundb.RunManager <init> :: ConnectionParameters { database: hps_run_db_v2, hostname: hpsdb.jlab.org, password: darkphoton, port: 3306, user: hpsuser }
2017-05-21 00:45:48 [CONFIG] org.lcsim.job.JobControlManager addVariableDefinition :: outputFile = out_7796_0
2017-05-21 00:45:48 [CONFIG] org.hps.evio.EvioToLcio parse :: set steering variable: outputFile=out_7796_0
2017-05-21 00:45:48 [SEVERE] org.hps.evio.EvioToLcio parse :: bad variable format: disableSvtAlignmentConstants
java.lang.IllegalArgumentException: Bad variable format: disableSvtAlignmentConstants
        at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:393)
        at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97)

Exception in thread "main" java.lang.IllegalArgumentException: Bad variable format: disableSvtAlignmentConstants
        at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:393)
        at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97)
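
The log above shows `-DoutputFile=out_7796_0` accepted as a steering variable and the bare `-DdisableSvtAlignmentConstants` rejected with "Bad variable format". On a `java` command line, everything after the main class is handed to the program rather than to the JVM, so here the flag reached the program's own argument parsing. The sketch below illustrates that split and a name=value check that fails in the same way. It is not the EvioToLcio code: `split_java_command` and `parse_define` are hypothetical helpers. Whether the flag was meant to go before `-cp` as a JVM system property, or to be written as name=value, is not stated in this thread, so both readings are assumptions.

```python
def split_java_command(argv):
    """Split ['java', ...] into (jvm_options, main_class, program_args)."""
    takes_value = {"-cp", "-classpath"}  # JVM options followed by a separate value
    jvm_options = []
    i = 1  # argv[0] is the launcher itself
    while i < len(argv) and argv[i].startswith("-"):
        jvm_options.append(argv[i])
        if argv[i] in takes_value and i + 1 < len(argv):
            i += 1
            jvm_options.append(argv[i])
        i += 1
    if i >= len(argv):
        raise ValueError("no main class found")
    return jvm_options, argv[i], argv[i + 1:]


def parse_define(arg):
    """Hypothetical stand-in for a name=value check on a -D program argument."""
    body = arg[2:]  # drop the leading "-D"
    name, sep, value = body.partition("=")
    if not sep or not name:
        raise ValueError("Bad variable format: " + body)
    return name, value


# The command from this message, split on whitespace.
command = (
    "java -XX:+UseSerialGC -cp hps-distribution-3.11-v051717-bin.jar "
    "org.hps.evio.EvioToLcio -x /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim "
    "-r -d HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign -R 7796 "
    "-DoutputFile=out_7796_0 -DdisableSvtAlignmentConstants hps_007796.evio.25 -n 10000"
).split()

jvm_options, main_class, program_args = split_java_command(command)
for arg in program_args:
    if arg.startswith("-D"):
        try:
            print("steering variable:", parse_define(arg))
        except ValueError as err:
            print("rejected:", err)
```

In this split, both `-D` options land in the program arguments, and only the one without `=` is rejected.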

Rafo






On 05/20/2017 06:17 AM, Alessandra Filippi wrote:
Hi Rafo, all, 
I also noticed that the reconstruction for 2016 data is about twice as slow as for 2015 (whichever geometry and reconstruction version).
This happens when I run the aligned geometry as well as the "current" one (v5.0), and also with the geometry taken from the db (the result is the same as v5.0). I did not make any test with v4.4, though; as far as SVT alignment is concerned it should actually be the same as v5.0. Can you please try the same short test with the newest jar and v4.4?
This happens to me with both hps-java 5.10 and 5.11 (not the most updated one).

I would be surprised if it were connected to the alignment, unless for some reason new positions and harder tracks trigger some long loops in the reconstruction. But this also happens (to me) with the standard geometry, so a check with the one used for pass0 (which should however be equivalent to v5.0) could at least help to rule out, or blame, the alignment step.
Thanks, cheers 
    Alessandra 


P.S. Also make sure that the correct fieldmap is called in all the compact files - you never know!



On Fri, 19 May 2017, Rafayel Paremuzyan wrote: 

Hi All, 

While testing the recon for test pass1,
I noticed that the recon time is more than x2 longer than the pass0 recon time.

To demonstrate it,
I submitted 3 simple jobs with 10K events to reconstruct, with the new pass1 xml
file (this has the new jar v051717 and the new detector
HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign),
and the old pass0 xml file (pass0 jar release 3.9 and the detector
HPS-PhysicsRun2016-Nominal-v4-4-fieldmap).

Below is a printout from the jobs with the new JAR, v051717. The average time
for 1000 events is more than 7 minutes.
===================== LOG from the v051717 JAR ==============================
2017-05-19 09:36:51 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10614074 with sequence 0
2017-05-19 09:43:13 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10615074 with sequence 1000
2017-05-19 09:49:18 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10616074 with sequence 2000
2017-05-19 09:55:54 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10617074 with sequence 3000
2017-05-19 10:02:55 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10618074 with sequence 4000
2017-05-19 10:09:57 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10619074 with sequence 5000
2017-05-19 10:16:13 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10620074 with sequence 6000
2017-05-19 10:25:20 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10621074 with sequence 7000
2017-05-19 10:32:56 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10622074 with sequence 8000
2017-05-19 10:36:19 [WARNING] org.hps.recon.tracking.TrackerReconDriver process :: Discarding track with bad HelicalTrackHit (correction distance 0.000000, chisq penalty 0.000000)
2017-05-19 10:42:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10623074 with sequence 9000
2017-05-19 10:47:44 [INFO] org.hps.evio.EvioToLcio run :: maxEvents 10000 was reached
2017-05-19 10:47:44 [INFO] org.lcsim.job.EventMarkerDriver endOfData :: 10000 events processed in job.
2017-05-19 10:47:44 [INFO] org.hps.evio.EvioToLcio run :: Job finished successfully!


And below is the job log from the pass0 jar. The average time for 1000
events is less than 4 minutes.
===================== LOG from the 3.9 release JAR ==============================
2017-05-19 13:19:46 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10614074 with sequence 0
2017-05-19 13:23:36 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10615074 with sequence 1000
2017-05-19 13:27:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10616074 with sequence 2000
2017-05-19 13:30:40 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10617074 with sequence 3000
2017-05-19 13:34:20 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10618074 with sequence 4000
2017-05-19 13:38:11 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10619074 with sequence 5000
2017-05-19 13:41:43 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10620074 with sequence 6000
2017-05-19 13:45:54 [WARNING] org.hps.recon.tracking.TrackerReconDriver process :: Discarding track with bad HelicalTrackHit (correction distance 0.000000, chisq penalty 0.000000)
2017-05-19 13:46:05 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10621074 with sequence 7000
2017-05-19 13:50:08 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10622074 with sequence 8000
2017-05-19 13:55:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10623074 with sequence 9000
2017-05-19 13:58:27 [INFO] org.hps.evio.EvioToLcio run :: maxEvents 10000 was reached
2017-05-19 13:58:27 [INFO] org.lcsim.job.EventMarkerDriver endOfData :: 10000 events processed in job.
2017-05-19 13:58:27 [INFO] org.hps.evio.EvioToLcio run :: Job finished successfully!
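
The per-1000-event times quoted for the two logs can be read off the EventMarkerDriver timestamps. The following is a minimal sketch, assuming the log format shown above; `event_rate` is a hypothetical helper, not part of hps-java. From sequence 0 to sequence 9000 the v051717 log spans 3912 s (about 7.2 minutes per 1000 events) and the 3.9 log spans 2117 s (about 3.9 minutes per 1000 events).

```python
import re
from datetime import datetime

# Matches an EventMarkerDriver line; \s+ tolerates a line wrap after "Event".
MARKER = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[INFO\] "
    r"org\.lcsim\.job\.EventMarkerDriver process :: Event\s+\d+ with sequence (\d+)"
)


def event_rate(log_text):
    """Return (events_per_second, seconds_per_1000_events) between the
    first and last event marker found in log_text."""
    marks = [
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), int(seq))
        for ts, seq in MARKER.findall(log_text)
    ]
    if len(marks) < 2:
        raise ValueError("need at least two event markers")
    (t0, s0), (t1, s1) = marks[0], marks[-1]
    elapsed = (t1 - t0).total_seconds()
    events = s1 - s0
    return events / elapsed, 1000.0 * elapsed / events


# First and last markers copied from the two logs above.
new_jar_log = """
2017-05-19 09:36:51 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10614074 with sequence 0
2017-05-19 10:42:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10623074 with sequence 9000
"""
old_jar_log = """
2017-05-19 13:19:46 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10614074 with sequence 0
2017-05-19 13:55:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10623074 with sequence 9000
"""

for name, text in (("v051717", new_jar_log), ("3.9", old_jar_log)):
    rate, per_1000 = event_rate(text)
    print(f"{name}: {rate:.2f} events/s, {per_1000 / 60:.1f} min per 1000 events")
```

Feeding in a whole job log instead of the two-line excerpts should give the same numbers, since only the first and last markers are used.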

I also tried to run the reconstruction interactively myself, but I am getting
the error below.

The command
/apps/scicomp/java/jdk1.7/bin/java -XX:+UseSerialGC -cp hps-distribution-3.9-bin.jar org.hps.evio.EvioToLcio -x /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim -r -d HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign -R 7796 -DoutputFile=out_7796_0 hps_007796.evio.0 -n 10000

The error traceback
2017-05-19 14:58:44 [CONFIG] org.hps.evio.EvioToLcio parse :: using steering resource /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim
2017-05-19 14:58:44 [CONFIG] org.hps.evio.EvioToLcio parse :: set max events to 10000
2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager addVariableDefinition :: outputFile = out_7796_0
2017-05-19 14:58:45 [CONFIG] org.hps.evio.EvioToLcio parse :: set steering variable: outputFile=out_7796_0
2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager initializeLoop :: initializing LCSim loop
2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager initializeLoop :: Event marker printing disabled.
2017-05-19 14:58:45 [INFO] org.hps.conditions.database.DatabaseConditionsManager resetInstance :: DatabaseConditionsManager instance is reset
Exception in thread "main" java.lang.UnsatisfiedLinkError: /u/apps/scicomp/java/jdk1.7.0_75/jre/lib/i386/xawt/libmawt.so: libXext.so.6: cannot open shared object file: No such file or directory
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1851)
        at java.lang.Runtime.load0(Runtime.java:795)
        at java.lang.System.load(System.java:1062)
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872)
        at java.lang.Runtime.loadLibrary0(Runtime.java:849)
        at java.lang.System.loadLibrary(System.java:1088)
        at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67)
        at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.awt.Toolkit.loadLibraries(Toolkit.java:1653)
        at java.awt.Toolkit.<clinit>(Toolkit.java:1682)
        at java.awt.Component.<clinit>(Component.java:595)
        at org.lcsim.util.aida.AIDA.<init>(AIDA.java:68)
        at org.lcsim.util.aida.AIDA.defaultInstance(AIDA.java:53)
        at org.hps.evio.RfFitterDriver.<init>(RfFitterDriver.java:31)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at java.lang.Class.newInstance(Class.java:379)
        at org.lcsim.job.JobControlManager.setupDrivers(JobControlManager.java:1199)
        at org.hps.job.JobManager.setupDrivers(JobManager.java:82)
        at org.lcsim.job.JobControlManager.setup(JobControlManager.java:1052)
        at org.lcsim.job.JobControlManager.setup(JobControlManager.java:1110)
        at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:407)
        at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97)



I see this library, libXext.so.6, in /usr/lib64 but not in /usr/lib.
When I put /usr/lib64 in my LD_LIBRARY_PATH, it complains again (see
below).

Exception in thread "main" java.lang.UnsatisfiedLinkError: /u/apps/scicomp/java/jdk1.7.0_75/jre/lib/i386/xawt/libmawt.so: libXext.so.6: wrong ELF class: ELFCLASS64
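
The "wrong ELF class: ELFCLASS64" message means a 32-bit process (the JVM path here contains jre/lib/i386) tried to load a 64-bit shared library, which is what /usr/lib64 holds. The class of any ELF file is recorded in the fifth byte of its header (1 for 32-bit, 2 for 64-bit), so the two files named in the error can be compared directly. The sketch below reads that byte; `elf_class` is a hypothetical helper written for illustration.

```python
def elf_class(path):
    """Return 'ELFCLASS32' or 'ELFCLASS64' for an ELF file, read from the
    EI_CLASS byte (offset 4) of the ELF identification header."""
    with open(path, "rb") as f:
        ident = f.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: " + path)
    return {1: "ELFCLASS32", 2: "ELFCLASS64"}.get(ident[4], "unknown")


# Example use with the two paths from the error message (they must exist on
# the machine where this is run):
#   elf_class("/u/apps/scicomp/java/jdk1.7.0_75/jre/lib/i386/xawt/libmawt.so")
#   elf_class("/usr/lib64/libXext.so.6")
```

If the two calls report different classes, the libraries cannot be loaded into the same process; either a 64-bit JVM or a 32-bit libXext.so.6 would be needed for them to match.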

I would appreciate some help with running the reconstruction
interactively, so that I can look more closely into the logs
of the old and new JAR files.

Rafo 


____________________________________________________________________________ 

Use REPLY-ALL to reply to list 

To unsubscribe from the HPS-SOFTWARE list, click the following link: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__listserv.slac.stanford.edu_cgi-2Dbin_wa-3FSUBED1-3DHPS-2DSOFTWARE-26A-3D1&d=DwIDaQ&c=lz9TcOasaINaaC3U7FbMev2lsutwpI4--09aP8Lu18s&r=0HDJrGO9TZQTE97J9Abt2A&m=xnbGP5VHYWRAQRWRksVgMnYvBkXWI4roLxztdJ0Tp9I&s=ppNYedSrn5DPaIZZJgRZu8tBDeSjroqbj_PoevFoFpI&e= 


