Hi Rafo,
I suggest memory profiling to see whether GC is the reason for the performance degradation. A quick check would be to set -Xms and -Xmx to the same large value (-Xms<size> -Xmx<size>) and see if you get some speed-up.
-vardan 
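
One way to make the GC check above concrete is to ask the JVM how much time its collectors have used. The sketch below relies only on the standard java.lang.management API; the class name GcOverhead and the stand-in workload are hypothetical, and nothing here is part of hps-java. Without touching any code, the standard -verbose:gc option prints each collection as it happens.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports how much of the JVM uptime went into garbage collection. */
class GcOverhead {

    /** Accumulated collection time in ms, summed over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of the uptime spent in GC; 0 if the uptime is not positive. */
    static double gcFraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    public static void main(String[] args) {
        // Stand-in workload that only produces garbage; a real check would run the recon here.
        for (int i = 0; i < 200000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.1f%%)%n",
                gc, uptime, 100.0 * gcFraction(gc, uptime));
    }
}
```

If the GC fraction stays small while the job is slow, the slowdown is probably somewhere else, and raising the heap size is unlikely to help.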


> On May 21, 2017, at 12:54 AM, Rafayel Paremuzyan wrote:
> 
> Hi Alessandra, Norman, all
> 
> Thank you for the reply and your tests.
> 
> I tested both 2015 and 2016 data using the v4-4 detector on UNH computers.
> I used the 3.8 JAR (the jar for 2015 pass6), the 3.9 JAR (the jar for 2016 pass0 recon), and the NEW jar v051717 (the newest jar tag is v051717).
> 
> I also noticed that recon of 2015 data is faster than recon of 2016 data.
> The new jar also seems to be 20% slower than the 3.9 jar for 2016 data, and about 60% slower for 2015 data.
> The recon speed is now about 2.55 EV/S for 2015 data, which is too slow:
> it takes more than 40 h for a single file.
> 
> This is a summary of the code speeds with the different jar files:
> V4-4 Detector, UNH (endeavour), 5K events reconstructed. All speeds are in events per second.
> 
>                            3.8 JAR             3.9 JAR                   v051717 JAR
>                            (2015 recon jar)    (2016 pass0 recon jar)    (jar for tpass1)
> 2015 Data 5772, file 20    5.07                5.19                      3.157
> 2016 Data, file 25         -                   3.11                      2.53
> 
> *However*, I looked into the job wall times for pass0 recon.
> The recon speed there is more than 7.4 events/sec, which is about x3 faster than with the new JAR.
> 
> I checked the *same 3.9 jar* again, and it is now slower as well.
> I don't know why the code speed is so low now!
> 
> 
> Norman, I tried "-DdisableSvtAlignmentConstants", but it didn't work:
> 
> =================The command===============
> java -XX:+UseSerialGC -cp hps-distribution-3.11-v051717-bin.jar org.hps.evio.EvioToLcio -x /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim -r -d HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign -R 7796 -DoutputFile=out_7796_0 -DdisableSvtAlignmentConstants hps_007796.evio.25 -n 10000
> 
> ============The error backtrace============
> 2017-05-21 00:45:39 [CONFIG] org.hps.evio.EvioToLcio parse :: using steering resource /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim
> 2017-05-21 00:45:39 [CONFIG] org.hps.evio.EvioToLcio parse :: set max events to 10000
> 2017-05-21 00:45:48 [INFO] org.hps.rundb.RunManager <init> :: ConnectionParameters { database: hps_run_db_v2, hostname: hpsdb.jlab.org, password: darkphoton, port: 3306, user: hpsuser }
> 2017-05-21 00:45:48 [CONFIG] org.lcsim.job.JobControlManager addVariableDefinition :: outputFile = out_7796_0
> 2017-05-21 00:45:48 [CONFIG] org.hps.evio.EvioToLcio parse :: set steering variable: outputFile=out_7796_0
> 2017-05-21 00:45:48 [SEVERE] org.hps.evio.EvioToLcio parse :: bad variable format: disableSvtAlignmentConstants
> java.lang.IllegalArgumentException: Bad variable format: disableSvtAlignmentConstants
>         at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:393)
>         at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97)
> 
> Exception in thread "main" java.lang.IllegalArgumentException: Bad variable format: disableSvtAlignmentConstants
>         at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:393)
>         at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97)
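
The "Bad variable format" message is consistent with the bare -DdisableSvtAlignmentConstants being read as a steering variable, which in the working case has the form -DoutputFile=out_7796_0 (name=value). The sketch below is a hypothetical parser written only to illustrate that behavior; it is not the hps-java code, and the class and method names are made up. It also shows the separate JVM mechanism: a -Dname option placed before the class name on the java command line becomes a system property with an empty value. Whether disableSvtAlignmentConstants is meant to be passed that way is an assumption here, not something this thread confirms.

```java
/** Hypothetical illustration of name=value definitions versus bare JVM flags. */
class DashDArgs {

    /** Splits "name=value" into {name, value}; anything without a name and '=' is rejected. */
    static String[] parseDefinition(String definition) {
        int eq = definition.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("Bad variable format: " + definition);
        }
        return new String[] { definition.substring(0, eq), definition.substring(eq + 1) };
    }

    /** True if a JVM system property is defined at all, even with an empty value. */
    static boolean flagSet(String name) {
        return System.getProperty(name) != null;
    }

    public static void main(String[] args) {
        String[] kv = parseDefinition("outputFile=out_7796_0");
        System.out.println(kv[0] + " -> " + kv[1]);
        try {
            parseDefinition("disableSvtAlignmentConstants");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // Reports true only if the JVM was started with -DdisableSvtAlignmentConstants
        // before the class name (or the property was set programmatically).
        System.out.println(flagSet("disableSvtAlignmentConstants"));
    }
}
```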
> 
> Rafo
> 
> 
> 
> 
> 
> 
>> On 05/20/2017 06:17 AM, Alessandra Filippi wrote:
>> Hi Rafo, all, 
>> I also noticed that the reconstruction of 2016 data is about twice as slow as that of 2015 data (whichever geometry and reconstruction version). 
>> This happens when I run the aligned geometry as well as the "current" one (v5.0), and the geometry taken from the db as well (the result is the same as v5.0). I did not make any test with v4.4, though - actually as far as svt alignment is concerned it should be the same as v5.0. Can you please try and make the same short test with the newest jar with v4.4? 
>> This happens to me both with hps-java 5.10 and 5.11 (not the most updated one). 
>> 
>> I would be surprised if it were something connected to the alignment, unless for some reason new positions and harder tracks trigger some long loops in the reconstruction. But this happens (to me) also with the 
>> standard geometry, so a check with the one used with pass0 (which should however be equivalent to v5.0) could at least help to rule out, or pin the blame on, the alignment step. 
>> Thanks, cheers 
>>     Alessandra 
>> 
>> 
>> P.S. Also make sure that the correct fieldmap is called in all the compact files - you never know! 
>> 
>> 
>> 
>> On Fri, 19 May 2017, Rafayel Paremuzyan wrote: 
>> 
>>> Hi All, 
>>> 
>>> While testing the recon for test pass1, 
>>> I noticed that the recon time is more than x2 longer than the pass0 recon time. 
>>> 
>>> To demonstrate this, 
>>> I submitted 3 simple jobs with 10K events to reconstruct, with the new pass1 xml 
>>> file (this uses the new jar v051717 and the new detector 
>>> HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign), 
>>> and with the old pass0 xml file (pass0 jar release 3.9 and the detector 
>>> HPS-PhysicsRun2016-Nominal-v4-4-fieldmap). 
>>> 
>>> Below is a printout from the jobs with the new JAR, v051717. The average time 
>>> for 1000 events is more than 7 minutes. 
>>> ===================== LOG from the v051717 JAR 
>>> ============================== 
>>> 2017-05-19 09:36:51 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10614074 with sequence 0 
>>> 2017-05-19 09:43:13 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10615074 with sequence 1000 
>>> 2017-05-19 09:49:18 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10616074 with sequence 2000 
>>> 2017-05-19 09:55:54 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10617074 with sequence 3000 
>>> 2017-05-19 10:02:55 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10618074 with sequence 4000 
>>> 2017-05-19 10:09:57 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10619074 with sequence 5000 
>>> 2017-05-19 10:16:13 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10620074 with sequence 6000 
>>> 2017-05-19 10:25:20 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10621074 with sequence 7000 
>>> 2017-05-19 10:32:56 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10622074 with sequence 8000 
>>> 2017-05-19 10:36:19 [WARNING] org.hps.recon.tracking.TrackerReconDriver 
>>> process :: Discarding track with bad HelicalTrackHit (correction distance 
>>> 0.000000, chisq penalty 0.000000) 
>>> 2017-05-19 10:42:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10623074 with sequence 9000 
>>> 2017-05-19 10:47:44 [INFO] org.hps.evio.EvioToLcio run :: maxEvents 10000 
>>> was reached 
>>> 2017-05-19 10:47:44 [INFO] org.lcsim.job.EventMarkerDriver endOfData :: 
>>> 10000 events processed in job. 
>>> 2017-05-19 10:47:44 [INFO] org.hps.evio.EvioToLcio run :: Job finished 
>>> successfully! 
>>> 
>>> 
>>> And below is the Job log info from the pass0 jar. The average time for 1000 
>>> events is less than 3 minutes 
>>> ===================== LOG from the 3.9 release JAR 
>>> ============================== 
>>> 2017-05-19 13:19:46 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10614074 with sequence 0 
>>> 2017-05-19 13:23:36 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10615074 with sequence 1000 
>>> 2017-05-19 13:27:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10616074 with sequence 2000 
>>> 2017-05-19 13:30:40 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10617074 with sequence 3000 
>>> 2017-05-19 13:34:20 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10618074 with sequence 4000 
>>> 2017-05-19 13:38:11 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10619074 with sequence 5000 
>>> 2017-05-19 13:41:43 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10620074 with sequence 6000 
>>> 2017-05-19 13:45:54 [WARNING] org.hps.recon.tracking.TrackerReconDriver 
>>> process :: Discarding track with bad HelicalTrackHit (correction distance 
>>> 0.000000, chisq penalty 0.000000) 
>>> 2017-05-19 13:46:05 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10621074 with sequence 7000 
>>> 2017-05-19 13:50:08 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10622074 with sequence 8000 
>>> 2017-05-19 13:55:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 
>>> 10623074 with sequence 9000 
>>> 2017-05-19 13:58:27 [INFO] org.hps.evio.EvioToLcio run :: maxEvents 10000 
>>> was reached 
>>> 2017-05-19 13:58:27 [INFO] org.lcsim.job.EventMarkerDriver endOfData :: 
>>> 10000 events processed in job. 
>>> 2017-05-19 13:58:27 [INFO] org.hps.evio.EvioToLcio run :: Job finished 
>>> successfully! 
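
The throughput of each job can be read off the EventMarkerDriver lines in the two logs above. The sketch below takes the first and last marker lines, divides the difference in sequence numbers by the elapsed seconds, and ignores all other lines. It assumes the wrapped log lines have been rejoined, so that timestamp and sequence number sit on one line; the class name ReconRate is hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: events per second from EventMarkerDriver log lines. */
class ReconRate {

    private static final Pattern MARKER = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}) \\[INFO\\] "
            + "org\\.lcsim\\.job\\.EventMarkerDriver process :: Event \\d+ with sequence (\\d+)");

    static double eventsPerSecond(List<String> lines) {
        SimpleDateFormat stamp = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        stamp.setTimeZone(TimeZone.getTimeZone("UTC")); // only differences matter here
        long firstMs = -1, lastMs = -1, firstSeq = 0, lastSeq = 0;
        for (String line : lines) {
            Matcher m = MARKER.matcher(line);
            if (!m.find()) {
                continue; // not an event marker line
            }
            long ms;
            try {
                ms = stamp.parse(m.group(1)).getTime();
            } catch (ParseException e) {
                throw new IllegalArgumentException("Bad timestamp: " + line, e);
            }
            long seq = Long.parseLong(m.group(2));
            if (firstMs < 0) {
                firstMs = ms;
                firstSeq = seq;
            }
            lastMs = ms;
            lastSeq = seq;
        }
        if (firstMs < 0 || lastMs <= firstMs) {
            throw new IllegalArgumentException("Need two marker lines with increasing time");
        }
        return (lastSeq - firstSeq) / ((lastMs - firstMs) / 1000.0);
    }

    public static void main(String[] args) {
        // First and last marker lines of the v051717 log, rejoined.
        List<String> log = Arrays.asList(
                "2017-05-19 09:36:51 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10614074 with sequence 0",
                "2017-05-19 10:42:03 [INFO] org.lcsim.job.EventMarkerDriver process :: Event 10623074 with sequence 9000");
        System.out.printf("%.2f events/s%n", eventsPerSecond(log));
    }
}
```

Between sequence 0 and sequence 9000, the v051717 log spans 3912 s and the 3.9 log spans 2117 s, which works out to roughly 2.3 and 4.25 events per second.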
>>> 
>>> I also tried to run the reconstruction interactively myself, but I get the 
>>> error below. 
>>> 
>>> The command 
>>> /apps/scicomp/java/jdk1.7/bin/java -XX:+UseSerialGC -cp 
>>> hps-distribution-3.9-bin.jar org.hps.evio.EvioToLcio -x 
>>> /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim -r -d 
>>> HPS-PhysicsRun2016-v5-3-fieldmap_globalAlign -R 7796 -DoutputFile=out_7796_0 
>>> hps_007796.evio.0 -n 10000 
>>> 
>>> The Error traceback 
>>> 2017-05-19 14:58:44 [CONFIG] org.hps.evio.EvioToLcio parse :: using steering 
>>> resource /org/hps/steering/recon/PhysicsRun2016FullRecon.lcsim 
>>> 2017-05-19 14:58:44 [CONFIG] org.hps.evio.EvioToLcio parse :: set max events 
>>> to 10000 
>>> 2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager 
>>> addVariableDefinition :: outputFile = out_7796_0 
>>> 2017-05-19 14:58:45 [CONFIG] org.hps.evio.EvioToLcio parse :: set steering 
>>> variable: outputFile=out_7796_0 
>>> 2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager initializeLoop 
>>> :: initializing LCSim loop 
>>> 2017-05-19 14:58:45 [CONFIG] org.lcsim.job.JobControlManager initializeLoop 
>>> :: Event marker printing disabled. 
>>> 2017-05-19 14:58:45 [INFO] 
>>> org.hps.conditions.database.DatabaseConditionsManager resetInstance :: 
>>> DatabaseConditionsManager instance is reset 
>>> Exception in thread "main" java.lang.UnsatisfiedLinkError: 
>>> /u/apps/scicomp/java/jdk1.7.0_75/jre/lib/i386/xawt/libmawt.so: libXext.so.6: 
>>> cannot open shared object file: No such file or directory 
>>>         at java.lang.ClassLoader$NativeLibrary.load(Native Method) 
>>>         at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965) 
>>>         at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890) 
>>>         at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1851) 
>>>         at java.lang.Runtime.load0(Runtime.java:795) 
>>>         at java.lang.System.load(System.java:1062) 
>>>         at java.lang.ClassLoader$NativeLibrary.load(Native Method) 
>>>         at java.lang.ClassLoader.loadLibrary1(ClassLoader.java:1965) 
>>>         at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1890) 
>>>         at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1872) 
>>>         at java.lang.Runtime.loadLibrary0(Runtime.java:849) 
>>>         at java.lang.System.loadLibrary(System.java:1088) 
>>>         at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:67) 
>>>         at sun.security.action.LoadLibraryAction.run(LoadLibraryAction.java:47) 
>>>         at java.security.AccessController.doPrivileged(Native Method) 
>>>         at java.awt.Toolkit.loadLibraries(Toolkit.java:1653) 
>>>         at java.awt.Toolkit.<clinit>(Toolkit.java:1682) 
>>>         at java.awt.Component.<clinit>(Component.java:595) 
>>>         at org.lcsim.util.aida.AIDA.<init>(AIDA.java:68) 
>>>         at org.lcsim.util.aida.AIDA.defaultInstance(AIDA.java:53) 
>>>         at org.hps.evio.RfFitterDriver.<init>(RfFitterDriver.java:31) 
>>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
>>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
>>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) 
>>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526) 
>>>         at java.lang.Class.newInstance(Class.java:379) 
>>>         at org.lcsim.job.JobControlManager.setupDrivers(JobControlManager.java:1199) 
>>>         at org.hps.job.JobManager.setupDrivers(JobManager.java:82) 
>>>         at org.lcsim.job.JobControlManager.setup(JobControlManager.java:1052) 
>>>         at org.lcsim.job.JobControlManager.setup(JobControlManager.java:1110) 
>>>         at org.hps.evio.EvioToLcio.parse(EvioToLcio.java:407) 
>>>         at org.hps.evio.EvioToLcio.main(EvioToLcio.java:97) 
>>> 
>>> 
>>> 
>>> I see this library, libXext.so.6, in /usr/lib64 but not in /usr/lib. 
>>> When I put /usr/lib64 in my LD_LIBRARY_PATH, it complains again (see 
>>> below): 
>>> 
>>> Exception in thread "main" java.lang.UnsatisfiedLinkError: 
>>> /u/apps/scicomp/java/jdk1.7.0_75/jre/lib/i386/xawt/libmawt.so: libXext.so.6: 
>>> wrong ELF class: ELFCLASS64 
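
The library path in the error (jre/lib/i386) suggests that this JDK is a 32-bit build. "wrong ELF class: ELFCLASS64" means the loader was handed a 64-bit libXext.so.6 where a 32-bit one was needed, so adding /usr/lib64 to LD_LIBRARY_PATH cannot help a 32-bit JVM. The sketch below prints which kind of JVM is running; the class name JvmArchCheck is hypothetical, and sun.arch.data.model is a HotSpot-specific property that may be absent on other JVMs. Since the stack trace enters AWT through org.lcsim.util.aida.AIDA, starting the job with the standard -Djava.awt.headless=true property is a common workaround on machines without X11 libraries; whether it is sufficient here is untested.

```java
/** Hypothetical check of the running JVM's word size and AWT headless setting. */
class JvmArchCheck {

    /** Decides 32-bit vs 64-bit from the data model if known, otherwise from os.arch. */
    static boolean is32Bit(String dataModel, String osArch) {
        if (dataModel != null) {
            return "32".equals(dataModel);
        }
        return "i386".equals(osArch) || "x86".equals(osArch);
    }

    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model"); // may be null
        String osArch = System.getProperty("os.arch");
        System.out.println("os.arch = " + osArch + ", data model = " + dataModel);
        System.out.println(is32Bit(dataModel, osArch)
                ? "32-bit JVM: native libraries must be 32-bit"
                : "64-bit JVM: native libraries must be 64-bit");
        System.out.println("java.awt.headless = " + System.getProperty("java.awt.headless"));
    }
}
```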
>>> 
>>> I would appreciate some help with running the reconstruction 
>>> interactively, so that I can look more closely into the logs 
>>> of the old and new JAR files. 
>>> 
>>> Rafo 
>>> 
>>> 
>>> ____________________________________________________________________________ 
>>> 
>>> Use REPLY-ALL to reply to list 
>>> 
>>> To unsubscribe from the HPS-SOFTWARE list, click the following link: 
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__listserv.slac.stanford.edu_cgi-2Dbin_wa-3FSUBED1-3DHPS-2DSOFTWARE-26A-3D1&d=DwIDaQ&c=lz9TcOasaINaaC3U7FbMev2lsutwpI4--09aP8Lu18s&r=0HDJrGO9TZQTE97J9Abt2A&m=xnbGP5VHYWRAQRWRksVgMnYvBkXWI4roLxztdJ0Tp9I&s=ppNYedSrn5DPaIZZJgRZu8tBDeSjroqbj_PoevFoFpI&e= 
>>> 
>>> 