On Mon, Apr 20, 2015 at 10:14 PM, Maurik Holtrop <[log in to unmask]>
wrote:

> Hello Jeremy,
>
> I am afraid that things that are fairly easy to do on your own laptop or
> desktop machine are not so trivial on one of the clon machines. There are
> many reasons for this, one of which is the difficulty of moving anything
> from these machines to anywhere else.
>
>
>    1. Most likely the process takes up too much memory when it maps the
>    file. That is why you can only use the last file of a run, because that
>    file is usually much smaller than the 2 GB of all the other files. Carl
>    Timmer has nothing to do with this. It is not a bug in Jevio, and we
>    already solved the sequential read issue, which overcomes this problem.
>       1. We can now copy the files to /w/stage3/BUFFERED and access them
>       from other machines. I verified this works on file hps_004892.evio.0 using
>       the svt_timing_in_monitoring I was trying to run this morning. So this is
>       one good option.
>    2. I think that the SVT Monitoring steering files had code in them
>    that was not checked in, so I could not run these from my own jar file that
>    can do the sequential read. I don’t know how I can get some methods from
>    one Jar and combine these with methods from another Jar. In other words, I
>    don’t know how to “link” Omar’s jar with my Jevio.jar the way I could for
>    compiled code with LD_PRELOAD. Is there a Java equivalent?
>
>
What steering files are you trying to use?  If you want to see how the SVT
monitoring is being run, log into clonpc19 as hpsrun and look in the folder
svt_monitoring.
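
On the question of a Java equivalent to LD_PRELOAD: the closest analogue is
classpath order. The default application class loader searches the -cp entries
from left to right and uses the first copy of a class it finds, so listing the
newer Jevio.jar ahead of the distribution jar should make its classes win.
Note that "java -jar" ignores -cp, so the main class has to be named
explicitly. Below is a minimal diagnostic sketch, not part of hps-java: the
class name WhichJar is hypothetical, and all it does is report which jar or
directory supplied a given class.

```java
import java.security.CodeSource;

public class WhichJar {
    /**
     * Returns the jar or directory a class was loaded from,
     * or "(bootstrap)" for classes supplied by the JDK itself.
     */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class name to look up; defaults to a JDK class if none is given.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        // Load without running static initializers; we only want the origin.
        Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
        System.out.println(name + " <- " + locationOf(cls));
    }
}
```

Assuming both jars and WhichJar.class are in the current directory, and
assuming the reader class is org.jlab.coda.jevio.EvioReader, a check could
look like this:

  java -cp Jevio.jar:hps-distribution-bin.jar:. WhichJar org.jlab.coda.jevio.EvioReader

If the printed location is Jevio.jar, the newer classes are the ones in use.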


>
>       1. From the clondaq5 machine, it is a tall order to get the huge crash
>       dump from the screen into an email message. You cannot pipe it into “mail
>       -c [log in to unmask]”, because sendmail is not set up. I would need to
>       somehow get it into a file, transfer the file to the outside, then import
>       it into my laptop and send it to you. It can be done, but when you are
>       under real time pressure to get things done, that is too many steps and too
>       many passwords. I was able to do this now while we are waiting for beam
>       restoration.
>       2. Attached files: crash1.txt is when running from clondaq5 and
>       opening an EVIO file.  crash2.txt shows the crash when I run the
>       svt_monitoring from my own hps-distribution-bin.jar. The error messages are
>       not very clear, but I think in the latter case there are resources missing
>       in the jar. I think Omar just checked in all the code required, so I will
>       try again with a “latest”.
>    3. I do not have the tools to upload the new jevio jar so that we can
>    have that in our distribution instead of the old one. I don’t understand
>    why you are taking so long doing this, since this is a 100% solved issue,
>    as I told you before. The new jar is a drop-in for the old one.
>    4. There is no issue opening any of the evio.x files on a normal machine,
>    or when running in sequential mode. If you are running on clondaq5, then
>    only the smaller files seem to work (I haven’t tried this).
>       1. It used to be the opposite, where the last file would not open
>       if the DAQ had crashed, and we have had a lot of DAQ crashes. That issue
>       was successfully fixed already.
>
>
> Let’s please move forward and start using the latest jevio 4.4.5 instead
> of the old one.
>
> Best,
> Maurik
>
>
> ------------------------------
>
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the HPS-SOFTWARE list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1
>
>
> On Apr 20, 2015, at 4:38 PM, McCormick, Jeremy I. <
> [log in to unmask]> wrote:
>
> Hi, Maurik.
>
> It seems to me that there are the following issues to solve here:
>
> 1) EVIO memory mapping fails on clondaq5.
>
> Do we know why?  Have you or anyone else contacted Carl Timmer about this
> issue?  He is the appropriate go-to person for all Java EVIO issues, and he
> is quite responsive when issues arise, in my experience.  If there is a
> per-process memory limit on this machine (maybe from a ulimit setting?),
> can this be increased so that the ~2.0 GB files are mappable?  Is copying
> data files to /work or /volatile, where they are accessible from
> less-restricted nodes, a viable workaround?  Is running memory-intensive
> GUI applications on the "daq" machine even something we should be doing?
> (I don't know for sure.  It just strikes me as not a great idea to run a
> process on part of the DAQ system that potentially takes up 2+ GB of RAM.)
>
> 2) The SVT monitoring steering config you are trying to use is not
> working.
>
> Having no specific traceback information about this, it is difficult to
> say anything much.  Do you see a full error traceback printed to the
> console?  Most log messages that show up as WARNING or SEVERE should have a
> corresponding traceback message that is printed to the console.  I spoke
> with Omar who told me that the SVT steering configs are all working for
> him.  Perhaps you just need to pull the latest from trunk to get this
> working?
>
> 3) We should use a new JEVIO version that supports sequential reading.
>
> I can work on getting this going but first I would like to tag what we
> have working right now by making an HPS Java release (planned for this
> evening).  Then I'd like to work from the 4.4.6-SNAPSHOT that Carl Timmer
> is developing on right now in the new CODA Maven repo (yay!).  Any/all
> changes for supporting sequential reading of EVIO files should go into the
> official JEVIO distribution.  It wasn't clear to me if you were working
> from an official JEVIO release jar downloaded from the CODA website or had
> hacked up JEVIO yourself, so perhaps you can clarify this for me.
>
> 4) Only the last file in a run is readable with the EvioReader.
>
> This was not mentioned in your email, but Omar reported this to me this
> morning.  But I have never seen this problem.  Apparently, only the last
> file in a run can be read into our framework.  Can someone confirm?  How
> can I reproduce this issue?
>
> Anything else that is in urgent need of attention right now?
>
> --Jeremy
>
> On Apr 20, 2015, at 4:46 AM, Maurik Holtrop <[log in to unmask]>
> wrote:
>
> Hello Jeremy,
>
>
> We have a bit of a productivity killer in the following dilemma:
>
> clondaq5 cannot run the monitoring app and read from a file. It dies on
> the "Map Failed", which I think is opening the EVIO file and memory mapping
> it.
> clondaq5 is the only machine in the counting house that can directly "see"
> the /data/hps drive where all our data is stored.
> There is no other drive or space available to use (/volatile or /work)
> where we can copy a data file to be analyzed on another machine.
>
> The result is that it is nearly impossible to do a quick replay of a data
> file using the monitoring app.
> When you are in the counting house and need to quickly get something
> looked at, this is a real issue.
>
> From a quick look, it *appears* that the sequential read works. However,
> when trying the SVT timing in monitoring, I then get a new error: "Error
> setting up LCSim", after "adding coditions listener".
>
> Result: I have no plots to show of all the really hard work that was done
> tonight to time in the SVT and get some tracks. This is a real shame!
>
> Best,
>   Maurik
>
> CONFIG: added EvioDetectorConditionsProcessor to job with detector HPS-EngRun2015-Nominal-v0
> Opening reader for file /data/hps/hps_004870.evio.0 ...
> Mon Apr 20 07:16:49 EDT 2015 MonitoringApplication log
> SEVERE: java.io.IOException: Map failed
> java.lang.RuntimeException: java.io.IOException: Map failed
>   at org.hps.record.evio.EvioFileSource.openReader(EvioFileSource.java:157)
>   at org.hps.record.evio.EvioFileSource.<init>(EvioFileSource.java:51)
>   at org.hps.record.composite.CompositeLoop.setCompositeLoopConfiguration(CompositeLoop.java:261)
>   at org.hps.record.composite.CompositeLoop.<init>(CompositeLoop.java:94)
>   at org.hps.monitoring.application.EventProcessing.setupLoop(EventProcessing.java:595)
>   at org.hps.monitoring.application.EventProcessing.setup(EventProcessing.java:444)
>   at org.hps.monitoring.application.MonitoringApplication.startSession(MonitoringApplication.java:996)
>   at org.hps.monitoring.application.MonitoringApplication.actionPerformed(MonitoringApplication.java:335)
>   at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2018)
>   at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2341)
>   at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
>   at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
>   at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:252)
>   at java.awt.Component.processMouseEvent(Component.java:6516)
>   at javax.swing.JComponent.processMouseEvent(JComponent.java:3320)
>   at java.awt.Component.processEvent(Component.java:6281)
>   at java.awt.Container.processEvent(Container.java:2229)
>   at java.awt.Component.dispatchEventImpl(Component.java:4872)
>   at java.awt.Container.dispatchEventImpl(Container.java:2287)
>   at java.awt.Component.dispatchEvent(Component.java:4698)
>   at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4832)
>   at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4492)
>   at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4422)
>   at java.awt.Container.dispatchEventImpl(Container.java:2273)
>   at java.awt.Window.dispatchEventImpl(Window.java:2719)
>   at java.awt.Component.dispatchEvent(Component.java:4698)
>   at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:735)
>   at java.awt.EventQueue.access$200(EventQueue.java:103)
>   at java.awt.EventQueue$3.run(EventQueue.java:694)
>
>
>
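
For reference, "java.io.IOException: Map failed" in the trace above is the
message java.nio typically reports when FileChannel.map cannot obtain the
requested mapping. Whether the cause on clondaq5 is a per-process limit is
only the guess made in this thread. The sketch below is generic java.nio code,
not JEVIO or hps-java code, and the class name MapVersusStream is
hypothetical. It contrasts mapping a whole file, which needs address space
equal to the file size, with a sequential read through a fixed 64 KiB buffer.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MapVersusStream {
    /**
     * Maps the entire file read-only and sums its bytes.
     * The mapping needs virtual address space equal to the file size,
     * and a single mapping cannot exceed Integer.MAX_VALUE bytes.
     */
    public static long sumMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF;
            }
            return sum;
        }
    }

    /** Sums the bytes by streaming through a fixed 64 KiB buffer. */
    public static long sumSequential(Path file) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file), 1 << 16)) {
            byte[] chunk = new byte[1 << 16];
            long sum = 0;
            int n;
            while ((n = in.read(chunk)) != -1) {
                for (int i = 0; i < n; i++) {
                    sum += chunk[i] & 0xFF;
                }
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        System.out.println("sequential: " + sumSequential(file));
        try {
            System.out.println("mapped:     " + sumMapped(file));
        } catch (IOException | IllegalArgumentException e) {
            // A constrained machine may refuse the mapping; the stream still works.
            System.out.println("mapping failed: " + e);
        }
    }
}
```

Running this on one of the ~2 GB files on the affected machine would show
whether plain memory mapping fails there independently of the monitoring app.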
