Hello Everyone,

After assessing tape performance for these jobs at larger scale (possibly related to larger-than-expected incoming data from the other halls in December), I killed the current workflow a couple of weeks ago and refactored it: jobs are now 1:1 with EVIO files for better tape performance (which required accommodating void outputs on some files), with independent 100:1 merger jobs that write the final files to /cache and clean up temporaries on /volatile. I resumed a few days ago.
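For anyone curious about the shape of the new workflow, here is a minimal sketch of the 1:1 skim plus 100:1 merge layout described above; the /volatile staging area, the file naming, and the plain-concatenation merge are placeholders, not the actual scripts:

import os
import shutil

# Placeholder locations; the real temporary area under /volatile is not spelled out here.
VOLATILE = "/volatile/hallb/hps/..."
CACHE = "/cache/hallb/hps/physrun2021/production/evio-skims"

def skim_jobs(evio_files):
    """One skim job per EVIO file; a file with no matching triggers produces no output."""
    for f in evio_files:
        yield {"input": f,
               "output": os.path.join(VOLATILE, os.path.basename(f) + ".skim")}

def merge_jobs(skim_outputs, group_size=100):
    """Independent 100:1 merger jobs: write final files to /cache, clean up /volatile."""
    for i in range(0, len(skim_outputs), group_size):
        group = skim_outputs[i:i + group_size]
        final = os.path.join(CACHE, "merged_%04d.evio" % (i // group_size))
        yield {"inputs": group, "output": final}

def run_merge(job):
    # Skip temporaries that never appeared (void outputs), merge the rest, then
    # remove them from /volatile.  Plain byte concatenation stands in here for
    # whatever the real EVIO merge step is.
    existing = [p for p in job["inputs"] if os.path.exists(p)]
    with open(job["output"], "wb") as out:
        for p in existing:
            with open(p, "rb") as src:
                shutil.copyfileobj(src, out)
    for p in existing:
        os.remove(p)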

Based on recent performance, there should be about 20 days remaining on these skims, dictated by tape access.  I suspect that can be reduced a good bit, maybe 2x, by ordering future jobs by their position on tape.  In this particular case of trigger-bit skimming, there's a competing issue: a larger temporary disk footprint before merging.
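For illustration, a minimal sketch of the tape-ordering idea, assuming the (volume, position-on-volume) metadata can be pulled from the tape catalog; that lookup, and the interface below, are hypothetical:

def order_by_tape(evio_files, tape_info):
    """tape_info: {path: (volume, position_on_volume)}, assumed to come from the tape catalog."""
    # Group reads by volume and read each volume front to back, so the drives
    # stream instead of seeking back and forth between files.
    return sorted(evio_files, key=lambda f: tape_info[f])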

Regarding the 5 batches: the first one I truncated at 50% for the refactoring a couple of weeks ago, and it will be cleaned up later; the second one started 2 days ago and is about 70% complete.

Final outputs will always be here:

/cache/hallb/hps/physrun2021/production/evio-skims

-Nathan


> On Dec 7, 2021, at 10:03 PM, Nathan Baltzell <[log in to unmask]> wrote:
> 
> Hello All,
> 
> After some further preparations, the 2021 trigger skims are launched.
> 
> Outputs will be going to /cache/hallb/hps/physrun2021/production/evio-skims.
> 
> I broke the run list from Norman into 5 lists, and started with the first 20% in one batch, all submitted.  I'll proceed to the other 4 batches over the holidays, assessing tape usage as we go.
> 
> -Nathan
> 
>> On Nov 29, 2021, at 3:39 PM, Nathan Baltzell <[log in to unmask]> wrote:
>> 
>> The 10x larger test is done at /volatile/hallb/hps/baltzell/trigtest3
>> 
>> -Nathan
>> 
>> 
>>> On Nov 29, 2021, at 2:52 PM, Nathan Baltzell <[log in to unmask]> wrote:
>>> 
>>> Hello All,
>>> 
>>> Before running over the entire 2021 data set, I ran some test jobs using Maurik’s EVIO trigger bit skimmer. Here’s the fraction of events kept in run 14750 for each skim:
>>> 
>>> fee 2.0%
>>> moll 3.3%
>>> muon 1.9%
>>> rndm 2.9%
>>> 
>>> In each case, it’s inclusive of all such types, e.g., moll=moll+moll_pde+moll_pair, rndm=fcup+pulser.
>>> 
>>> Are those numbers in line with expectations?  The total is 10%, which is not a problem if these skims are expected to be useful. The outputs are at /volatile/hallb/hps/baltzell/trigtest2 if people are interested in checking things.
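>>> For reference, a minimal sketch of how those fractions are tallied, using the inclusive grouping above (the per-type event counts would come from the skimmer output and are not shown; only the moll and rndm groupings are spelled out here, so fee and muon are left without subtypes):
>>> 
>>> SKIM_GROUPS = {
>>>     "fee":  ["fee"],
>>>     "moll": ["moll", "moll_pde", "moll_pair"],
>>>     "muon": ["muon"],
>>>     "rndm": ["fcup", "pulser"],
>>> }
>>> 
>>> def skim_fractions(type_counts, total_events):
>>>     """type_counts: events kept per trigger type; returns kept fraction per skim."""
>>>     return {skim: sum(type_counts.get(t, 0) for t in types) / total_events
>>>             for skim, types in SKIM_GROUPS.items()}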
>>> 
>>> A 10x larger test is running now, going to /volatile/hallb/hps/baltzell/trigtest3, and should be done in the next couple of hours.
>>> 
>>> ************
>>> 
>>> Note, it would be prudent to do this *only* for production runs, those that would be used in physics analysis, to avoid unnecessary tape access.  By that I mean removing junk runs, keeping only those with some significant number of events, and keeping only those with physics trigger settings (not special runs).  For that we need a run list.  I think we have close to a PB, but I remember hearing at the collaboration meeting that at least 20% is not useful for the purposes of trigger bit skimming.
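>>> As a sketch of that selection, assuming a run list with per-run event counts and trigger-configuration flags (the field names and the event-count threshold below are placeholders, not agreed values):
>>> 
>>> MIN_EVENTS = 1_000_000  # stand-in for "some significant number of events"
>>> 
>>> def production_runs(run_records):
>>>     """Keep physics-trigger runs with enough events; drop junk and special runs."""
>>>     return [r for r in run_records
>>>             if not r["is_junk"]
>>>             and r["events"] >= MIN_EVENTS
>>>             and r["trigger_config"] == "physics"]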
>>> 
>>> -Nathan
>> 
> 

