LISTSERV 16.5 - HPS-SOFTWARE Archives

On Feb 7, 2019, at 23:42, Graf, Norman A. <[log in to unmask]> wrote:

Hello Rafo, Nathan,

Thanks for shepherding and monitoring the reconstruction.

I, too, have noticed some issues with the /work disks. Transfers to SLAC have been timing out very often in the past few weeks, and I have been noticing the same latency when doing a simple "ls."

Norman

From: [log in to unmask] <[log in to unmask]> on behalf of Rafayel Paremuzyan <[log in to unmask]>
Sent: Thursday, February 7, 2019 7:28 PM
To: Nathan Baltzell
Cc: hps-software
Subject: Re: The new cooking

Hi Nathan,

These are the last jobs of the 10% pass.

Usually yes I am keeping the queue, I am not waiting all the jobs to be finished before submitting the next chunk.

Another thing that is starting to annoy more and more, is that ifarms are becoming practically unusable, most of the time

they are overloaded, and a simple "ls" command takes forever. This happens to "/work" disk too.

This behavior impacts on tarring, i.e. farms put a lot of outputs to work disk, but ifarms are "Too slow" yo catch up with

farm jobs.

One thing I was thinking, will be good if hps would have a separate machine designed for tarring files and sending to tape,

or even would be much better if that machine would have about 20T free space, that we will be separate from the work disk.

I don't know how reasonable is this...

Rafo

From: Nathan Baltzell
Sent: Thursday, February 7, 2019 9:47:27 PM
To: Rafayel Paremuzyan
Cc: HPS-SOFTWARE
Subject: Re: The new cooking

Hi Rafo,

Based on scicomp.jlab.org website, I see ~800 hps jobs in SLURM currently running (great, almost 3x more than hps's previous *average* experience on jlab batch farm) but almost none in the queue or in depend state. Wondering if it may be better to keep the queue more saturated? I guess you are staging things in batches (including tarballing to tape)? Ok, I was just taking a look and curious about throughput:)

Jefferson Lab Scientific Computing

scicomp.jlab.org

Toggle navigation Scientific Computing. Getting Started; Support . Report Problems; Staff Members; Suggestion

-Nathan

On Feb 6, 2019, at 23:52, Rafayel Paremuzyan <[log in to unmask]> wrote:

Hi all,

while the new pass BLPass4b is being cooked,

you could look into pass related details in this confluence page.

https://confluence.slac.stanford.edu/display/hpsg/BLPass4b,

Rafo

Use REPLY-ALL to reply to list

To unsubscribe from the HPS-SOFTWARE list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1

Use REPLY-ALL to reply to list

To unsubscribe from the HPS-SOFTWARE list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=HPS-SOFTWARE&A=1