Hi Alessandra,

The site is CC. They didn't seem to want to mount the CVMFS repository, but maybe we could convince them to. I can download the file explicitly instead when required. Sorry, I hadn't realised that this would put such a load on the system.

Thanks,
James

On 07/06/2019 15:16, Alessandra Forti wrote:
> Hi James,
>
> Is there a reason why they can't mount it? Is it LAPP or CC?
>
> I would recommend that you don't use the software as an input, but
> download it explicitly from the job if you cannot find it in CVMFS.
> And/or the tarball should be copied to the French site storage closest
> to their nodes.
>
> The tarball on our storage was being accessed by 1500 processes
> concurrently on the same machine earlier today, and I have already had
> to replicate the file three times to try to spread the load. I'm
> surprised you didn't have timeouts.
>
> cheers
> alessandra
>
> On 07/06/2019 14:59, PERRY James wrote:
>> Hi Alessandra,
>>
>> We are mostly using CVMFS, but one of the compute nodes in France
>> doesn't mount our CVMFS repository, so we need the tarball for that one.
>> Unfortunately, because I can't predict when I submit a job whether it
>> will go to that node or not, all the jobs have the tarball listed as an
>> input file. I tried uploading copies to other storage elements as well
>> when I first put it on the grid, but at the time only Manchester was
>> working for me. I'm happy to discuss other solutions to this if it's
>> causing problems.
>>
>> Thanks,
>> James
>>
>>
>> On 07/06/2019 14:52, Alessandra Forti wrote:
>>> Hi James,
>>>
>>> Can you let me know how you do software distribution? It seems you
>>> have a single tarball on the Manchester storage that is creating a
>>> large number of connections.
>>>
>>> They might be among the causes of the current load we are experiencing.
>>> Manchester isn't running anything at the moment, so either those are
>>> connections that were not closed properly (could be), or the tarball
>>> you have on the Manchester storage is the only copy being accessed by
>>> worker nodes at other sites in the UK.
>>>
>>> We always said that while the software was in development and LSST ran
>>> at a smaller scale the storage was fine, but it wouldn't work if too
>>> many jobs tried to access the same file on one storage element. Have
>>> you thought about using CVMFS, or at the very least replicating the
>>> tarball at other sites?
>>>
>>> thanks
>>>
>>> cheers
>>> alessandra
>>>
>> --
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> James Perry              Room 2.41, Bayes Centre
>> Software Architect       The University of Edinburgh
>> EPCC                     47 Potterrow
>> Tel: +44 131 650 5173    Edinburgh, EH8 9BT
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
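[Editor's note: a minimal sketch of the fallback Alessandra suggests above — prefer the local CVMFS copy and only download explicitly from remote storage when it is missing. The paths, URL, and function name are hypothetical placeholders, not the real LSST setup.]

```python
import os
import shutil
import urllib.request

def fetch_software(cvmfs_path, fallback_url, dest):
    """Prefer the CVMFS copy of the software tarball; fall back to an
    explicit one-off download only when CVMFS is not mounted on this node.
    All arguments are hypothetical placeholders for illustration."""
    if os.path.exists(cvmfs_path):
        # CVMFS copy available: local read, no load on the storage element.
        shutil.copy(cvmfs_path, dest)
    else:
        # No CVMFS mount: download explicitly from a replica instead of
        # listing the tarball as an input file on every job.
        urllib.request.urlretrieve(fallback_url, dest)
    return dest

# Example placeholders (hypothetical):
# fetch_software("/cvmfs/example.org/sw/pipeline.tar.gz",
#                "https://storage.example.org/replicas/pipeline.tar.gz",
#                "pipeline.tar.gz")
```

Because the CVMFS path is checked first, only the (presumably few) jobs that land on nodes without the mount would touch the storage element, which is the behaviour recommended in the thread.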