We do feed huge lists of files to HPSS (and do not use individual pftps). Within a single process P1 we observed access patterns along these lines: one file is opened and processed, and while it is being processed, another client P2 requests other files, kicking the tape out of the drive (imagine many clients doing this simultaneously, of course). For the next file N+1 from P1, the same tape is mounted again, and so on, leading to several tens of mount/dismount cycles over the work unit of each Pi, which could be avoided. Pre-staging would allow larger bulk requests, tape sorting, etc. Worth a try.

Thank you,

Andrew Hanushevsky wrote:
> Hi Pavel,
>
> Yes, look at XrdClientAdmin.hh. The XrdClientAdmin interface has a
> Prepare() method that allows you to pass a list of files. This list is
> passed to xrootd, which then locates the files. If you also specify
> kXR_stage, then the files will be staged if they are not on disk. There
> are additional options, but for now this should be enough to get you going.
>
> That said, what we have found is that unless you have taken great care to
> group files that are likely to be used together on the same tape (something
> we find difficult to do), the probability of improving tape access is
> rather low unless you get a huge list of files into HPSS. That in itself
> causes other problems.
>
> Andy
>
> On Wed, 25 Oct 2006, Pavel Jakl wrote:
>
>> Hi Andy and others,
>>
>> I would like to discuss the possibility of "preparing" a list of files
>> before a user's jobs are processed, or at the beginning of the job. The
>> reason I am raising this topic is optimization of access to our tape
>> system (HPSS).
>>
>> What we are doing now is organizing the requests so that we have as many
>> file requests as possible on the same tape at HPSS, and therefore achieve
>> better IO performance. Of course, one can imagine that with a bigger
>> list we can sort better and have more file requests on the same tape.
>>
>> Our framework makes it possible for a user's lists within one job to be
>> already sorted in some fashion, with the hope that those files are
>> presumably on the same tape. The problem is that these lists are
>> processed in sequential order, one by one.
>> So, my goal is to give xrootd a list of files to "prepare" before
>> processing them (or at least have the full list start being prepared
>> when the job starts). Can I somehow "tell" the server that I will use
>> these files from the list in the very near future?
>>
>> I have come up with 2 possible ways to do it:
>>
>> 1) Use AsyncOpen on the client side
>>
>> I am wary of this solution, since it can use a lot of resources
>> with many simultaneously open connections. I can see jobs with
>> thousands of files. In our case, where we have 400 nodes for job
>> processing, that could mean a very large number of connections to the
>> redirector node.
>>
>> 2) There are some sort of "prepare" methods on the server side. So, how
>> can I call them from the client side?
>>
>> This solution would be better for me if I could somehow pass the list
>> to the server through the client and not need any other assistance or
>> presence of the client for each file's preparation.
>> I know there could be a problem with this: the files can disappear
>> between the prepare time and the actual processing time (a server
>> goes down, purging, etc.). I think the occurrence of this case is
>> usually very rare. However, I haven't made any sophisticated
>> investigation to prove it, since it would be very hard to get these
>> statistics.
>>
>> Thanks for any suggestion or help
>> Cheers
>> Pavel
>>
--
  ,,,,,
 ( o o )
--m---U---m--
 Jerome