Hi Andy and others,
I would like to discuss the possibility of "preparing" a list of files
before a user's job is processed, or at the beginning of the job. The
reason I am raising this topic is to optimize access to our tape
system (HPSS).
What we do now is organize the requests so that as many file requests
as possible go to the same HPSS tape, and therefore achieve better IO
performance. Of course, one can imagine that with a bigger list we could
sort better and have more file requests on the same tape.
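To illustrate the kind of grouping I mean, here is a minimal sketch. It
assumes a hypothetical lookup table tape_of that maps a file path to its
tape ID; how that mapping is obtained from HPSS is not shown, and the
names are only for illustration.

```python
from collections import defaultdict


def group_by_tape(requests, tape_of):
    """Group requested file paths by tape ID.

    tape_of is a hypothetical dict {path: tape_id}; it stands in for
    whatever lookup the site uses to locate a file on tape.
    """
    groups = defaultdict(list)
    for path in requests:
        groups[tape_of[path]].append(path)
    # Put the tapes with the most pending requests first, so that each
    # tape mount serves as many requests as possible. Ties are broken
    # by tape ID to keep the order deterministic.
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
```

The longer the request list handed to such a function, the more files
tend to end up in each per-tape group.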
Our framework makes it possible for the user's file lists within one
job to be already sorted in some fashion, in the hope that those files
are on the same tape. The problem is that these lists are processed
sequentially, one file at a time.
So, my aim is to give xrootd a list of files to be "prepared" before
processing them (or at least to have the full list start being prepared
when the job starts). Can I somehow "publish" to the server that I will
use the files from this list in the very near future?
So, I have figured out 2 possible ways to do it:
1) Use AsyncOpen on the client side
I am wary of this solution, since it can use a lot of resources
with many simultaneously open connections. I can see jobs with
thousands of files. In our case, with 400 nodes processing jobs,
it could mean a very large number of connections to the redirector node.
2) There is some sort of "prepare" method on the server side. So, how
can I call it from the client side?
This solution would be better for me if I could somehow pass the list
to the server through the client, without needing any further assistance
or presence of the client while each file is prepared.
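Roughly, what I have in mind on the client side looks like the sketch
below. It is not xrootd code: send_prepare is a hypothetical callable
standing in for whatever client call would hand a batch of paths to the
server, and batch_size is an arbitrary value chosen for illustration.

```python
def submit_prepare(paths, send_prepare, batch_size=100):
    """Hand a job's file list to the server in batches and return.

    send_prepare is a hypothetical callable (not an xrootd API) that
    takes a list of paths and returns without waiting for staging.
    """
    # Drop duplicate paths while keeping the job's original ordering.
    unique = list(dict.fromkeys(paths))
    # Split into fixed-size batches so that no single request is huge.
    batches = [unique[i:i + batch_size]
               for i in range(0, len(unique), batch_size)]
    for batch in batches:
        send_prepare(batch)
    # The client does not wait for any file here; it just reports how
    # many requests were sent.
    return len(batches)
```

The point is that the job sends a few requests up front and then opens
the files one by one as usual, instead of holding thousands of
connections open.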
I know there could be a problem with this. The files can disappear
between the time they are prepared and the time they are actually
requested for processing (server went down, purging, etc.). I think
this case occurs very rarely. However, I have not done any thorough
investigation to prove it, since these statistics would be very hard
to get.
Thanks for any suggestion or help
Cheers
Pavel