Hi,
STFC in the UK are moving from RFIO to XRoot for access to their Castor system. I have a service that interfaces with Castor to deposit experimental data into an archive. The service is driven by a multi-threaded TCPServer written in Python, so I was pleased to find your bindings.
The service receives large numbers of small files and concatenates them into large files before sending them to Castor for storage on tape. On retrieval from tape, we perform a stager_get and then we run multiple retrieval jobs copying the large files back to disk, and we serve files to the client from there. We do this to attempt to avoid long delays waiting for files to be staged from tape, and it works well for us.
I modified the backend to use these bindings for copying the large files to Castor (previously it was shelling out to rfcp), but I found that when I called FileSystem.copy() on a large file it would hang the whole process until the copy finished. I assume this is because the copy() is implemented in C and therefore is not subject to the timeslicing done by the Python interpreter (2.6.6)?
I'm aware of the CopyProcess() functionality you provide, but would that also block the whole process until all the jobs are complete?
If I were to perform a FileSystem.copy() asynchronously with a callback, would that allow other threads of execution to carry on in the meantime?
I could use a loop and File.write() but since we're not concerned with writing or reading portions of the files it would seem preferable to just deal with a put/get style of operation.
I have sorted this for the time being by just shelling out to xrdcp for the copy to Castor, but I would really like to use these bindings. Is there any strategy I can adopt that would circumvent the locking I think I see?
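
For reference, a minimal sketch of the shape of that workaround, not the service's actual code. It assumes xrdcp is on the PATH; the helper names (build_xrdcp_command, copy_in_background), the on_done callback and the runner parameter are made up for illustration. Because the worker thread only waits on a child process, the other threads keep running during the copy.

```python
import subprocess
import threading


def build_xrdcp_command(source, destination):
    """Return the argument list for a plain xrdcp copy (no extra flags)."""
    return ["xrdcp", source, destination]


def copy_in_background(source, destination, on_done, runner=subprocess.call):
    """Run xrdcp in a worker thread and pass its exit code to on_done.

    `runner` defaults to subprocess.call; it is a parameter only so the
    sketch can be exercised without a real xrdcp binary.
    """
    def worker():
        # subprocess.call blocks only this thread. While it waits on the
        # child process the interpreter lock is released, so other
        # threads in the server carry on.
        exit_code = runner(build_xrdcp_command(source, destination))
        on_done(source, exit_code)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

A caller would pass a local path, a root:// URL and a callback that records the exit code, then carry on serving other requests while the copy runs.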

Thanks in advance,

Roger Downing

