Thank you, Justas, for the logs. This was a somewhat tortuous journey with many misunderstandings on my part. Now that I've seen the logs, dissected the multi-user plugin, and understood the non-obvious interactions between that plugin and the rest of the plugins, I understand the issue. It is not particularly easy to fix, because the two most relevant plugins, multi-user and the OFS plugin, make several conflicting assumptions on two major levels. A lot has to do with the positioning of the multi-user plugin; the problem would be somewhat easier to solve if it fronted the oss plugin as opposed to the sfs plugin. That said, my conversation with Derek indicated that it would be too disruptive to change the architecture at this point, so we won't take that route.

Problem #1: As indicated by Brian (I missed the fine point at the time), the failure occurs because of the way checksums are computed in the background. Once we establish that a checksum is missing, a background job is run to compute it. However, since the server has already verified the permissions to do this, it runs the computation without involving authorization: authorization is not needed at that point and could fail if attempted in the background without the multi-user plugin. This is done by not passing the client information to the checksum call. Of course, this conflicts with the multi-user plugin, which does require client information. The server really has no way of knowing that, so the solution here will need some changes in the checksum launch code as well as in the multi-user plugin to keep compatibility.
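
To make the conflict concrete, here is a minimal sketch. It is not the actual XRootD or multi-user code; `ClientInfo`, `authorize`, `multiUserChecksum`, and the `alreadyAuthorized` flag are all hypothetical names invented for illustration. It only models the idea that the background job passes no identity, while a multi-user style wrapper needs one, and that one possible way out is to pass the identity along with a marker saying authorization was already done.

```cpp
#include <string>

// Hypothetical stand-in for the client information the server hands to plugins.
struct ClientInfo {
    std::string user;        // user name the multi-user layer would switch to
    bool alreadyAuthorized;  // assumption: set when the foreground request passed authorization
};

enum class CksResult { Ok, NoClientInfo, Denied };

// Hypothetical authorization check; in this toy model only "alice" is allowed.
bool authorize(const std::string& user) { return user == "alice"; }

// Models a multi-user style wrapper around the checksum call.
// A null pointer corresponds to the background job that passes no client information.
CksResult multiUserChecksum(const ClientInfo* client) {
    if (client == nullptr)
        return CksResult::NoClientInfo;  // wrapper cannot choose a uid/gid without an identity
    if (!client->alreadyAuthorized && !authorize(client->user))
        return CksResult::Denied;        // authorization applied only when not already done
    // ... here the wrapper would switch to the user's ids and compute the checksum ...
    return CksResult::Ok;
}
```

In this model, calling `multiUserChecksum(nullptr)` reproduces the failure mode, while passing an identity with `alreadyAuthorized` set gets the correct user without re-running authorization.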

Problem #2: The file /storage/cms/store/temp/jbalcas/test-xrootd14 was opened using the POSC option. Since the uid/gid was set prior to calling the underlying plugin, the control files created to track the file (so it could be deleted) were written using the set uid/gid, not the xroot uid/gid. When the file was finally closed using the xroot user, we got a failure:
Unable to fchmod /storage/cms/store/temp/jbalcas/test-xrootd14; operation not permitted

This likely means that the control files were not cleaned up. It also means that the daemon will not be able to restart should the control files indicate that the file still exists because it will not be able to delete the file. This is inherent to the multi-user plugin positioning and there really is no way to easily get around the problem. The short answer is that the multi-user plugin has to indicate that it does not support POSC because it really does not.
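
The ownership mismatch can be illustrated with a small model. This is a sketch under assumptions, not the plugin's code: `TrackedFile`, `modelFchmod`, `kOpenPosc`, and `multiUserOpen` are hypothetical names, and uid 0 stands in for a privileged process. The only real-world rule it relies on is the POSIX one that changing a file's mode requires the caller's effective uid to match the file's owner or the caller to be privileged.

```cpp
#include <cerrno>

using Uid = unsigned int;

// Hypothetical record of a file created while a client's uid was in effect.
struct TrackedFile {
    Uid owner;  // uid in effect when the file was created
};

// Models the POSIX permission rule for chmod/fchmod:
// the caller must own the file or be privileged (modeled here as uid 0).
int modelFchmod(Uid callerUid, const TrackedFile& file) {
    if (callerUid == 0 || callerUid == file.owner) return 0;
    return EPERM;  // "operation not permitted"
}

// Hypothetical open flag standing in for the POSC option.
const int kOpenPosc = 0x1;

// Models a multi-user wrapper that declares POSC unsupported up front
// instead of failing later at close time.
int multiUserOpen(int flags) {
    if (flags & kOpenPosc) return ENOTSUP;
    return 0;
}
```

With a file owned by, say, uid 1001, `modelFchmod(500, file)` returns `EPERM`, which mirrors the close-time failure; rejecting the POSC flag at open avoids ever getting into that state.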

As I said, I can fix both problems with a specific patch for multi-user, but I would need to change the multi-user plugin as well; there is no other way to do this the way it is written now. I will generate the appropriate pull request for the multi-user plugin to:
a) get the correct username for background checksums, yet not apply authorization at that point, and
b) indicate that POSC is not supported via the multi-user plugin (this will also turn off checkpointing).

Note that these fixes will apply to R5.1.0 as there is no easy way to backport them to the 5.x series. OK?

