Hi Andy,

I can add a bit more info to this ticket, as we now have a way to fully reproduce this issue. Note that all the investigation in this case happened while still using XRootD 4.12.8, but I think not much has changed in XRootD 5 in this respect. First of all, a bit of context: in EOS we have two types of XRootD daemons, the MGM, which holds the metadata, and the FSTs, which are the actual disk servers. For communication inside the cluster between the MGM and the FSTs we use XrdCl with sss authentication.

An example of what the sss configuration looks like on the MGM:

grep sss /etc/xrd.cf.mgm
sec.protocol  sss -c /etc/eos.keytab -s /etc/eos-archive.keytab
sec.protbind  * only krb5 gsi sss unix

So we have two keytab files: one that the server uses, which contains the list of all accepted sss keys, and one that is sent as a hint to the clients connecting to the MGM. For completeness, here are the contents of these files:

$ xrdsssadmin list /etc/eos-archive.keytab
     Number Len Date/Time Created Expires  Keyname User & Group
     ------ --- --------- ------- -------- -------
          2  32 09/17/14 19:25:01 -------- archive eosarchi c3
          1  32 10/20/20 22:01:59 -------- eoshomecanary daemon daemon
          1  32 10/26/19 13:17:24 -------- eosnobody eosnobody def-cg
$ xrdsssadmin list /etc/eos.keytab
     Number Len Date/Time Created Expires  Keyname User & Group
     ------ --- --------- ------- -------- -------
          1  32 10/20/20 22:01:59 -------- eoshomecanary daemon daemon 

The key named eoshomecanary is the one to be used internally for all communication. The configuration on the FST side looks very similar, except that its keytab contains only this cluster sss key.

grep sss /etc/xrd.cf.fst
sec.protocol  sss -c /etc/eos.keytab -s /etc/eos.keytab
sec.protbind  * only unix sss

The contents of the keytab file on the FST are:

xrdsssadmin list /etc/eos.keytab
     Number Len Date/Time Created Expires  Keyname User & Group
     ------ --- --------- ------- -------- -------
          1  32 10/20/20 22:01:59 -------- eoshomecanary daemon daemon
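
When comparing the MGM and FST configurations, it can help to pull the two keytab paths out of the sec.protocol line. The following is only a rough sketch: the function name sss_keytabs is made up for illustration, and it assumes that -c names the keytab advertised to clients and -s the keytab used by the server, as described above.

```shell
# Hypothetical helper: read xrootd config lines on stdin and print the
# keytab paths found on the "sec.protocol sss" line.
# Assumption: -c is the client hint keytab, -s is the server keytab.
sss_keytabs() {
    awk '$1 == "sec.protocol" && $2 == "sss" {
        c = ""; s = ""
        # Each option is followed by its path, so stop one field early.
        for (i = 3; i < NF; i++) {
            if ($i == "-c") c = $(i + 1)
            if ($i == "-s") s = $(i + 1)
        }
        print "client=" c " server=" s
    }'
}

# Example (not executed here):
#   sss_keytabs < /etc/xrd.cf.mgm
```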

The issue we experienced manifests itself when the eoshomecanary sss key is not the last one in the sss "server" keytab file used by the MGM. In that situation, when we use the XRootD client from inside the MGM process to connect to the FST and explicitly request sss authentication, the XRootD client uses the last key in the server keytab file. Normally, all our instances are configured with the cluster sss key in the last position, but we were bitten by this when a new sss key was appended for this particular instance.
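
As a stopgap, a check along the following lines could flag a keytab in which the cluster key is no longer in the last position. This is a sketch, not something we run in production: the function names are invented for illustration, and it assumes that xrdsssadmin list prints the keys in file order and that the "Expires" column is always a single token, so that the key name is the sixth field.

```shell
# Hypothetical helpers; both read `xrdsssadmin list` output on stdin.

# Print the key name of the last key entry in the listing.
# Key rows are recognised by numeric "Number" and "Len" columns.
last_sss_keyname() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { name = $6 } END { print name }'
}

# Succeed only if the last key in the listing has the name given as $1.
cluster_key_is_last() {
    [ "$(last_sss_keyname)" = "$1" ]
}

# Example (not executed here):
#   xrdsssadmin list /etc/eos-archive.keytab | cluster_key_is_last eoshomecanary \
#       || echo "cluster key is not the last entry" >&2
```

With the /etc/eos-archive.keytab listing shown earlier, the last entry is eosnobody, so the check for eoshomecanary would fail.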

Is this behavior expected? Is it documented anywhere? Why does the XRootD client used from inside the MGM process not follow the hint that the FST process sends it about which sss keytab it should use? Note that the FST will send the hint to use /etc/eos.keytab.

Below are some commands that illustrate this issue, based on the contents of the keytab files pasted above. These commands are executed from the MGM and target one FST.

[root@eoshomecanary-deneb (mgm:master mq:master) ~]$ XrdSecPROTOCOL=sss XrdSecSSSKT=/etc/eos-archive.keytab xrdfs root:[log in to unmask]:1110 query opaquefile /?fst.pcmd=fsck
[FATAL] Auth failed
[root@eoshomecanary-deneb (mgm:master mq:master) ~]$ XrdSecPROTOCOL=sss XrdSecSSSKT=/etc/eos-archive.keytab xrdfs root://p05151113741284.cern.ch:1110 query opaquefile /?fst.pcmd=fsck
[FATAL] Auth failed

At some point we also expected that specifying the username when constructing the URL (as in the first case above) would "guide" the sss authentication to select the sss key matching the username provided. This is clearly not the case. If we use the /etc/eos.keytab file, which contains only the cluster sss key, things work as expected:

XrdSecPROTOCOL=sss XrdSecSSSKT=/etc/eos.keytab xrdfs root://p05151113741284.cern.ch:1110 query opaquefile /?fst.pcmd=fsck
blockxs_err@114
d_cx_diff@114
d_mem_sz_diff@114
m_cx_diff@114

So I am not really sure whether this is actually a bug or just undocumented behavior, and we would like to clarify this first.

Thanks,
Elvin


