Hey,


It appears that installing the xrootd packages from a different repository fixes the problem. After we switched from installing from the EPEL repository to installing from the OSG repository, everything works without issue.


The versions for the relevant packages were:

xrootd-client-libs-5.5.1-1.el7.x86_64
xrootd-libs-5.5.1-1.el7.x86_64
xrootd-server-libs-5.5.1-1.el7.x86_64
xrootd-server-5.5.1-1.el7.x86_64
xrootd-selinux-5.5.1-1.el7.noarch
xrootd-5.5.1-1.el7.x86_64
xrootd-voms-5.5.1-1.el7.x86_64


And are now:

xrootd-client-libs-5.5.1-1.4.osg36.el7.x86_64
xrootd-libs-5.5.1-1.4.osg36.el7.x86_64
xrootd-server-libs-5.5.1-1.4.osg36.el7.x86_64
xrootd-server-5.5.1-1.4.osg36.el7.x86_64
xrootd-selinux-5.5.1-1.4.osg36.el7.noarch
xrootd-5.5.1-1.4.osg36.el7.x86_64
xrootd-voms-5.5.1-1.4.osg36.el7.x86_64
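
Both lists carry the same upstream version; only the RPM release field differs. A minimal Python sketch that makes this explicit, assuming the usual name-version-release.arch layout of RPM package strings (split_nvra is a hypothetical helper written for this illustration, not part of any RPM tooling):

```python
def split_nvra(package: str) -> dict:
    """Split an RPM 'name-version-release.arch' string into its parts."""
    # The architecture is everything after the last dot.
    rest, arch = package.rsplit(".", 1)
    # Version and release are the last two dash-separated fields;
    # the name itself may contain dashes (e.g. xrootd-server-libs).
    name, version, release = rest.rsplit("-", 2)
    return {"name": name, "version": version, "release": release, "arch": arch}


before = split_nvra("xrootd-server-libs-5.5.1-1.el7.x86_64")
after = split_nvra("xrootd-server-libs-5.5.1-1.4.osg36.el7.x86_64")

# Collect the fields that differ between the EPEL and the OSG package.
changed = [key for key in before if before[key] != after[key]]
print(changed)
```

For the pair shown, only the release field is reported as changed, so the difference in behaviour comes from the OSG build rather than from a different xrootd version.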


________________________________
From: [log in to unmask] <[log in to unmask]> on behalf of Marcus Ebert <[log in to unmask]>
Sent: Wednesday, January 25, 2023 10:31:16 AM
To: [log in to unmask]
Subject: TPC: push as active site works, pull does not

Hi all,

This is a follow-up to my email from Jan 23rd. I just realized that its subject line was deleted before sending, which makes it impossible to view in the web archive:

The issue we see is that on new installations of plain xrootd, HTTP TPC does not fully work. There are no issues when the new machines are the passive site. When the newly set up machines are the active site, however, a pull from any other site fails, while a push to any other site works.

After some investigation we found:
- repopulating the whole /etc/grid-security/certificates/ directory does not help
- on a fresh install where fetch-crl has not been run yet, both pull and push as active site fail
- after running fetch-crl, push as active site works, but pull still does not

In these failures the destination files are created, but with a length of 0 bytes. The xrootd log shows an error like the following (Bearer token truncated for readability):
230124 17:14:16 228752 http_Protocol:  rc:616 got hdr line: TransferHeaderAuthorization: Bearer MD...
230124 17:14:16 228752 http_Protocol:  rc:18 got hdr line: Credential: none
230124 17:14:16 228752 http_Protocol:  rc:503 got hdr line: Authorization: Bearer MD...
230124 17:14:16 228752 http_Protocol:  rc:36 got hdr line: RequireChecksumVerification: false
230124 17:14:16 228752 http_Protocol:  rc:2 got hdr line:
230124 17:14:16 228752 http_Protocol:  rc:2 detected header end.
230124 17:14:16 228752 anon.0:[log in to unmask] http_Req: Appended header fields to opaque info: 'authz=Bearer MD...'
230124 17:14:16 228752 TPC_: event=SIZE_FAIL, local=, remote=, user=(anonymous), status=500; HTTP library failed: Peer certificate cannot be authenticated with given CA certificates
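
To pull such failures out of a longer log, something like the following Python sketch could be used. It assumes the TPC_ lines always have the "event=..., ..., status=NNN; message" shape seen above, which I have only checked against the lines in this email; parse_tpc_line is a hypothetical helper, not an xrootd tool.

```python
import re

# Pattern modelled on the TPC_ lines quoted in this email (an assumption,
# not a documented log format).
TPC_LINE = re.compile(
    r"TPC_: event=(?P<event>\w+), .*?status=(?P<status>\d+); (?P<message>.*)$"
)


def parse_tpc_line(line: str):
    """Return event, HTTP status and message of a TPC_ log line, or None."""
    match = TPC_LINE.search(line)
    if match is None:
        return None
    return {
        "event": match["event"],
        "status": int(match["status"]),
        "message": match["message"],
    }


sample = (
    "230124 17:14:16 228752 TPC_: event=SIZE_FAIL, local=, remote=, "
    "user=(anonymous), status=500; HTTP library failed: Peer certificate "
    "cannot be authenticated with given CA certificates"
)
result = parse_tpc_line(sample)
print(result)
```

Lines that are not TPC_ events, such as the http_Protocol header lines, return None and can be skipped.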

If there were a general problem with a site's certificate, I would expect the same failure to show up at other sites too, and I would expect it to affect a push as well. A push, however, works after running fetch-crl.
Also, gfal-copy, which users run for the transfers, does not report any error; it just shows "DONE".

What we have found since then is that xrootd also looks under /etc/pki/. When we add all pem files from /etc/grid-security/certificates/ to /etc/pki/ca-trust/source/anchors/ and run update-ca-trust on the machine to generate a new bundle under /etc/pki/, the error in xrootd changes to:
230125 08:56:41 231349 http_Protocol:  rc:616 got hdr line: TransferHeaderAuthorization: Bearer MD...
230125 08:56:41 231349 http_Protocol:  rc:18 got hdr line: Credential: none
230125 08:56:41 231349 http_Protocol:  rc:503 got hdr line: Authorization: Bearer MD...
230125 08:56:41 231349 http_Protocol:  rc:36 got hdr line: RequireChecksumVerification: false
230125 08:56:41 231349 http_Protocol:  rc:2 got hdr line:
230125 08:56:41 231349 http_Protocol:  rc:2 detected header end.
230125 08:56:41 231349 anon.0:[log in to unmask] http_Req: Appended header fields to opaque info: 'authz=Bearer MD...'
230125 08:56:42 231349 TPC_: event=SIZE_FAIL, local=, remote=, user=(anonymous), status=500; Remote side failed with status code 401
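
For anyone who wants to reproduce the CA bundle step that led to this second error, here is a rough Python sketch of what we did by hand. The function name and its parameters are made up for this illustration; the default paths and the update-ca-trust command are the ones mentioned above, and it would need to run as root on the real directories.

```python
import shutil
import subprocess
from pathlib import Path


def add_grid_cas_to_system_trust(
    cert_dir="/etc/grid-security/certificates",
    anchor_dir="/etc/pki/ca-trust/source/anchors",
    refresh_cmd=("update-ca-trust",),
):
    """Copy all *.pem files into the system anchors and refresh the bundle.

    Pass refresh_cmd=None to skip the refresh, e.g. for a dry run on
    scratch directories.
    """
    copied = []
    for pem in sorted(Path(cert_dir).glob("*.pem")):
        # Only the pem files are copied; CRLs and other files stay behind.
        shutil.copy2(pem, Path(anchor_dir) / pem.name)
        copied.append(pem.name)
    if refresh_cmd:
        # Regenerates the consolidated bundle under /etc/pki/.
        subprocess.run(list(refresh_cmd), check=True)
    return copied
```

Calling it with no arguments mirrors the manual steps; pointing cert_dir and anchor_dir at temporary directories with refresh_cmd=None allows trying it without touching the system trust store.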

I am puzzled as to why only the pull fails while a push works when the machine is the active party in the TPC.

xrootd config is:
=============

all.export /
oss.localroot /rdc-test/belle

ofs.trace all
http.trace all
sec.trace all

xrootd.chksum adler32

xrd.tls /etc/grid-security/xrd/xrdcert.pem /etc/grid-security/xrd/xrdkey.pem
xrd.tlsca certdir /etc/grid-security/certificates
xrootd.tls capable all

xrootd.seclib libXrdSec.so
sec.protocol gsi -cert:/etc/grid-security/xrd/xrdcert.pem \
                   -certdir:/etc/grid-security/certificates \
                   -dlgpxy:request \
                   -vomsat:extract \
                   -d:3 -vomsfunparms:dbg \
                   -vomsfun:default

http.secxtractor /usr/lib64/libXrdHttpVOMS-5.so
xrd.protocol http:8080 libXrdHttp.so

ofs.authorize
acc.audit deny grant
acc.authdb /etc/xrootd/Authfile

ofs.tpc logok autorm pgm /etc/xrootd/tpc.sh
http.cadir /etc/grid-security/certificates/
http.cert /etc/grid-security/xrd/xrdcert.pem
http.key /etc/grid-security/xrd/xrdkey.pem

http.exthandler xrdtpc libXrdHttpTPC.so
http.header2cgi Authorization authz
http.exthandler xrdmacaroons libXrdMacaroons.so
macaroons.secretkey /etc/xrootd/macaroon-secret
all.sitename test_site
ofs.authlib libXrdMacaroons.so


Cheers,
 Marcus



________________________________

Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
