Hi Matevz,

I might have something like a "minimal failing example". Unfortunately
the problem only appears when authentication is required, so the example
will only work on a machine that has a valid host certificate and the
corresponding directory has to be bind-mounted into the container.

I uploaded my container image here:

https://cloud.physik.lmu.de/index.php/s/RFC6Q89FBxxNMXF

I also made a directory structure (tar archive attached) to bind-mount
into the container. It contains the minimal failing xcache config and a
script for starting gdb inside the container.

To reproduce, extract the archive, enter the directory and run (as a
non-root user):

singularity run -B $(pwd)/data:/data -B $(pwd)/config:/etc/xrootd:ro \
    -B <hostkey-dir>:/etc/grid-security:ro <singularity-image>

where <hostkey-dir> is a directory that contains

hostkey.pem
hostcert.pem
vomsdir (will become X509_VOMS_DIR)
certificates (will become X509_CERT_DIR)

and <singularity-image> is the path to the singularity image.
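
Since the run depends on all four entries being present in <hostkey-dir>, a quick check before bind-mounting may save a failed start. The following is only a sketch: the function name check_hostkey_dir is made up for illustration and is not part of the attached archive.

```shell
#!/bin/sh
# Hypothetical helper (not part of the attached archive): verify that a
# host-key directory contains the four entries listed above.
check_hostkey_dir() {
    dir=$1
    missing=0
    # The key and certificate are expected to be regular files.
    for f in hostkey.pem hostcert.pem; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $dir/$f" >&2
            missing=1
        fi
    done
    # vomsdir and certificates are expected to be directories.
    for d in vomsdir certificates; do
        if [ ! -d "$dir/$d" ]; then
            echo "missing directory: $dir/$d" >&2
            missing=1
        fi
    done
    # Returns 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

It could be called as `check_hostkey_dir <hostkey-dir>` right before the singularity command, which is then only run if the check succeeds.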

That should run xrootd and the log should appear in
data/xrd/var/log/xrootd.log

I used this example to produce the failure:

xrdcp -f \
    root://lcg-lrz-xcache0.grid.lrz.de:1094//root://eospublic.cern.ch//eos/opendata/lhcb/AntimatterMatters2017/data/PhaseSpaceSimulation.root \
    /dev/null
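
The source URL in that command is the cache endpoint followed by "//" and the full origin URL. A small sketch of how it is put together, in case someone wants to test against a different cache host (the function name xcache_url is invented for illustration):

```shell
#!/bin/sh
# Hypothetical helper: build the URL that reads an origin file through
# the cache, i.e. root://<cache-host:port>//<origin-url>.
xcache_url() {
    cache_host=$1   # e.g. lcg-lrz-xcache0.grid.lrz.de:1094
    origin_url=$2   # e.g. root://eospublic.cern.ch//eos/...
    printf 'root://%s//%s\n' "$cache_host" "$origin_url"
}
```

For example, `xrdcp -f "$(xcache_url <cache-host>:1094 <origin-url>)" /dev/null` should be equivalent to the command above once the two placeholders are filled in.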

The simplest way to run gdb seemed to be to start xrootd directly under
gdb. This can be done with the script run_xcache_debug.sh in the attached
archive. Instead of the command above, just use

singularity exec -B $(pwd)/data:/data -B $(pwd)/config:/etc/xrootd:ro \
    -B <hostkey-dir>:/etc/grid-security:ro <singularity-image> \
    ./run_xcache_debug.sh

Note: Before restarting, it is best to delete the contents of the data
directory, since the bug did not seem to occur when the file was already
cached (e.g. after testing without authentication).
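
Emptying the data directory between runs could look roughly like the sketch below. The function name clear_data_dir is made up, and it assumes a find that supports -mindepth and -delete (GNU and BSD find both do; plain POSIX find does not).

```shell
#!/bin/sh
# Hypothetical helper: remove everything inside the bind-mounted data
# directory while keeping the directory itself.
clear_data_dir() {
    dir=$1
    # Refuse obviously dangerous arguments.
    case "$dir" in
        ''|/) echo "refusing to clear '$dir'" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # -mindepth 1 skips the directory itself; -delete removes children first.
    find "$dir" -mindepth 1 -delete
}
```

From the extracted archive directory this would be `clear_data_dir "$(pwd)/data"`. Note that it also removes the old xrootd.log under data/, so copy the log elsewhere first if it is still needed.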

Sorry for the overly complicated reproduction steps, but since it only
happened when authentication was enabled I didn't know how to make it
simpler. I hope it helps.

Thanks,
Nikolai

On 7/7/20 8:42 PM, Matevz Tadel wrote:
> Thanks Nikolai, I shall continue my investigation :)
> 
> Matevz
> 
> On 2020-07-06 23:59, Nikolai Hartmann wrote:
>> Hi Matevz,
>>
>> Thanks a lot for looking into this.
>>
>> - The crash seems to happen every time I make a request
>> - Currently prefetching is disabled
>> - Yes, I think it is direct proxy mode
>> - Stack trace is attached
>>
>> A similar setup seems to work without issues for Ilija with the xcaches
>> using slate - I tried to mimic that setup closely. I am running xrootd
>> from this container image:
>>
>> https://gitlab.physik.uni-muenchen.de/Nikolai.Hartmann/xcache-singularity-lrz/-/blob/51d2da52829eb6d8ea377539884f337208141aca/xcache.singularity.def
>>
>>
>> using this config
>>
>> https://gitlab.physik.uni-muenchen.de/Nikolai.Hartmann/xcache-singularity-lrz/-/blob/51d2da52829eb6d8ea377539884f337208141aca/etc/xrootd/xcache.cfg
>>
>>
>> Cheers,
>> Nikolai
>>
>> On 7/7/20 1:38 AM, Matevz Tadel wrote:
>>> Hi Nikolai,
>>>
>>> I tried to reproduce it with current master in nearly all ways,
>>> with/without prefetching and with direct/forwarding mode. Also, with std
>>> malloc and tcmalloc. No luck :(
>>>
>>> Backtrace or core would help a lot at this point.
>>>
>>> Cheers,
>>> Matevz
>>>
>>> On 2020-07-03 00:54, Nikolai Hartmann wrote:
>>>> Hi,
>>>>
>>>> I'm trying to upgrade to xrootd5 rc4 for our xcache server to mitigate
>>>> a problem with dCache.
>>>>
>>>> Now when I try to read a file through xcache it crashes with "Attempt
>>>> to free invalid pointer". I attached the corresponding part of the log.
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>> Nikolai
>>>>
>>>> ########################################################################
>>>>
>>>> Use REPLY-ALL to reply to list
>>>>
>>>> To unsubscribe from the XROOTD-L list, click the following link:
>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1
>>>>
>>>>
>>>>
>>>
> 
