Hi Petr,

I have yet to look at the logs to see where the corruption occurs. I 
assume the last email contains an instance of the problem with the tracing 
on. Once I see that I will have additional suggestions. Anyway, could you 
tell me what the crash was here, SEGV?

Andy

On Wed, 11 Sep 2019, Petr Vokac wrote:

> I found that the xrootd daemon occasionally crashes (it is automatically
> restarted by systemd), and for the problematic file mentioned below I can
> find a core file that corresponds to this transfer:
>
> (gdb) bt
> #0  XrdXrootdAioReq::sendError (this=0x2069cd8, tident=0x7fbcfc00a547 "condor.114208:47@[2001:718:401:6e03::1:82]") at /usr/src/debug/xrootd-4.9.1/src/XrdXrootd/XrdXrootdAio.cc:652
> #1  0x00007fbd138c020c in XrdXrootdAioReq::endWrite (this=0x2069cd8) at /usr/src/debug/xrootd-4.9.1/src/XrdXrootd/XrdXrootdAio.cc:601
> #2  0x00007fbd1365699f in XrdScheduler::Run (this=0x610e98 <XrdMain::Config+440>) at /usr/src/debug/xrootd-4.9.1/src/Xrd/XrdScheduler.cc:357
> #3  0x00007fbd13656ae9 in XrdStartWorking (carg=<optimized out>) at /usr/src/debug/xrootd-4.9.1/src/Xrd/XrdScheduler.cc:87
> #4  0x00007fbd13619617 in XrdSysThread_Xeq (myargs=0x7fbcfc010d20) at /usr/src/debug/xrootd-4.9.1/src/XrdSys/XrdSysPthread.cc:86
> #5  0x00007fbd131ccdd5 in start_thread (arg=0x7fbd018fa700) at pthread_create.c:307
> #6  0x00007fbd124cf02d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> http://novastore.farm.particle.cz/tmp/xrdcp-20190911/1M.20190911070503.057846787.thread_apply_all_bt
>
> Petr
>
> On 9/11/19 9:35 AM, Petr Vokac wrote:
>> Now I have the following lines on all our DPM diskservers:
>> ofs.trace write
>> xrootd.trace fs
>> and I created a simple script [1] that uses just xrdcp to copy a 1MB file:
>> CURR=$(date +%Y%m%d%H%M%S.%N)
>> export XRD_LOGLEVEL=Dump
>> export XRD_LOGMASK=All
>> export XRD_LOGFILE=1M.${CURR}.dump
>> xrdcp "file:///tmp/1M" "root://golias100.farm.particle.cz//dpm/farm.particle.cz/home/atlas/1M.xrdcp.${CURR}"
>> Even this simple code creates corrupted files on our DPM storage, e.g.
>> the following 1.5GB file:
>> *[root@dpmpool1 ~]# ls -la
>> /mnt/fs1/atlas/2019-09-11/1M.xrdcp.20190911070503.057846787.143302.1568178311*
>> -rw-rw---- 1 dpmmgr dpmmgr 1761607680 Sep 11 07:05 /mnt/fs1/atlas/2019-09-11/1M.xrdcp.20190911070503.057846787.143302.1568178311
>> *[root@dpmpool1 ~]# du --block-size=1
>> /mnt/fs1/atlas/2019-09-11/1M.xrdcp.20190911070503.057846787.143302.1568178311*
>> 1048576	/mnt/fs1/atlas/2019-09-11/1M.xrdcp.20190911070503.057846787.143302.1568178311
>> *[root@dpmpool1 ~]# grep condor.114208:47
>> /var/log/xrootd/dpmdisk/xrootd.log*
>> 190911 07:05:12 31717 XrootdXeq: condor.114208:47@[2001:718:401:6e03::1:82] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=vokac/CN=610071/CN=Petr Vokac
>> 190911 07:05:12 31717 condor.114208:47@[2001:718:401:6e03::1:82] XrootdProtocol: open unmat /dpm/farm.particle.cz/home/atlas/1M.xrdcp.20190911070503.057846787?dpm.chunk0=0,0,dpmpool1.farm.particle.cz:/mnt/fs1/atlas/2019-09-11/1M.xrdcp.20190911070503.057846787.143302.1568178311%3fsfn=%252Fdpm%252Ffarm.particle.cz%252Fhome%252Fatlas%252F1M.xrdcp.20190911070503.057846787&dpm.dhost=dpmpool1.farm.particle.cz&dpm.hv2=NBoSW+W5T7Eti/8yeLbu0w==&dpm.loc=1&dpm.nonce=870b737&dpm.put=1&dpm.time=1568178311&oss.asize=1048576
>> 190911 07:05:12 31717 condor.114208:47@[2001:718:401:6e03::1:82] XrootdProtocol: fh=0 write 1048576@0
>> 190911 07:05:20 23385 condor.114208:47@[2001:718:401:6e03::1:82] XrootdProtocol: close rc=0 fh=0
>> 190911 07:05:20 23385 condor.114208:47@[2001:718:401:6e03::1:82] XrootdFile: closing w /dpm/farm.particle.cz/home/atlas/1M.xrdcp.20190911070503.057846787
>> 190911 07:05:20 23385 XrootdXeq: condor.114208:47@[2001:718:401:6e03::1:82] disc 0:00:09
>> Client-side details are in the attachment. I have more examples [2] with
>> the following wrong sizes:
>> 1M.20190911005425.364456793.bad ... Wed Sep 11 00:54:31 CEST 2019 ERROR BAD SIZE 1124073472
>> 1M.20190911010555.932867187.bad ... Wed Sep 11 01:06:10 CEST 2019 ERROR BAD SIZE 0
>> 1M.20190911012755.924566113.bad ... Wed Sep 11 01:28:08 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911014400.800640006.bad ... Wed Sep 11 01:44:24 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911014622.520944167.bad ... Wed Sep 11 01:46:40 CEST 2019 ERROR BAD SIZE 3741319168
>> 1M.20190911014901.149846614.bad ... Wed Sep 11 01:49:26 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911015203.831921517.bad ... Wed Sep 11 01:52:29 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911020907.396911618.bad ... Wed Sep 11 02:09:32 CEST 2019 ERROR BAD SIZE 3791650816
>> 1M.20190911021159.705868237.bad ... Wed Sep 11 02:12:27 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911021251.426135326.bad ... Wed Sep 11 02:13:13 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911021737.228623622.bad ... Wed Sep 11 02:18:01 CEST 2019 ERROR BAD SIZE 771751936
>> 1M.20190911022813.246485614.bad ... Wed Sep 11 02:28:40 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911025601.833862548.bad ... Wed Sep 11 02:56:20 CEST 2019 ERROR BAD SIZE 1694498816
>> 1M.20190911030159.405097568.bad ... Wed Sep 11 03:02:28 CEST 2019 ERROR BAD SIZE 5016387584
>> 1M.20190911030913.965667783.bad ... Wed Sep 11 03:09:54 CEST 2019 ERROR BAD SIZE 218103808
>> 1M.20190911030955.983267709.bad ... Wed Sep 11 03:10:01 CEST 2019 ERROR BAD SIZE 15946576
>> 1M.20190911031027.126885413.bad ... Wed Sep 11 03:10:55 CEST 2019 ERROR BAD SIZE 1459617792
>> 1M.20190911031535.972442346.bad ... Wed Sep 11 03:16:03 CEST 2019 ERROR BAD SIZE 24848976
>> 1M.20190911032921.385479218.bad ... Wed Sep 11 03:29:37 CEST 2019 ERROR BAD SIZE 352321536
>> 1M.20190911050741.513088940.bad ... Wed Sep 11 05:07:57 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911053936.964249354.bad ... Wed Sep 11 05:39:56 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911055721.511952376.bad ... Wed Sep 11 05:57:39 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911060520.998050566.bad ... Wed Sep 11 06:05:27 CEST 2019 ERROR BAD SIZE 2292134
>> 1M.20190911061254.989202949.bad ... Wed Sep 11 06:13:20 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911062609.645848195.bad ... Wed Sep 11 06:26:21 CEST 2019 ERROR BAD SIZE 1207959552
>> 1M.20190911063849.007385225.bad ... Wed Sep 11 06:38:55 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911063907.064936962.bad ... Wed Sep 11 06:39:15 CEST 2019 ERROR BAD SIZE 26413431
>> 1M.20190911065351.412877379.bad ... Wed Sep 11 06:54:00 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911070503.057846787.bad ... Wed Sep 11 07:05:20 CEST 2019 ERROR BAD SIZE 1761607680
>> 1M.20190911070525.814871895.bad ... Wed Sep 11 07:05:36 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911070634.781534227.bad ... Wed Sep 11 07:06:42 CEST 2019 ERROR BAD SIZE 3758096384
>> 1M.20190911072040.275437982.bad ... Wed Sep 11 07:21:07 CEST 2019 ERROR BAD SIZE 786432
>> 1M.20190911072340.253938152.bad ... Wed Sep 11 07:24:00 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911074911.208877064.bad ... Wed Sep 11 07:49:40 CEST 2019 ERROR BAD SIZE 388133552
>> 1M.20190911080141.583691537.bad ... Wed Sep 11 08:01:56 CEST 2019 ERROR BAD SIZE 917504
>> 1M.20190911082145.247040374.bad ... Wed Sep 11 08:22:17 CEST 2019 ERROR BAD SIZE 786432
>> 1M.20190911085008.626914519.bad ... Wed Sep 11 08:50:29 CEST 2019 ERROR BAD SIZE 917504
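>> For reference, a rough Python sketch of this copy-and-verify loop (the real
>> test is the bash script at [1]; here the copy is driven via subprocess and
>> the stored size is checked with a gfal2 stat, which is just one way to do
>> the "BAD SIZE" check):
>>
>> import datetime
>> import os
>> import subprocess
>>
>> import gfal2
>>
>> SRC = "/tmp/1M"
>> DST_BASE = "root://golias100.farm.particle.cz//dpm/farm.particle.cz/home/atlas"
>> EXPECTED = os.stat(SRC).st_size  # 1048576 for the 1MB test file
>>
>> ctx = gfal2.creat_context()
>>
>> while True:  # run until interrupted
>>     # microseconds here instead of the script's %N nanoseconds
>>     curr = datetime.datetime.now().strftime("%Y%m%d%H%M%S.%f")
>>     dst = "%s/1M.xrdcp.%s" % (DST_BASE, curr)
>>     env = dict(os.environ, XRD_LOGLEVEL="Dump", XRD_LOGMASK="All",
>>                XRD_LOGFILE="1M.%s.dump" % curr)
>>     # same transfer as above; xrdcp exits non-zero on a failed copy
>>     subprocess.check_call(["xrdcp", "file://%s" % SRC, dst], env=env)
>>     size = ctx.stat(dst).st_size  # size as reported by the storage
>>     if size != EXPECTED:
>>         print("%s ERROR BAD SIZE %d" % (curr, size))
>>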
>> Petr
>>
>> [1]
>> http://novastore.farm.particle.cz/tmp/xrdcp-20190911/test_dpm_xroot_file_copy.sh
>> [2] http://novastore.farm.particle.cz/tmp/xrdcp-20190911/
>>
>> On 9/10/19 2:12 AM, Andrew Hanushevsky wrote:
>>> Hi Petr,
>>>  
>>> If you also specify in the config file
>>>  
>>> xrootd.trace fs
>>>  
>>> then we will get a much earlier snapshot of the offset and length.
>>> This will allow us to tell whether the data is getting corrupted
>>> before it reaches the server or afterward. Could you do that (at
>>> least for one test) to see where the corruption may happen?
>>>  
>>> Andy
>>>  
>>> *From:* Petr Vokac <mailto:[log in to unmask]>
>>> *Sent:* Sunday, September 08, 2019 1:46 AM
>>> *To:* andrea manzi <mailto:[log in to unmask]> ;
>>> [log in to unmask] <mailto:[log in to unmask]> ; Andrew
>>> Hanushevsky <mailto:[log in to unmask]>
>>> *Cc:* dpm-devel <mailto:[log in to unmask]> ;
>>> [log in to unmask] <mailto:[log in to unmask]>
>>> *Subject:* Re: Uploaded file 128TB on filesystem
>>>  
>>> I just found one file where the problem with the offset occurred in the
>>> middle of the transfer (not at the beginning):
>>> 190907 20:03:53 25211 XrootdXeq: atlasprd.2817:168@[2001:718:401:6e03::1:51] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>> 190907 20:03:55 25211 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@0 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:03:55 25211 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@262144 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:03:55 25211 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@524288 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:03:55 25211 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@786432 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:03:55 25211 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@1048576 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> ...
>>> 190907 20:40:35 31589 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2952003584 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:40:53 31589 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2952265728 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:41:55 31609 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2952265728 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> *190907 20:42:05 25089 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 1919238960@7379557499097268579 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload*
>>> *190907 20:42:13 25118 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 1919238960@7379557499097268579 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload*
>>> 190907 20:42:13 25118 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2953314304 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:42:13 25118 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2953576448 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> 190907 20:42:20 25118 atlasprd.2817:168@[2001:718:401:6e03::1:51] ofs_write: 262144@2953576448 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_13TeV/9c/6a/AOD.18892557._000593.pool.root.1.rucio.upload
>>> The whole xrootd.log for this transfer is available at
>>> http://novastore.farm.particle.cz/tmp/xrootd_bigfile_wrong_offset_in_the_middle_of_transfer.txt
>>>
>>> Petr
>>>
>>> On 9/8/19 10:25 AM, Petr Vokac wrote:
>>>> I executed basically the following python code from a bash script in
>>>> a loop:
>>>> import gfal2
>>>> ctx = gfal2.creat_context()
>>>> ctx.set_opt_string("XROOTD PLUGIN", "XRD.WANTPROT", "gsi,unix")
>>>> params = ctx.transfer_parameters()
>>>> params.timeout = 60
>>>> ctx.filecopy(params, "file:///tmp/1M", "root://golias100.farm.particle.cz/dpm/farm.particle.cz/home/atlas/1M")
>>>> and after a while new corrupted files appeared at our storage
>>>> including one big file:
>>>> [vokac@ui1 ~]$ gfal-ls -l 'davs://golias100.farm.particle.cz/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000'
>>>> -rw-rw-r--   0 1310  5281  139703410778112 Sep  7 19:01 davs://golias100.farm.particle.cz/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> Now I also have debug output from the gfal client, including
>>>> XRD_LOGLEVEL=Dump + XRD_LOGMASK=All, for this file, together with the
>>>> xrootd.log from the server side:
>>>> [root@dpmpool19 ~]$ grep condor.39253:171 /var/log/xrootd/dpmdisk/xrootd.log
>>>> 190907 19:01:19 29645 XrootdXeq: condor.39253:171@[2001:718:401:6e03::1:82] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=vokac/CN=610071/CN=Petr Vokac
>>>> 190907 19:01:22 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 1009440128@139703410643808 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:23 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@131072 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:23 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@262144 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:24 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@393216 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:24 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@524288 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:25 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@655360 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:25 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@786432 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:25 32625 condor.39253:171@[2001:718:401:6e03::1:82] ofs_write: 131072@917504 fn=/dpm/farm.particle.cz/home/atlas/1M.01556.20190907190118.646902567.00000
>>>> 190907 19:01:26 32625 XrootdXeq: condor.39253:171@[2001:718:401:6e03::1:82] disc 0:00:07
>>>> This file was transferred from mff82.farm.particle.cz
>>>> (2001:718:401:6e03::1:82) to the storage dpmpool19.farm.particle.cz
>>>> (2001:718:401:6017:2::19); both machines use
>>>> xrootd-4.9.1-1.el7.x86_64 and mff82 uses gfal2-2.16.3-1.el7.x86_64.
>>>>
>>>> I also tried to calculate adler32 for the data blocks at the positions
>>>> logged by the server, and the result matches the source data checksum.
>>>> When I look at other transfers that finished with "SUCCESS" but with a
>>>> file size different from the source file, I can see that the problem
>>>> always occurs only in the first "ofs_write" log line, where size@offset
>>>> contains wrong numbers.
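>>>>
>>>> A minimal sketch of that adler32 check, assuming the (size, offset) pairs
>>>> are pulled out of the ofs_write lines and the blocks are read back from
>>>> the stored replica on the disk server (the paths in the usage comment are
>>>> placeholders):
>>>>
>>>> import re
>>>> import zlib
>>>>
>>>> OFS_WRITE = re.compile(r"ofs_write: (\d+)@(\d+) fn=(\S+)")
>>>>
>>>> def blocks_from_log(logfile, lfn):
>>>>     # collect (size, offset) pairs logged by ofs_write for one file
>>>>     blocks = []
>>>>     with open(logfile) as f:
>>>>         for line in f:
>>>>             m = OFS_WRITE.search(line)
>>>>             if m and m.group(3) == lfn:
>>>>                 blocks.append((int(m.group(1)), int(m.group(2))))
>>>>     return blocks
>>>>
>>>> def adler32_of_blocks(path, blocks):
>>>>     # adler32 over the stored blocks, read back in offset order
>>>>     csum = 1  # adler32 starting value
>>>>     with open(path, "rb") as f:
>>>>         for size, offset in sorted(blocks, key=lambda b: b[1]):
>>>>             f.seek(offset)
>>>>             csum = zlib.adler32(f.read(size), csum)
>>>>     return csum & 0xffffffff
>>>>
>>>> # blocks = blocks_from_log("/var/log/xrootd/dpmdisk/xrootd.log", "/dpm/.../1M...")
>>>> # print("%08x" % adler32_of_blocks("/mnt/fs.../1M...", blocks))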
>>>>
>>>> Attached files:
>>>>
>>>> 1M.01556.20190907190118.646902567.00000.dump ... xrd with dump log level
>>>> 1M.01556.20190907190118.646902567.00000.log ... gfal2 verbose
>>>> logging (+ few log lines from my script)
>>>> 1M.01556.20190907190118.646902567.00000.xrootd.long ... dpmpool19
>>>> xrootd.log
>>>>
>>>> I have more examples of successfully transferred files with a wrong
>>>> size (sometimes bigger than the original 1M size and sometimes smaller):
>>>>
>>>> 07 Sep 18:39:33.879 [ERROR] dpmpool1.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 18:40:45.039 [ERROR] dpmpool21.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 18:41:22.555 [ERROR] dpmpool21.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 18:41:39.248 [ERROR] dpmpool21.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 18:56:51.113 [ERROR] dpmpool24.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 18:58:04.436 [ERROR] dpmpool1.farm.particle.cz BAD SIZE 393194936
>>>> 07 Sep 19:01:25.908 [ERROR] dpmpool19.farm.particle.cz BAD SIZE 139703410778112
>>>> 07 Sep 19:02:17.183 [ERROR] dpmpool24.farm.particle.cz BAD SIZE 0
>>>> 07 Sep 19:02:53.014 [ERROR] dpmpool16.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 19:03:49.424 [ERROR] dpmpool1.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 19:05:47.743 [ERROR] dpmpool21.farm.particle.cz BAD SIZE 13983727
>>>> 07 Sep 19:15:15.761 [ERROR] dpmpool22.farm.particle.cz BAD SIZE 12752345
>>>> 07 Sep 19:16:58.898 [ERROR] dpmpool20.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 19:18:51.590 [ERROR] dpmpool17.farm.particle.cz BAD SIZE 6302080
>>>> 07 Sep 19:20:37.006 [ERROR] dpmpool21.farm.particle.cz BAD SIZE 1456153896
>>>> 07 Sep 19:52:32.553 [ERROR] dpmpool16.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 19:53:45.155 [ERROR] dpmpool22.farm.particle.cz BAD SIZE 1000000000
>>>> 07 Sep 20:11:14.497 [ERROR] dpmpool22.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 20:12:51.081 [ERROR] dpmpool20.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 20:14:58.090 [ERROR] dpmpool20.farm.particle.cz BAD SIZE 100663296
>>>> 07 Sep 20:24:57.457 [ERROR] dpmpool1.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 20:27:37.596 [ERROR] dpmpool19.farm.particle.cz BAD SIZE 917504
>>>> 07 Sep 20:28:21.617 [ERROR] dpmpool16.farm.particle.cz BAD SIZE 0
>>>>
>>>> Petr
>>>>
>>>> On 9/5/19 2:44 PM, Petr Vokac wrote:
>>>>> I'm going to test uploads in a loop with this code, with gfal debug
>>>>> logging enabled, from our remote machines that talk to the storage through
>>>>> a quite saturated network (we never saw the same issue from cluster worker
>>>>> nodes that share a room with the diskservers). I also saw a few transfers
>>>>> that did not create a big sparse file, but just failed while comparing
>>>>> checksums after the upload, which is also quite suspicious behavior.
>>>>>
>>>>> Petr
>>>>>
>>>>> On 9/5/19 12:57 PM, andrea manzi wrote:
>>>>>>> Ciao Andrea,
>>>>>>>
>>>>>>> we practically do this:
>>>>>>>
>>>>>>>     ctx = creat_context(...
>>>>>>>     ctx.set_opt_string("XROOTD PLUGIN", "XRD.WANTPROT", "gsi,unix")
>>>>>>>     ctx.mkdir_rec(...
>>>>>>>     ctx.filecopy(..
>>>>>>>
>>>>>>> Is that the gfal transfer API?
>>>>>> yup, internally in gfal2 we then use XrdCl to perform the copy
>>>>>> (both for local files and TPC)
>>>>>>
>>>>>> today is a holiday, but we can start investigating the possible issue
>>>>>> with Michal tomorrow or Monday
>>>>>>
>>>>>> cheers
>>>>>>
>>>>>> Andrea
>>>>>>
>>>>>>> Cheers,
>>>>>>> Mario
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 05/09/2019 10:35, andrea manzi wrote:
>>>>>>>> ciao Mario,
>>>>>>>>
>>>>>>>> I guess you are using the gfal2 transfer API in Rucio, right? (the
>>>>>>>> equivalent of gfal-copy)
>>>>>>>>
>>>>>>>> in gfal2, in the case of the transfer API for xrootd, we don't do much,
>>>>>>>> as we just use the xrootd client (XrdCl::CopyProcess), so we don't
>>>>>>>> touch file offsets.
>>>>>>>>
>>>>>>>> so we may want to investigate if there is a problem in XrdCl. Do you
>>>>>>>> have the logs of the client?
>>>>>>>>
>>>>>>>> If you are using the gfal2 Posix API instead we need to investigate a
>>>>>>>> possible bug there
>>>>>>>>
>>>>>>>> cheers
>>>>>>>>
>>>>>>>> Andrea
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi Petr, Andy,
>>>>>>>>>
>>>>>>>>> one extra piece of information here: Rucio uses the python bindings to
>>>>>>>>> the gfal library to upload these files, and de facto always uses the
>>>>>>>>> current stable gfal version that is published on ALRB. Rucio doesn't
>>>>>>>>> implement any other upload by itself.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Mario
>>>>>>>>>
>>>>>>>>> On 05/09/2019 04:21, Andrew Hanushevsky wrote:
>>>>>>>>>> Hi Petr,
>>>>>>>>>>
>>>>>>>>>> So, look at this line:
>>>>>>>>>>
>>>>>>>>>> 190904 17:02:16 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82] ofs_write: 1784104010@6506989258351653185 fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>>
>>>>>>>>>> Why did the application doing the write specify an offset of
>>>>>>>>>> 6506989258351653185? So, that is the problem and, as you noted, it
>>>>>>>>>> doesn't matter whether POSC is on or not. Indeed, it's the first write
>>>>>>>>>> to the file and only the first write that suffers from this problem. So,
>>>>>>>>>> there are two possibilities here: a) there is a bug in the uploader, or
>>>>>>>>>> b) there was a network data corruption that managed to hit the offset
>>>>>>>>>> (somewhat unlikely).
>>>>>>>>>>
>>>>>>>>>> Andy
>>>>>>>>>>
>>>>>>>>>> *From:* Petr Vokac mailto:[log in to unmask]
>>>>>>>>>> *Sent:* Wednesday, September 04, 2019 12:19 PM
>>>>>>>>>> *To:* Andrew Hanushevsky mailto:[log in to unmask]
>>>>>>>>>> *Cc:* dpm-devel mailto:[log in to unmask] ;
>>>>>>>>>> [log in to unmask]
>>>>>>>>>> mailto:[log in to unmask]
>>>>>>>>>> *Subject:* Re: Uploaded file 128TB on filesystem
>>>>>>>>>>
>>>>>>>>>> I configured our diskservers with "ofs.trace write" (I did not yet add
>>>>>>>>>> "ofs.persist hold 0" to the xrootd configuration) and today a new file
>>>>>>>>>> appeared on one of our storage servers:
>>>>>>>>>>
>>>>>>>>>> *[root@dpmpool1 ~]# grep atlasprd.2817:152
>>>>>>>>>> /var/log/xrootd/dpmdisk/xrootd.log*
>>>>>>>>>> 190904 17:02:08 1894 XrootdXeq:
>>>>>>>>>> atlasprd.2817:152@[2001:718:401:6e03::1:82] pub IP64 login as
>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>> 190904 17:02:16 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 1784104010@6506989258351653185
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:16 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@262144
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@524288
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@786432
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@1048576
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@1310720
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@1572864
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 262144@1835008
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:02:17 22915 atlasprd.2817:152@[2001:718:401:6e03::1:82]
>>>>>>>>>> ofs_write: 135000@2097152
>>>>>>>>>> fn=/dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_5TeV/90/9f/EVNT.18960107._035789.pool.root.1.rucio.upload
>>>>>>>>>> 190904 17:07:32 2163 XrootdXeq:
>>>>>>>>>> atlasprd.2817:152@[2001:718:401:6e03::1:82] disc 0:05:24
>>>>>>>>>>
>>>>>>>>>> *[root@dpmpool1 ~]# ls -la
>>>>>>>>>> /mnt/fs2/atlas/2019-09-04/EVNT.18960107._035789.pool.root.1.rucio.upload.309650.1567609328*
>>>>>>>>>>
>>>>>>>>>> -rw-rw---- 1 dpmmgr dpmmgr 6506989258351919104 Sep  4 17:02
>>>>>>>>>> /mnt/fs2/atlas/2019-09-04/EVNT.18960107._035789.pool.root.1.rucio.upload.309650.1567609328
>>>>>>>>>>
>>>>>>>>>> *[root@dpmpool1 ~]# du --block-size=1
>>>>>>>>>> /mnt/fs2/atlas/2019-09-04/EVNT.18960107._035789.pool.root.1.rucio.upload.309650.1567609328*
>>>>>>>>>>
>>>>>>>>>> 2236416    /mnt/fs2/atlas/2019-09-04/EVNT.18960107._035789.pool.root.1.rucio.upload.309650.1567609328
>>>>>>>>>>
>>>>>>>>>> This file has the first 256kB filled with 0x00 (adler32 up to the original
>>>>>>>>>> size 2232152 doesn't match) and this time there is no ofs_Posc in the
>>>>>>>>>> xrootd.log file.
>>>>>>>>>>
>>>>>>>>>> Petr Vokac
>>>>>>>>>>
>>>>>>>>>> On 8/19/19 3:24 AM, Andrew Hanushevsky wrote:
>>>>>>>>>>> Hi Petr,
>>>>>>>>>>>
>>>>>>>>>>> The issue here is that the python script (I believe it's python) is
>>>>>>>>>>> disconnecting without closing the file and then reconnecting and
>>>>>>>>>>> writing at a high offset. The log is pretty clear on that. You can
>>>>>>>>>>> verify by adding "ofs.trace write" to see the write offsets (though it
>>>>>>>>>>> will produce a lot of output). You can also prohibit reconnection by
>>>>>>>>>>> adding this to the persist directive.
>>>>>>>>>>>
>>>>>>>>>>> ofs.persist hold 0
>>>>>>>>>>>
>>>>>>>>>>> which would delete the file the moment the client disconnects
>>>>>>>>>>> without
>>>>>>>>>>> closing the file.
>>>>>>>>>>>
>>>>>>>>>>> Andy
>>>>>>>>>>>
>>>>>>>>>>> On Sat, 17 Aug 2019, Petr Vokac wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Today a new "65PB" file appeared on our storage and it was again
>>>>>>>>>>>> transferred with the xrootd protocol. I can again see the same error
>>>>>>>>>>>> in the log file:
>>>>>>>>>>>>
>>>>>>>>>>>> *[root@dpmpool21 ~]# ls -la
>>>>>>>>>>>> /mnt/fs4/atlas/2019-08-17/EVNT.18893579._003197.pool.root.1.rucio.upload.234855.1566031522*
>>>>>>>>>>>>
>>>>>>>>>>>> -rw-rw---- 1 dpmmgr dpmmgr 72340172838077281 Aug 17 10:48
>>>>>>>>>>>> /mnt/fs4/atlas/2019-08-17/EVNT.18893579._003197.pool.root.1.rucio.upload.234855.1566031522
>>>>>>>>>>>>
>>>>>>>>>>>> *[root@dpmpool21 ~]# grep atlasprd.2757
>>>>>>>>>>>> /var/log/xrootd/dpmdisk/xrootd.log*
>>>>>>>>>>>> ...
>>>>>>>>>>>> 190817 10:47:55 57803 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:65@[2001:718:401:6e03::1:43] pub IP64 login as
>>>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>>>> 190817 10:47:55 61859 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:66@[2001:718:401:6e03::1:83] pub IP64 login as
>>>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>>>> 190817 10:48:52 62233 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:46@[2001:718:401:6e03::1:72] disc 0:00:58
>>>>>>>>>>>> 190817 10:48:52 62954 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:25@[2001:718:401:6e03::1:72] disc 0:03:30
>>>>>>>>>>>> 190817 10:48:52 57804 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:28@[2001:718:401:6e03::1:72] pub IP64 login as
>>>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>>>> 190817 10:48:52 57804 ofs_Posc: Creator changed from
>>>>>>>>>>>> atlasprd.2757:25@[2001:718:401:6e03::1:72] to
>>>>>>>>>>>> atlasprd.2757:28@[2001:718:401:6e03::1:72] for
>>>>>>>>>>>> /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/8b/f3/EVNT.18893579._003197.pool.root.1.rucio.upload
>>>>>>>>>>>>
>>>>>>>>>>>> 190817 10:48:53 57803 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:46@[2001:718:401:6017:20:0:14:44] pub IP64 login as
>>>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>>>> 190817 10:49:01 57818 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:70@[2001:718:401:6017:20:0:11:11] pub IP64 login as
>>>>>>>>>>>> /DC=ch/DC=cern/OU=Organic
>>>>>>>>>>>> Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>>>>>>>>>>>> 190817 10:49:07 62954 XrootdXeq:
>>>>>>>>>>>> atlasprd.2757:55@[2001:718:401:6e03::1:81] disc 0:10:17
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> Any suggestion how to trace what is causing this behavior?
>>>>>>>>>>>>
>>>>>>>>>>>> Petr
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/20/19 7:48 AM, Petr Vokac wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran the command you sent me
>>>>>>>>>>>>>
>>>>>>>>>>>>> grep "atlasprd.2757:" <logfile> | egrep
>>>>>>>>>>>>> '2001:718:401:6e03::1:21|2001:718:401:6e03::1:32' > <excerpt>
>>>>>>>>>>>>>
>>>>>>>>>>>>> and because the output was not just a few lines long I uploaded
>>>>>>>>>>>>> it here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/dpmpool22.xrootd.log
>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/dpmpool24.xrootd.log
>>>>>>>>>>>>>
>>>>>>>>>>>>> These machines on the 2001:718:401:6e03::/64 subnet are in a remote
>>>>>>>>>>>>> location (but relatively close, with RTT ~ 0.7ms). I can see several
>>>>>>>>>>>>> ofs_Posc occurrences in the logs of our disknodes during the
>>>>>>>>>>>>> "190717 17:xx:xx" time period:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://vokac.web.cern.ch/vokac/tmp/POSC.txt
>>>>>>>>>>>>>
>>>>>>>>>>>>> Petr
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/19/19 8:51 PM, Andrew Hanushevsky wrote:
>>>>>>>>>>>>>> Hi Petr,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, POSC says that the client reconnected for writing so not
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> went well. Can I get the log output for that occurrence as I
>>>>>>>>>>>>>> previously described?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Andy
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, 19 Jul 2019, Petr Vokac wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We are using the Rucio download client to transfer job outputs (it is
>>>>>>>>>>>>>>> built on top of the gfal2 python interface).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/dpmpool22.xrootd.log
>>>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/dpmpool24.xrootd.log
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> this is the full job output unfortunately with minimal gfal2
>>>>>>>>>>>>>>> logging
>>>>>>>>>>>>>>> output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/grid.2258136.4.out-from-2001:718:401:6e03::1:21-to-dpmpool22.txt
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://vokac.web.cern.ch/vokac/tmp/grid.2255921.7.out-from-2001:718:401:6e03::1:32-to-dpmpool24.txt
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It looks like, according to gfal2, the transfer was OK:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,782 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,782 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,783 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,783 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,783 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,783 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,784 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,784 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,784 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,793 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,793 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,793 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,794 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,795 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,795 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,795 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,795 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,796 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,796 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,796 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,796 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,796 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | connect                   |
>>>>>>>>>>>>>>> [gfal_module_load] plugin
>>>>>>>>>>>>>>> /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with
>>>>>>>>>>>>>>> success
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,810 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | __gfal2_copy              | Event
>>>>>>>>>>>>>>> triggered: BOTH GFAL2:CORE:COPY LIST:ENTER
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,810 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | __gfal2_copy              | Event
>>>>>>>>>>>>>>> triggered: BOTH GFAL2:CORE:COPY LIST:ITEM
>>>>>>>>>>>>>>> file:///scratch/condor/dir_24031/atlas_HudmF2he/PanDA_Pilot-4419868006/log.18636890._001059.job.log.tgz.1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> =>
>>>>>>>>>>>>>>> root://golias100.farm.particle.cz:1094//dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,810 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> gfal2                            | __gfal2_copy              | Event
>>>>>>>>>>>>>>> triggered: BOTH GFAL2:CORE:COPY LIST:EXIT
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,811 | INFO     | Dummy-2             |
>>>>>>>>>>>>>>> gfal2                            | (unknown function)        | Event
>>>>>>>>>>>>>>> triggered: BOTH xroot TRANSFER:ENTER
>>>>>>>>>>>>>>> file://localhost///scratch/condor/dir_24031/atlas_HudmF2he/PanDA_Pilot-4419868006/log.18636890._001059.job.log.tgz.1?xrd.gsiusrpxy=/scratch/condor/dir_24031/job.9XONDm6yx7unnoBGSqYX2MjnABFKDmABFKDmSyhTDmEBFKDmqAGN3n.proxy&xrd.wantprot=gsi,unix
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> =>
>>>>>>>>>>>>>>> root://golias100.farm.particle.cz:1094///dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload?xrd.gsiusrpxy=/scratch/condor/dir_24031/job.9XONDm6yx7unnoBGSqYX2MjnABFKDmABFKDmSyhTDmEBFKDmqAGN3n.proxy&x
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2019-07-17 15:09:17,811 | INFO     | Dummy-2             |
>>>>>>>>>>>>>>> gfal2                            | (unknown function)        | Event
>>>>>>>>>>>>>>> triggered: BOTH xroot TRANSFER:TYPE streamed
>>>>>>>>>>>>>>> 2019-07-17 15:11:23,914 | INFO     | Dummy-2             |
>>>>>>>>>>>>>>> gfal2                            | (unknown function)        | Event
>>>>>>>>>>>>>>> triggered: BOTH xroot TRANSFER:EXIT Job finished, [SUCCESS]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> but probably rucio tries to verify the checksum (gfal-sum via the
>>>>>>>>>>>>>>> python bindings) and it fails:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2019-07-17 15:15:23,774 | INFO     | copytool_out        |
>>>>>>>>>>>>>>> pilot.copytool.rucio             | copy_out                  |
>>>>>>>>>>>>>>> stderr = An unknown exception occurred.
>>>>>>>>>>>>>>> Details: Checksum not validated
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> because DPM is unable to calculate the checksum of a 128TB file within
>>>>>>>>>>>>>>> the timeout.
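>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For reference, that check boils down to a single gfal2 call; a minimal
>>>>>>>>>>>>>>> sketch, with the destination URL taken from the transfer above:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> import gfal2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ctx = gfal2.creat_context()
>>>>>>>>>>>>>>> url = ("root://golias100.farm.particle.cz:1094//dpm/farm.particle.cz/home/atlas/"
>>>>>>>>>>>>>>>        "atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload")
>>>>>>>>>>>>>>> # server-side adler32; on a 128TB sparse file this does not come back in time
>>>>>>>>>>>>>>> print(ctx.checksum(url, "adler32"))
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which is why the pilot reports "Checksum not validated" even though the
>>>>>>>>>>>>>>> copy itself returned SUCCESS.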
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Petr
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 7/19/19 5:07 AM, Andrew Hanushevsky wrote:
>>>>>>>>>>>>>>>> Hi Petr,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Indeed, it looks like what was used to upload the files
>>>>>>>>>>>>>>>> (xrdcp?)
>>>>>>>>>>>>>>>> created a very large sparse file by writing with a very large
>>>>>>>>>>>>>>>> offset.
>>>>>>>>>>>>>>>> The clue here is that an anomaly occurred during the write
>>>>>>>>>>>>>>>> and these
>>>>>>>>>>>>>>>> two messages (likely each corresponding to each file) tell us
>>>>>>>>>>>>>>>> something....
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 190717 17:10:39 332092 ofs_Posc: Creator changed from atlasprd.2757:90@[2001:718:401:6e03::1:21] to atlasprd.2757:39@[2001:718:401:6e03::1:21] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 190717 17:29:26 14204 ofs_Posc: Creator changed from atlasprd.2757:33@[2001:718:401:6e03::1:32] to atlasprd.2757:95@[2001:718:401:6e03::1:32] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_5TeV/09/bc/AOD.18595963._000786.pool.root.1.rucio.upload
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What the POSC system is saying is that in each case the
>>>>>>>>>>>>>>>> original
>>>>>>>>>>>>>>>> creator of the file lost contact with the server. It then
>>>>>>>>>>>>>>>> reconnected
>>>>>>>>>>>>>>>> and resumed writing with a successful close. POSC allows
>>>>>>>>>>>>>>>> this to
>>>>>>>>>>>>>>>> avoid
>>>>>>>>>>>>>>>> network glitches from deleting a file because a network
>>>>>>>>>>>>>>>> glitch will
>>>>>>>>>>>>>>>> cause the client to reconnect and resume where it left off.
>>>>>>>>>>>>>>>> This is
>>>>>>>>>>>>>>>> only allowed when the resuming client comes from the same IP
>>>>>>>>>>>>>>>> address,
>>>>>>>>>>>>>>>> has the same process ID, and the same logical name. In this
>>>>>>>>>>>>>>>> case it
>>>>>>>>>>>>>>>> was "atlasprd.2757". I find that rather odd because it's
>>>>>>>>>>>>>>>> next to
>>>>>>>>>>>>>>>> impossible for two processes on two different machines to
>>>>>>>>>>>>>>>> have the
>>>>>>>>>>>>>>>> same process number (i.e. 2757). So, this is why I am
>>>>>>>>>>>>>>>> wondering if
>>>>>>>>>>>>>>>> this was really an xrdcp or some other program.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We should look to see what these clients were actually doing during
>>>>>>>>>>>>>>>> this period, as they clearly did not do the right thing when they
>>>>>>>>>>>>>>>> reconnected. So, could you grep out all interactions of these two
>>>>>>>>>>>>>>>> clients from the log. Something like:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> grep "atlasprd.2757:" <logfile> | egrep
>>>>>>>>>>>>>>>> '2001:718:401:6e03::1:21|2001:718:401:6e03::1:32' > <excerpt>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and send the excerpt to me?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Andy
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, 18 Jul 2019, Petr Vokac wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> our DPM, which is using native xrootd 4.9.1, yesterday created two
>>>>>>>>>>>>>>>>> different 128TB files on two different diskservers that were received
>>>>>>>>>>>>>>>>> as job output from two different worker nodes. The size of the original
>>>>>>>>>>>>>>>>> files was a few megabytes, and when I check the beginning of the stored
>>>>>>>>>>>>>>>>> files up to the original size & checksum everything seems to be OK -
>>>>>>>>>>>>>>>>> except these are sparse files with a size of 128TB. The original files
>>>>>>>>>>>>>>>>> transferred from the WNs looked like:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> mc15_13TeV:log.18636890._001059.job.log.tgz.1 - original size
>>>>>>>>>>>>>>>>> 3101899, adler32 742e1336
>>>>>>>>>>>>>>>>> mc16_5TeV:AOD.18595963._000786.pool.root.1 - original size
>>>>>>>>>>>>>>>>> 737474514,
>>>>>>>>>>>>>>>>> adler32 dc68cb21
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> but on our diskservers they were stored as
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root@dpmpool22 ~]# ls -l
>>>>>>>>>>>>>>>>> /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -rw-rw---- 1 dpmmgr dpmmgr 140560257079236 Jul 17 17:11
>>>>>>>>>>>>>>>>> /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root@dpmpool22 ~]# du --block-size=1
>>>>>>>>>>>>>>>>> /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3108864
>>>>>>>>>>>>>>>>> /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root@dpmpool24 ~]# ls -l
>>>>>>>>>>>>>>>>> /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -rw-rw---- 1 dpmmgr dpmmgr 139734345056408 Jul 17 17:31
>>>>>>>>>>>>>>>>> /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root@dpmpool24 tmp]# du --block-size=1
>>>>>>>>>>>>>>>>> /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 737480704
>>>>>>>>>>>>>>>>> /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I was told that DPM is not really involved in transferring / storing
>>>>>>>>>>>>>>>>> files with the xrootd protocol, so I guess something bad must have
>>>>>>>>>>>>>>>>> happened inside the XRootD libraries. Do you have any idea why these two
>>>>>>>>>>>>>>>>> transfers generated such extreme files? Unfortunately our xrootd logging
>>>>>>>>>>>>>>>>> level is pretty low and I was able to find only info about file creation:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [root@dpmpool22 ~]# grep log.18636890._001059.job.log.tgz
>>>>>>>>>>>>>>>>> /var/log/xrootd/dpmdisk/xrootd.log
>>>>>>>>>>>>>>>>> 190717 17:10:39 332092 ofs_Posc: Creator changed from atlasprd.2757:90@[2001:718:401:6e03::1:21] to atlasprd.2757:39@[2001:718:401:6e03::1:21] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>>>>>>>>>>>>>>>>> [root@dpmpool24 tmp]# grep AOD.18595963._000786.pool.root
>>>>>>>>>>>>>>>>> /var/log/xrootd/dpmdisk/xrootd.log
>>>>>>>>>>>>>>>>> 190717 17:29:26 14204 ofs_Posc: Creator changed from atlasprd.2757:33@[2001:718:401:6e03::1:32] to atlasprd.2757:95@[2001:718:401:6e03::1:32] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_5TeV/09/bc/AOD.18595963._000786.pool.root.1.rucio.upload
>>>>>>>>>>>>>>>>> Petr
>>>>>>>>>>>>>>>>>

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1