No (except for DNS & DHCP).

Actually, I was thinking about tests with UDP-Lite, because it seems to me
this could be the easiest way to detect payload corruption.
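
A minimal sketch of such a UDP-Lite probe (Linux only; the numeric protocol
and socket-option fallbacks are the Linux values, Python 3.9+ also exposes
them in the socket module, and the per-datagram sequence/adler32 framing is
just something I made up for the test, nothing DPM- or xrootd-related):

import socket, struct, sys, zlib

IPPROTO_UDPLITE = getattr(socket, "IPPROTO_UDPLITE", 136)       # Linux protocol number
UDPLITE_SEND_CSCOV = getattr(socket, "UDPLITE_SEND_CSCOV", 10)  # Linux socket option
UDPLITE_RECV_CSCOV = getattr(socket, "UDPLITE_RECV_CSCOV", 11)  # Linux socket option

def sender(host, port, count=1000, size=1400):
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, IPPROTO_UDPLITE)
    # checksum only the 8-byte UDP-Lite header, so datagrams with a corrupted
    # payload are delivered to the receiver instead of being silently dropped
    s.setsockopt(IPPROTO_UDPLITE, UDPLITE_SEND_CSCOV, 8)
    for seq in range(count):
        payload = bytes((seq + i) & 0xFF for i in range(size))
        header = struct.pack("!III", seq, size, zlib.adler32(payload))
        s.sendto(header + payload, (host, port))

def receiver(port):
    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, IPPROTO_UDPLITE)
    # optional receive filter: require at least header coverage
    s.setsockopt(IPPROTO_UDPLITE, UDPLITE_RECV_CSCOV, 8)
    s.bind(("::", port))
    while True:
        data, addr = s.recvfrom(65535)
        seq, size, cksum = struct.unpack("!III", data[:12])
        if len(data) - 12 != size or zlib.adler32(data[12:]) != cksum:
            print("corrupted payload in datagram %d from %s" % (seq, addr))

if __name__ == "__main__":
    if sys.argv[1] == "recv":
        receiver(int(sys.argv[2]))
    else:
        sender(sys.argv[2], int(sys.argv[3]))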

Petr


On 8/18/19 10:45 AM, Yang, Wei wrote:
>
> Do you use UDP anywhere that doesn’t check data integrity?
>
>  
>
> --
>
> Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>
>  
>
> *From: *Petr Vokac <[log in to unmask]>
> *Date: *Sunday, August 18, 2019 at 1:42 AM
> *To: *Wei Yang <[log in to unmask]>, Andrew Hanushevsky
> <[log in to unmask]>
> *Cc: *dpm-devel <[log in to unmask]>, xrootd-l <[log in to unmask]>
> *Subject: *Re: Uploaded file 128TB on filesystem
>
>  
>
> I checked only the first two 128TB files: their content is correct up to
> the original source file size (same ADLER32), and the real allocated size
> obtained with `du --block-size=1 ...` is correspondingly small.
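>
> A minimal sketch of that check in plain Python (nothing DPM-specific; the
> chunked read is only there to avoid pulling a multi-TB sparse file into
> memory):
>
> import os, sys, zlib
>
> def check(path, original_size):
>     st = os.stat(path)
>     print("apparent size :", st.st_size)
>     print("allocated size:", st.st_blocks * 512)   # bytes actually backed by disk blocks
>     cksum, remaining = 1, original_size             # adler32 starts from 1
>     with open(path, "rb") as f:
>         while remaining > 0:
>             chunk = f.read(min(4 * 1024 * 1024, remaining))
>             if not chunk:
>                 break
>             cksum = zlib.adler32(chunk, cksum)
>             remaining -= len(chunk)
>     print("adler32 of first %d bytes: %08x" % (original_size, cksum & 0xffffffff))
>
> if __name__ == "__main__":
>     check(sys.argv[1], int(sys.argv[2]))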
>
> A different source -> destination pair was involved for each of these big sparse files:
>
> mff21.farm.particle.cz -> dpmpool22.farm.particle.cz -> 128TB
> mff32.farm.particle.cz -> dpmpool24.farm.particle.cz -> 128TB
> mff72.farm.particle.cz -> dpmpool21.farm.particle.cz -> 65PB
> mff84.farm.particle.cz -> dpmpool1.farm.particle.cz -> 665PB
>
> The TCP checksum seems pretty weak to me, if I understand it correctly: it
> is only a 16-bit ones' complement sum, so e.g. swapping two 16-bit words in
> the payload produces the same checksum.
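>
> A quick standalone demonstration of that weakness (the function below is
> just the RFC 1071 Internet checksum written out in Python, not code taken
> from any TCP stack):
>
> def inet_checksum(data: bytes) -> int:
>     if len(data) % 2:
>         data += b"\x00"
>     s = sum(int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
>     while s >> 16:                       # fold carries back into 16 bits
>         s = (s & 0xFFFF) + (s >> 16)
>     return ~s & 0xFFFF
>
> good = b"\x12\x34\x56\x78\x9a\xbc"
> bad = b"\x56\x78\x12\x34\x9a\xbc"        # first two 16-bit words swapped
> assert good != bad
> assert inet_checksum(good) == inet_checksum(bad)   # corruption goes undetected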
>
> We have an order of magnitude more machines on the local
> 2001:718:401:6025::/64 subnet, but so far I have detected problems coming
> only from the distant 2001:718:401:6e03::/64 subnet (I think 2-3 active
> network devices are on the path to this location, which is about 20 km away).
>
>
> I'll try to test transfers between these machines myself. Do you know of a
> tool that could help me understand TCP/UDP packet corruption? (Otherwise
> I'll write something very simple for UDP transfers in Python myself.)
>
>
> Petr
>
> On 8/18/19 10:06 AM, Yang, Wei wrote:
>
>     Though the TCP checksum can miss something, the chance is very small.
>     I think you have seen these issues often enough that they are unlikely
>     to be caused by TCP errors. I would be inclined to check the
>     storage/filesystem at both source and destination (or the switches in
>     between). Also, are the 65PB and 128TB files sparse files?
>
>      
>
>     In theory, strace on the xrootd process could determine the offset and
>     length of each write, but on a production system that is virtually
>     impossible to do.
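>
>     If one ever did capture it (e.g. with something like
>     strace -f -e trace=pwrite64 -p <xrootd pid>), a rough sketch of
>     filtering such output for writes at suspiciously large offsets -
>     assuming the usual one-line pwrite64 format; the 1 TiB threshold is
>     arbitrary:
>
>     import re, sys
>
>     # example line:  pwrite64(27, "..."..., 1048576, 140560257079236) = 1048576
>     PWRITE = re.compile(r"pwrite64\((\d+),.*, (\d+), (\d+)\)\s*=\s*(-?\d+)")
>
>     for line in sys.stdin:
>         m = PWRITE.search(line)
>         if m:
>             fd, length, offset = map(int, m.group(1, 2, 3))
>             if offset > 1 << 40:     # flag writes more than 1 TiB into the file
>                 print("suspicious write: fd=%d len=%d offset=%d" % (fd, length, offset))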
>
>      
>
>     --
>
>     Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>
>      
>
>     *From: *Petr Vokac <[log in to unmask]>
>     *Date: *Sunday, August 18, 2019 at 12:50 AM
>     *To: *Wei Yang <[log in to unmask]>, Andrew Hanushevsky <[log in to unmask]>
>     *Cc: *dpm-devel <[log in to unmask]>, xrootd-l <[log in to unmask]>
>     *Subject: *Re: Uploaded file 128TB on filesystem
>
>      
>
>     This /mnt/... is the physical file on the disk node filesystem; the DPM
>     file location was
>
>     root://golias100.farm.particle.cz//dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/8b/f3/EVNT.18893579._003197.pool.root.1.rucio.upload
>
>     but I have already removed that file, because this is our production
>     storage and we ran out of quota with this big sparse file.
>
>
>     Yesterday evening one more 665PB file arrived, but this time there was
>     no ofs_Posc error message in the log files.
>
>
>     All four big sparse files transferred with xrootd came from the distant
>     worker nodes (2001:718:401:6e03::/64 subnet), and recently I found a few
>     job output uploads failing checksum validation (the Rucio client uses
>     ADLER32 to verify transferred data). This behavior is suspicious and I'm
>     starting to think about TCP payload corruption while transferring data
>     from these worker nodes.
>
>
>     Is it possible that corrupted TCP data (that still passes the TCP
>     checksum) could cause this behavior?
>
>
>     Petr
>
>     On 8/18/19 6:11 AM, Yang, Wei wrote:
>
>         Hi Petr,
>
>          
>
>         Is this /mnt/fs4/atlas/2019-08-17/… a DPM mount point path, or
>         the actual storage path on a DPM disk node?
>
>          
>
>         regards,
>
>         --
>
>         Wei Yang  |  [log in to unmask]  |  650-926-3338(O)
>
>          
>
>         *From: *<[log in to unmask]> on behalf of Petr Vokac <[log in to unmask]>
>         *Date: *Saturday, August 17, 2019 at 3:38 AM
>         *To: *Andrew Hanushevsky <[log in to unmask]>
>         *Cc: *dpm-devel <[log in to unmask]>, xrootd-l <[log in to unmask]>
>         *Subject: *Re: Uploaded file 128TB on filesystem
>
>          
>
>         Today a new "65PB" file appeared on our storage, again transferred
>         with the xrootd protocol. I can once more see the same error in the
>         log file:
>
>         *[root@dpmpool21 ~]# ls -la /mnt/fs4/atlas/2019-08-17/EVNT.18893579._003197.pool.root.1.rucio.upload.234855.1566031522*
>
>         -rw-rw---- 1 dpmmgr dpmmgr 72340172838077281 Aug 17 10:48 /mnt/fs4/atlas/2019-08-17/EVNT.18893579._003197.pool.root.1.rucio.upload.234855.1566031522
>
>
>         *[root@dpmpool21 ~]# grep atlasprd.2757 /var/log/xrootd/dpmdisk/xrootd.log*
>
>         ...
>
>         190817 10:47:55 57803 XrootdXeq: atlasprd.2757:65@[2001:718:401:6e03::1:43] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>
>         190817 10:47:55 61859 XrootdXeq: atlasprd.2757:66@[2001:718:401:6e03::1:83] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>
>         190817 10:48:52 62233 XrootdXeq: atlasprd.2757:46@[2001:718:401:6e03::1:72] disc 0:00:58
>
>         190817 10:48:52 62954 XrootdXeq: atlasprd.2757:25@[2001:718:401:6e03::1:72] disc 0:03:30
>
>         190817 10:48:52 57804 XrootdXeq: atlasprd.2757:28@[2001:718:401:6e03::1:72] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>
>         190817 10:48:52 57804 ofs_Posc: Creator changed from atlasprd.2757:25@[2001:718:401:6e03::1:72] to atlasprd.2757:28@[2001:718:401:6e03::1:72] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/8b/f3/EVNT.18893579._003197.pool.root.1.rucio.upload
>
>         190817 10:48:53 57803 XrootdXeq: atlasprd.2757:46@[2001:718:401:6017:20:0:14:44] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>
>         190817 10:49:01 57818 XrootdXeq: atlasprd.2757:70@[2001:718:401:6017:20:0:11:11] pub IP64 login as /DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1
>
>         190817 10:49:07 62954 XrootdXeq: atlasprd.2757:55@[2001:718:401:6e03::1:81] disc 0:10:17
>
>         ...
>
>         Any suggestion on how to trace what is causing this behavior?
>
>         Petr
>
>
>         On 7/20/19 7:48 AM, Petr Vokac wrote:
>
>             Hi,
>
>             I ran the command you sent me
>
>             grep "atlasprd.2757:" <logfile> | egrep '2001:718:401:6e03::1:21|2001:718:401:6e03::1:32' > <excerpt>
>
>
>             and because the output was more than just a few lines long, I
>             uploaded it to
>
>             https://vokac.web.cern.ch/vokac/tmp/dpmpool22.xrootd.log
>             https://vokac.web.cern.ch/vokac/tmp/dpmpool24.xrootd.log
>
>             These machines on the 2001:718:401:6e03::/64 subnet are in a
>             remote location (but relatively close, with RTT ~ 0.7 ms). I can
>             see several ofs_Posc occurrences in the logs of our disk nodes
>             during the "190717 17:xx:xx" time period:
>
>             http://vokac.web.cern.ch/vokac/tmp/POSC.txt
>
>             Petr
>
>             On 7/19/19 8:51 PM, Andrew Hanushevsky wrote:
>
>                 Hi Petr,
>
>                 Well, POSC says that the client reconnected for writing, so
>                 not all went well. Can I get the log output for that
>                 occurrence, as I previously described?
>
>                 Andy
>
>                 On Fri, 19 Jul 2019, Petr Vokac wrote:
>
>
>
>
>                     We are using the Rucio download client to transfer job
>                     outputs (it is built on top of the gfal2 Python interface).
>
>                     https://vokac.web.cern.ch/vokac/tmp/dpmpool22.xrootd.log
>
>                     https://vokac.web.cern.ch/vokac/tmp/dpmpool24.xrootd.log
>
>
>                     This is the full job output, unfortunately with only
>                     minimal gfal2 logging:
>
>                     https://vokac.web.cern.ch/vokac/tmp/grid.2258136.4.out-from-2001:718:401:6e03::1:21-to-dpmpool22.txt
>
>                     https://vokac.web.cern.ch/vokac/tmp/grid.2255921.7.out-from-2001:718:401:6e03::1:32-to-dpmpool24.txt
>
>
>                     It looks like, according to gfal2, the transfer was OK:
>
>                     2019-07-17 15:09:17,782 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with success
>                     2019-07-17 15:09:17,782 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with success
>                     2019-07-17 15:09:17,783 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with success
>                     2019-07-17 15:09:17,783 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with success
>                     2019-07-17 15:09:17,783 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with success
>                     2019-07-17 15:09:17,783 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with success
>                     2019-07-17 15:09:17,784 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with success
>                     2019-07-17 15:09:17,784 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with success
>                     2019-07-17 15:09:17,784 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with success
>                     2019-07-17 15:09:17,793 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with success
>                     2019-07-17 15:09:17,793 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with success
>                     2019-07-17 15:09:17,793 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with success
>                     2019-07-17 15:09:17,794 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with success
>                     2019-07-17 15:09:17,795 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_xrootd.so loaded with success
>                     2019-07-17 15:09:17,795 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_rfio.so loaded with success
>                     2019-07-17 15:09:17,795 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_gridftp.so loaded with success
>                     2019-07-17 15:09:17,795 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_http.so loaded with success
>                     2019-07-17 15:09:17,796 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_lfc.so loaded with success
>                     2019-07-17 15:09:17,796 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_file.so loaded with success
>                     2019-07-17 15:09:17,796 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_sftp.so loaded with success
>                     2019-07-17 15:09:17,796 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_dcap.so loaded with success
>                     2019-07-17 15:09:17,796 | INFO | copytool_out | gfal2 | connect | [gfal_module_load] plugin /usr/lib64/gfal2-plugins//libgfal_plugin_srm.so loaded with success
>                     2019-07-17 15:09:17,810 | INFO | copytool_out | gfal2 | __gfal2_copy | Event triggered: BOTH GFAL2:CORE:COPY LIST:ENTER
>                     2019-07-17 15:09:17,810 | INFO | copytool_out | gfal2 | __gfal2_copy | Event triggered: BOTH GFAL2:CORE:COPY LIST:ITEM file:///scratch/condor/dir_24031/atlas_HudmF2he/PanDA_Pilot-4419868006/log.18636890._001059.job.log.tgz.1 => root://golias100.farm.particle.cz:1094//dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>                     2019-07-17 15:09:17,810 | INFO | copytool_out | gfal2 | __gfal2_copy | Event triggered: BOTH GFAL2:CORE:COPY LIST:EXIT
>                     2019-07-17 15:09:17,811 | INFO | Dummy-2 | gfal2 | (unknown function) | Event triggered: BOTH xroot TRANSFER:ENTER file://localhost///scratch/condor/dir_24031/atlas_HudmF2he/PanDA_Pilot-4419868006/log.18636890._001059.job.log.tgz.1?xrd.gsiusrpxy=/scratch/condor/dir_24031/job.9XONDm6yx7unnoBGSqYX2MjnABFKDmABFKDmSyhTDmEBFKDmqAGN3n.proxy&xrd.wantprot=gsi,unix => root://golias100.farm.particle.cz:1094///dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload?xrd.gsiusrpxy=/scratch/condor/dir_24031/job.9XONDm6yx7unnoBGSqYX2MjnABFKDmABFKDmSyhTDmEBFKDmqAGN3n.proxy&x
>                     2019-07-17 15:09:17,811 | INFO | Dummy-2 | gfal2 | (unknown function) | Event triggered: BOTH xroot TRANSFER:TYPE streamed
>                     2019-07-17 15:11:23,914 | INFO | Dummy-2 | gfal2 | (unknown function) | Event triggered: BOTH xroot TRANSFER:EXIT Job finished, [SUCCESS]
>
>                     but then rucio probably tries to verify the checksum
>                     (gfal-sum via the Python bindings) and that fails:
>
>                     2019-07-17 15:15:23,774 | INFO | copytool_out | pilot.copytool.rucio | copy_out | stderr = An unknown exception occurred.
>                     Details: Checksum not validated
>
>                     because DPM is unable to calculate the checksum of a
>                     128TB file within the timeout.
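>
>                     For what it's worth, the same ADLER32 query can be
>                     reproduced by hand; a minimal sketch using the gfal2
>                     Python bindings (assuming the gfal2-python package is
>                     installed; the URL is passed on the command line):
>
>                     import sys
>                     import gfal2
>
>                     ctx = gfal2.creat_context()
>                     # the same ADLER32 query gfal-sum makes; the server has to
>                     # read through the whole replica, which is presumably why
>                     # it times out on a 128TB apparent size
>                     print(ctx.checksum(sys.argv[1], "ADLER32"))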
>
>                     Petr
>
>                     On 7/19/19 5:07 AM, Andrew Hanushevsky wrote:
>
>
>
>                         Hi Petr,
>
>                         Indeed, it looks like whatever was used to upload
>                         the files (xrdcp?) created a very large sparse file
>                         by writing at a very large offset. The clue here is
>                         that an anomaly occurred during the write, and these
>                         two messages (likely one for each file) tell us
>                         something...
>
>
>
>
>                             190717 17:10:39 332092 ofs_Posc: Creator changed from atlasprd.2757:90@[2001:718:401:6e03::1:21] to atlasprd.2757:39@[2001:718:401:6e03::1:21] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>
>                             190717 17:29:26 14204 ofs_Posc: Creator changed from atlasprd.2757:33@[2001:718:401:6e03::1:32] to atlasprd.2757:95@[2001:718:401:6e03::1:32] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_5TeV/09/bc/AOD.18595963._000786.pool.root.1.rucio.upload
>
>                         What the POSC system is saying is that in each
>                         case the original
>                         creator of the file lost contact with the
>                         server. It then reconnected
>                         and resumed writing with a successful close.
>                         POSC allows this so that a network glitch does not
>                         end up deleting a file: the glitch causes the client
>                         to reconnect and resume where it left off. This is
>                         only allowed when the resuming client comes
>                         from the same IP address,
>                         has the same process ID, and the same logical
>                         name. In this case it
>                         was "atlasprd.2757". I find that rather odd
>                         because it's next to
>                         impossible for two processes on two different
>                         machines to have the
>                         same process number (i.e. 2757). So, this is
>                         why I am wondering if
>                         this was really an xrdcp or some other program.
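>
>                         For reference, the client identifier in these
>                         messages has the form name.pid:conn@[host]; a tiny
>                         sketch (the regex is mine, matching only the format
>                         shown above) of splitting out the fields that POSC
>                         compares:
>
>                         import re
>
>                         CLIENT = re.compile(r"(.+)\.(\d+):(\d+)@\[([^\]]+)\]")
>                         name, pid, conn, host = CLIENT.match(
>                             "atlasprd.2757:90@[2001:718:401:6e03::1:21]").groups()
>                         # POSC matches on logical name, pid and host; only the
>                         # connection-specific number after ':' differs between
>                         # the original creator and the reconnecting client
>                         print(name, pid, host)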
>
>                         We should look at what these clients were actually
>                         doing during this period, as they clearly did not do
>                         the right thing when they reconnected. So, could you
>                         grep out all interactions of these two clients from
>                         the log? Something like:
>
>                         grep "atlasprd.2757:" <logfile> | egrep '2001:718:401:6e03::1:21|2001:718:401:6e03::1:32' > <excerpt>
>
>                         and send the excerpt to me?
>
>                         Andy
>
>
>                         On Thu, 18 Jul 2019, Petr Vokac wrote:
>
>
>
>
>                             Hi,
>
>                             our DPM, which is using native xrootd 4.9.1,
>                             yesterday created two different 128TB files on
>                             two different disk servers; both were received as
>                             job outputs from two different worker nodes. The
>                             size of each original file was a few megabytes,
>                             and when I check the beginning of the stored
>                             files up to the original size and checksum,
>                             everything seems to be OK, except that these are
>                             sparse files with an apparent size of 128TB. The
>                             original files transferred from the WNs looked
>                             like:
>
>                             mc15_13TeV:log.18636890._001059.job.log.tgz.1 - original size 3101899, adler32 742e1336
>                             mc16_5TeV:AOD.18595963._000786.pool.root.1 - original size 737474514, adler32 dc68cb21
>
>                             but on our disk servers they were stored as:
>
>                             [root@dpmpool22 ~]# ls -l /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>                             -rw-rw---- 1 dpmmgr dpmmgr 140560257079236 Jul 17 17:11 /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>
>                             [root@dpmpool22 ~]# du --block-size=1 /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>                             3108864    /mnt/fs1/atlas/2019-07-17/log.18636890._001059.job.log.tgz.1.rucio.upload.209928.1563376157
>
>
>                             [root@dpmpool24 ~]# ls -l /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>                             -rw-rw---- 1 dpmmgr dpmmgr 139734345056408 Jul 17 17:31 /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>
>                             [root@dpmpool24 tmp]# du --block-size=1 /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>                             737480704    /mnt/fs2/atlas/2019-07-17/AOD.18595963._000786.pool.root.1.rucio.upload.210480.1563377083
>
>
>                             I was told that DPM is not really involved in
>                             transferring / storing files with the xrootd
>                             protocol, so I guess something bad must have
>                             happened inside the XRootD libraries. Do you have
>                             any idea why these two transfers generated such
>                             extreme files? Unfortunately, our xrootd logging
>                             level is pretty low and I was able to find only
>                             the info about file creation:
>
>                             [root@dpmpool22 ~]# grep log.18636890._001059.job.log.tgz /var/log/xrootd/dpmdisk/xrootd.log
>                             190717 17:10:39 332092 ofs_Posc: Creator changed from atlasprd.2757:90@[2001:718:401:6e03::1:21] to atlasprd.2757:39@[2001:718:401:6e03::1:21] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc15_13TeV/26/98/log.18636890._001059.job.log.tgz.1.rucio.upload
>
>
>
>
>                             [root@dpmpool24 tmp]# grep AOD.18595963._000786.pool.root /var/log/xrootd/dpmdisk/xrootd.log
>                             190717 17:29:26 14204 ofs_Posc: Creator changed from atlasprd.2757:33@[2001:718:401:6e03::1:32] to atlasprd.2757:95@[2001:718:401:6e03::1:32] for /dpm/farm.particle.cz/home/atlas/atlasdatadisk/rucio/mc16_5TeV/09/bc/AOD.18595963._000786.pool.root.1.rucio.upload
>
>
>
>
>
>                             Petr
>
>
>
>
>
>
>                      
>
>
>
>              
>
>
>
>          
>
>
>
>
>
>      
>
>
>
>
>
>


########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-L&A=1