URL:
  <http://savannah.cern.ch/bugs/?99674>

                 Summary: odd transient behaviour on Google Compute Engine (GCE) storage cluster
                 Project: XROOTD
            Submitted by: bdouglas
            Submitted on: 2013-01-07 12:46
             Report Type: Bug
                Priority: 5 - Normal
                Severity: 3 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
      Fixed by commit(s): 

    _______________________________________________________

Details:

Hi,

I have set up an xrootd storage cluster in the Google cloud (GCE) and have seen
this odd behaviour.

I have successfully copied files into the storage, but when I go to locate the
files, sometimes I see them and sometimes I do not.

For example:

root://headnode.c.atlasgce.internal:1094//> dirlist /atlas/local/benjamin/
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-15
-rw-(048)    104857600 2012-12-22 23:19:36
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:23
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-05
-rw-(048)    104857600 2012-12-22 23:19:43
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-20
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-19
-rw-(048)    104857600 2012-12-22 23:19:42
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:40
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-18
-rw-(048)    104857600 2012-12-22 23:19:39
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-17
-rw-(048)    104857600 2012-12-22 23:19:37
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-16
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-14
-rw-(048)    104857600 2012-12-22 23:19:35
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-13
-rw-(048)    104857600 2012-12-22 23:19:33
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:32
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:35:09
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-12
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-11
drwx(051)         4096 2013-01-07 11:34:25
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
-rw-(048)    104857600 2012-12-22 23:19:31
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:33:34
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
-rw-(048)    104857600 2012-12-22 23:19:29
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-10
drwx(051)         4096 2013-01-07 11:33:12
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
-rw-(048)    104857600 2012-12-22 23:19:28
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-09
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-04
-rw-(048)    104857600 2012-12-22 23:19:21
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:32:21
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2013-01-07 11:31:35
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-07
-rw-(048)    104857600 2012-12-22 23:19:25
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-03
-rw-(048)    104857600 2012-12-22 23:19:20
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:31:25
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-08
-rw-(048)    104857600 2012-12-22 23:19:27
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:30:39
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2013-01-07 11:29:53
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
-rw-(048)    104857600 2012-12-22 23:19:24
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-06
drwx(051)         4096 2013-01-07 11:29:06
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
-rw-(048)    104857600 2012-12-22 23:19:19
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-02
-rw-(048)    104857600 2012-12-22 23:19:17
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:27:23
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2012-12-22 23:11:53 /atlas/local/benjamin/dpb-test-01
-rw-(048)    104857600 2012-12-22 23:19:16
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 05:41:03
/atlas/local/benjamin/ddo.000001.Atlas.Ideal.DBRelease.v210501
drwx(051)         4096 2012-12-22 23:12:12
/atlas/local/benjamin/dpb-test-00-clone
drwx(051)         4096 2012-12-22 23:12:01 /atlas/local/benjamin/dpb-test-00
-rw-(048)    104857600 2012-12-22 23:19:14
/atlas/local/benjamin/testfile_100MB
-rw-(048)           10 2013-01-07 05:13:32
/atlas/local/benjamin/dpb-apf-00.testfile

Yet a short time later (after running the help command in xrd):

root://headnode.c.atlasgce.internal:1094//> dirlistrec /atlas/local/benjamin/
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-15
-rw-(048)    104857600 2012-12-22 23:19:36
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:23
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-05
-rw-(048)    104857600 2012-12-22 23:19:43
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-20
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-19
-rw-(048)    104857600 2012-12-22 23:19:42
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:40
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-18
-rw-(048)    104857600 2012-12-22 23:19:39
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-17
-rw-(048)    104857600 2012-12-22 23:19:37
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-16
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-14
-rw-(048)    104857600 2012-12-22 23:19:35
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-13
-rw-(048)    104857600 2012-12-22 23:19:33
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:32
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 11:35:09
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-12
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-11
-rw-(048)    104857600 2012-12-22 23:19:31
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:29
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-10
-rw-(048)    104857600 2012-12-22 23:19:28
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-09
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-04
-rw-(048)    104857600 2012-12-22 23:19:21
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-07
-rw-(048)    104857600 2012-12-22 23:19:25
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-03
-rw-(048)    104857600 2012-12-22 23:19:20
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-08
-rw-(048)    104857600 2012-12-22 23:19:27
/atlas/local/benjamin/testfile_100MB
-rw-(048)    104857600 2012-12-22 23:19:24
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-06
-rw-(048)    104857600 2012-12-22 23:19:19
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:13:32 /atlas/local/benjamin/dpb-test-02
-rw-(048)    104857600 2012-12-22 23:19:17
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2012-12-22 23:11:53 /atlas/local/benjamin/dpb-test-01
-rw-(048)    104857600 2012-12-22 23:19:16
/atlas/local/benjamin/testfile_100MB
drwx(051)         4096 2013-01-07 05:41:03
/atlas/local/benjamin/ddo.000001.Atlas.Ideal.DBRelease.v210501
drwx(051)         4096 2012-12-22 23:12:12
/atlas/local/benjamin/dpb-test-00-clone
drwx(051)         4096 2012-12-22 23:12:01 /atlas/local/benjamin/dpb-test-00
-rw-(048)    104857600 2012-12-22 23:19:14
/atlas/local/benjamin/testfile_100MB
-rw-(048)           10 2013-01-07 05:13:32
/atlas/local/benjamin/dpb-apf-00.testfile
Error 3011: Unable to open directory /atlas/local/benjamin/dpb-test-00-clone;
No such file or directory

In server headnode.c.atlasgce.internal:1094 or in some of its child nodes.
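
To see exactly which entries changed between two listings like the ones above, the pasted output can be compared mechanically. The following is a minimal sketch, not part of xrootd: the names `parse_listing` and `missing_entries` are hypothetical helpers, and the parser assumes the line layout shown in this report (flags, size, date, time, then the path either on the same line or wrapped onto the next one).

```python
import re

# Start of one listing entry: flags, size, date, time, and an optional path.
# The layout is assumed from the output pasted in this report.
ENTRY = re.compile(
    r"^(?P<flags>[d-][rwx-]{3})\(\d+\)\s+(?P<size>\d+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s*(?P<path>\S*)$"
)


def parse_listing(text):
    """Return a set of (flags, size, timestamp, path) tuples from pasted output."""
    entries = set()
    pending = None  # entry whose path was wrapped onto the following line
    for raw in text.splitlines():
        line = raw.strip()
        m = ENTRY.match(line)
        if m:
            stamp = m["date"] + " " + m["time"]
            if m["path"]:
                entries.add((m["flags"], int(m["size"]), stamp, m["path"]))
                pending = None
            else:
                pending = (m["flags"], int(m["size"]), stamp)
        elif pending and line.startswith("/"):
            entries.add(pending + (line,))
            pending = None
        else:
            pending = None  # prompts, error text, blank lines
    return entries


def missing_entries(before, after):
    """Entries present in the first listing but absent from the second."""
    return sorted(parse_listing(before) - parse_listing(after), key=lambda e: e[2])
```

Applied to the two listings in this report, a comparison of this kind would be expected to flag most of the data12_8TeV... directory entries, which appear in the first listing but not in the second.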

root://headnode.c.atlasgce.internal:1094//> locateall /atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m120/
No matching files were found.

root://headnode.c.atlasgce.internal:1094//> exit
Goodbye.

[benjamin@dpb-apf-00 d3pd_testjob]$ xrd headnode.c.atlasgce.internal locateall /atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m120/
No matching files were found.

Now the files are not found?

A short time later:
[benjamin@dpb-apf-00 d3pd_testjob]$ xrd headnode.c.atlasgce.internal
(C) 2004-2010 by the Xrootd group. Xrootd version: v3.2.7
Welcome to the xrootd command line interface.
Type 'help' for a list of available commands.
root://headnode.c.atlasgce.internal:1094//> dirlist /atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/
Error 3011: Unable to open directory
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/;
No such file or directory

In server headnode.c.atlasgce.internal:1094 or in some of its child nodes.
-rw-(048)    801094670 2013-01-07 11:35:18
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0095._0001.1
-rw-(048)   3559362431 2013-01-07 11:35:04
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0094._0001.1
-rw-(048)   3674852869 2013-01-07 11:34:20
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0093._0001.1
-rw-(048)   1708802906 2013-01-07 11:33:29
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0092._0001.1
-rw-(048)   3720590161 2013-01-07 11:33:07
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0091._0001.1
-rw-(048)   3597669746 2013-01-07 11:32:16
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0090._0001.1
-rw-(048)    559179199 2013-01-07 11:31:30
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0089._0001.1
-rw-(048)   3755235005 2013-01-07 11:31:20
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0088._0001.1
-rw-(048)   3748590483 2013-01-07 11:30:34
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0087._0001.1
-rw-(048)   3814234452 2013-01-07 11:29:48
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0086._0001.1
-rw-(048)   3865215588 2013-01-07 11:28:08
/atlas/local/benjamin/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208/data12_8TeV.00208484.physics_Egamma.merge.AOD.f472_m1208._lb0085._0001.1

root://headnode.c.atlasgce.internal:1094//> 

The files are found.  There were no changes to the system.

Are there timeouts that I can set to make the system a bit more robust
against these transient issues?
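
Separately from any server-side timeout settings, a client can retry a lookup that transiently fails. The sketch below is one possible workaround and not an xrootd feature: `locate_with_retry` and its parameters are hypothetical, the command shape `xrd <host> locateall <path>` and the "No matching files were found." text are copied from the session above, and treating any output containing "Error" as a failure is an assumption.

```python
import subprocess
import time

NOT_FOUND = "No matching files were found."  # message text copied from the report


def xrd_locateall(host, path):
    """Run `xrd <host> locateall <path>` as in the report and return its output."""
    result = subprocess.run(
        ["xrd", host, "locateall", path],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr


def locate_with_retry(path, host="headnode.c.atlasgce.internal",
                      attempts=5, delay=2.0, run=xrd_locateall, sleep=time.sleep):
    """Retry a locate until it stops reporting a failure (hypothetical helper).

    Returns the output of the first attempt that looks successful,
    or None if every attempt fails.
    """
    for attempt in range(attempts):
        output = run(host, path)
        # Assumption: success means non-empty output without either failure text.
        if output.strip() and NOT_FOUND not in output and "Error" not in output:
            return output
        if attempt < attempts - 1:
            sleep(delay * 2 ** attempt)  # exponential backoff: delay, 2*delay, ...
    return None
```

The `run` and `sleep` parameters exist so the retry logic can be exercised without a live cluster; in normal use they are left at their defaults.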

Thanks,

Doug Benjamin
