Print

Print


On Mar 3, 2011, at 8:13 PM, Andrew Hanushevsky wrote:

> Hi Brian,
> 
> OK, seems to me that it's have responses that are bunching up. Did you turn on the rate limiting? If so, what is the "cms.dfs" directive look like?
> 

No, I didn't turn on rate limiting - but I felt it was relevant in case if there's new code path.  cms.dfs is unset.

Don't rule out the possibility the underlying file system synchronized everything: I'd really like to be able to handle the case where it does though.  We can expect all CMS jobs to behave like this:

[root@xrootd-itb ~]# grep cms133.31967:274 /var/log/xrootd/cmsd.log
110303 19:38:21 18158 cms133.31967:[log in to unmask] do_PrepAdd: parms: * * 0 rq /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/8C8B1651-0FEA-DF11-B2B4-485B39800BA2.root
110303 19:38:21 18158 cms133.31967:[log in to unmask] do_PrepAdd: parms: * * 0 rq /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/ACF1159C-02EA-DF11-979F-E0CB4EA0A92F.root
110303 19:38:21 18158 cms133.31967:[log in to unmask] do_PrepAdd: parms: * * 0 rq /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/30129E99-0FEA-DF11-8603-0030487C6A32.root
110303 19:38:21 18158 cms133.31967:[log in to unmask] do_PrepAdd: parms: * * 0 rq /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/DC3BEE80-FDE9-DF11-B011-E0CB4E19F9A0.root
110303 19:38:21 18158 cms133.31967:[log in to unmask] do_PrepAdd: parms: * * 0 rq /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/468C8C9A-17EA-DF11-9276-E0CB4E1A1163.root
110303 19:38:22 18158 cms133.31967:[log in to unmask] do_Select:  /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/8C8B1651-0FEA-DF11-B2B4-485B39800BA2.root
110303 19:38:27 18158 cms133.31967:[log in to unmask] do_Select:  /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/8C8B1651-0FEA-DF11-B2B4-485B39800BA2.root
110303 19:38:27 18158 cms133.31967:[log in to unmask] do_Select: failed; No servers are available to read the file. /store/test/xrootd/T2_US_Nebraska/store/data/Run2010B/Mu/AOD/Nov4ReReco_v1/0000/8C8B1651-0FEA-DF11-B2B4-485B39800BA2.root

I.e., rapid-fire preps followed by an immediate select.  The "best" behavior would be if the client could be told to check in every 5s, but the central server would keep trying for up to 120s if it has outstanding queries.

Brian

> Andy
> 
> P.S I will soon remove the offending code that you pointed out.
> 
> -----Original Message----- From: Brian Bockelman
> Sent: Thursday, March 03, 2011 6:02 PM
> To: Andrew Hanushevsky
> Cc: xrootd-dev
> Subject: Re: Delay for SelPrep
> 
> 
> On Mar 3, 2011, at 7:54 PM, Andrew Hanushevsky wrote:
> 
>> Hi Brian,
>> 
>> Could you produce similar stats for how many state requests each server got?
> 
> One server is below (*).  However, I don't know if the log message occurs when it gets the query, or when it responds to the query.
> 
>> All that should be in the log. As for the 10 second variations that could happen if the clients are being told to hold for one or two periods or the not-available time of 10 seconds. That should be in the log as well.
>> 
> 
> I also included below when the selects were performed.  I don't know the best logging line to look for, so I included two different views.
> 
> I should mention: these servers are running the latest git sources; they ought to include the late limiting code.  The manager is running a release prior to 3.0.0; just upgraded that to the latest pre-release, which is how I found the bug in the other email.
> 
> Brian
> 
> (*)
> [root@red-gridftp10 ~]# grep '18:3' /var/log/xrootd/cmsd.log | grep do_State | awk '{print $2;}' | uniq -c
>     1 18:30:34
>     1 18:30:52
>     8 18:31:11
>    43 18:31:12
>    56 18:31:13
>    43 18:31:14
>    48 18:31:15
>    48 18:31:16
>    11 18:31:17
>     1 18:31:22
>     1 18:31:32
>   120 18:31:52
>    45 18:31:53
>     9 18:31:54
>     9 18:31:55
>    10 18:31:56
>     5 18:31:58
>    10 18:32:01
>    10 18:32:02
>     1 18:32:08
>    15 18:32:12
>     2 18:32:13
>     2 18:32:22
>     2 18:32:23
>   101 18:32:32
>    51 18:32:33
>     9 18:32:34
>     2 18:32:35
>     8 18:32:37
>     7 18:32:41
>     7 18:32:42
>     2 18:32:43
>     1 18:32:46
>    13 18:32:52
>     7 18:32:53
>     1 18:32:54
>     1 18:32:55
>     2 18:32:58
>     7 18:33:02
>     3 18:33:03
>     1 18:33:04
>    13 18:33:06
>     1 18:33:08
>    13 18:33:10
>    69 18:33:12
>    50 18:33:13
>     4 18:33:14
>     1 18:33:24
>     2 18:33:29
>     2 18:33:30
>     3 18:33:32
>     5 18:33:33
>     5 18:33:34
>     8 18:33:35
>     6 18:33:36
>     2 18:33:37
>     2 18:33:38
>     2 18:33:40
>     1 18:33:42
>     1 18:33:43
>     1 18:33:48
>     1 18:34:07
>     1 18:35:34
>     1 18:35:52
>     1 18:35:56
>     1 18:36:00
>     1 18:36:13
>     1 18:36:21
>     1 18:36:28
>     1 18:36:41
>     1 18:38:09
>     1 18:38:34
> 
> (**)
> [root@xrootd-itb ~]# grep 'Select seeking' /var/log/xrootd/cmsd.log | grep '18:3' | awk '{print $2;}' | uniq -c
>     1 12:18:33
>     8 18:31:11
>    43 18:31:12
>    55 18:31:13
>    43 18:31:14
>    48 18:31:15
>    48 18:31:16
>    31 18:31:17
>    12 18:31:18
>    21 18:31:19
>    16 18:31:20
>    45 18:31:21
>    44 18:31:22
>    34 18:31:23
>    50 18:31:24
>    50 18:31:25
>    45 18:31:26
>    52 18:31:27
>    50 18:31:28
>    72 18:31:29
>    52 18:31:30
>    47 18:31:31
>     1 18:31:32
>     1 18:31:41
>     1 18:32:02
>     1 18:33:10
>     1 18:33:11
>     1 18:33:24
>     2 18:33:29
>     2 18:33:30
>     3 18:33:32
>     5 18:33:33
>     4 18:33:34
>     8 18:33:35
>     6 18:33:36
>     2 18:33:37
>     2 18:33:38
>     2 18:33:40
>     1 18:33:42
>     1 18:33:43
>     1 18:33:48
>     1 18:34:07
>     1 18:35:56
>     1 18:36:00
>     1 18:36:21
>     1 18:36:28
>     1 18:36:41
>     1 18:38:09
>     1 19:18:39
> [root@xrootd-itb ~]# grep 'do_Select' /var/log/xrootd/cmsd.log | grep '18:3' | awk '{print $2;}' | uniq -c
>     1 12:18:33
>     1 18:31:11
>    15 18:31:12
>    20 18:31:13
>    14 18:31:14
>    14 18:31:15
>    14 18:31:16
>    19 18:31:17
>     6 18:31:18
>    10 18:31:19
>     8 18:31:20
>    11 18:31:21
>    18 18:31:22
>    14 18:31:23
>    15 18:31:24
>    26 18:31:25
>    11 18:31:26
>    22 18:31:27
>    23 18:31:28
>    24 18:31:29
>    24 18:31:30
>    25 18:31:31
>     3 18:31:32
>     1 18:31:41
>     1 18:31:52
>     1 18:32:02
>     2 18:32:11
>     2 18:32:13
>    28 18:32:14
>    38 18:32:15
>    24 18:32:16
>    30 18:32:17
>    24 18:32:18
>    20 18:32:19
>    16 18:32:20
>    12 18:32:21
>    32 18:32:22
>    28 18:32:23
>    36 18:32:24
>    30 18:32:25
>    42 18:32:26
>    20 18:32:27
>    42 18:32:28
>    46 18:32:29
>    46 18:32:30
>    46 18:32:31
>    28 18:32:32
>     2 18:32:45
>     2 18:33:00
>     1 18:33:10
>     1 18:33:11
>     2 18:33:16
>     1 18:33:24
>     1 18:33:29
>     1 18:33:31
>     1 18:33:32
>     5 18:33:33
>     2 18:33:34
>     3 18:33:35
>     8 18:33:36
>     1 18:33:37
>     3 18:33:38
>     1 18:33:40
>     2 18:33:43
>     1 18:33:48
>     2 18:33:58
>     1 18:34:07
>     2 18:34:10
>     2 18:34:24
>     2 18:34:30
>     2 18:34:31
>     2 18:34:32
>     2 18:34:34
>     4 18:34:35
>     6 18:34:36
>     6 18:34:37
>     2 18:34:38
>     4 18:34:39
>     2 18:34:41
>     2 18:34:48
>     2 18:35:06
>     2 18:35:42
>     2 18:35:43
>     1 18:35:56
>     1 18:36:00
>     1 18:36:21
>     1 18:36:28
>     3 18:36:41
>     2 18:37:02
>     2 18:37:46
>     2 18:38:00
>     1 18:38:09
>     2 18:38:11
>     2 18:38:12
>     2 18:38:35
>     2 19:18:30
>     1 19:18:39
>     2 19:43:11