Hi Gregory,

----- Original Message ----- 
From: "Gregory J. Sharp" <[log in to unmask]>
To: "Xrootd Mailing List" <[log in to unmask]>
Sent: Thursday, December 16, 2004 10:22 AM
Subject: Xrootd network problems


> I have stared at the code for nearly a day, and I can't figure this one
> out. (Maybe 4 hours sleep last night just wasn't enough?)
You really should be getting more sleep (yes, mom :-)

> My xrootd data director on sol199 produces the following messages for
> every connection. It looks to my naive eye that the connections are not
> being closed cleanly, but perhaps "link read error" is just a poor
> choice of error message. It occurs in two places in the code, so it
> isn't clear which piece of code produces the error.  Anyway, things
> pretty much work okay while this is going on...
Not only poor but wrong. There is a typo in one spot and the "==" should be
"!=". So, most of the time, 'link read error' means that the client closed
the connection and there was nothing to read.
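
To illustrate the distinction (this is not the xrootd code, just a minimal Python sketch with a hypothetical helper named classify_read): a read that returns zero bytes means the peer closed the connection cleanly, while a genuine read error surfaces as an exception. Testing that condition the wrong way round, as with "==" versus "!=", would report a clean close as an error.

```python
import socket


def classify_read(sock, nbytes=4096):
    """Classify one read on a stream socket.

    Returns ("data", bytes) when bytes arrived, ("closed", b"") when the
    peer closed the connection cleanly, and ("error", b"") when the read
    itself failed. Hypothetical helper, for illustration only.
    """
    try:
        chunk = sock.recv(nbytes)
    except OSError:
        # A real failure: reset connection, bad descriptor, and so on.
        return ("error", b"")
    if chunk == b"":
        # Zero bytes on a stream socket means end of stream, not an error.
        return ("closed", b"")
    return ("data", chunk)


if __name__ == "__main__":
    server, client = socket.socketpair()
    client.sendall(b"ping")
    print(classify_read(server))  # data was waiting
    client.close()
    print(classify_read(server))  # peer closed, nothing left to read
    server.close()
```

Only the "error" case would deserve a message like 'link read error'; the "closed" case is a normal disconnect.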

> Then suddenly I get this in the xrootd data server log... lots of
> connections being made but never terminated.
>
> 041216 12:49:14 020 XrootdXeq: User logged in as gregor.31754:17@lnx7108
> 041216 12:51:22 016 XrootdXeq: User logged in as gregor.31754:18@lnx7108
> 041216 12:53:22 018 XrootdXeq: User logged in as gregor.31754:19@lnx7108
> 041216 12:53:51 017 XrootdXeq: User logged in as gregor.31764:20@lnx7108
> 041216 12:55:51 019 XrootdXeq: User logged in as gregor.31764:21@lnx7108
> 041216 12:55:54 021 XrootdXeq: User logged in as gregor.31769:22@lnx7108
> 041216 12:55:56 022 XrootdXeq: User logged in as gregor.31773:23@lnx7108
>
> Meanwhile, the client doing the connecting keeps printing
>
> 041216 12:51:22 001 Xrd: ReadPartialAnswer Error reading msg from
> connmgr (server [sol199.lns.cornell.edu:1094]).
> 041216 12:53:22 001 Xrd: ReadPartialAnswer Error reading msg from
> connmgr (server [sol199.lns.cornell.edu:1094]).
>
> until I kill it.
Does this mean that the client is connecting multiple times and never
closing the connection? That is, the client error messages correlate with
the logins at the server. If so, this is a client error (i.e., it isn't
closing the connection). Fabrizio, could you verify this?
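
One rough way to check the correlation is to line up the timestamps from the two logs quoted above. The sketch below assumes the client and server clocks are roughly in sync and that every log record starts with a "yymmdd hh:mm:ss" stamp; the function names and the one-second window are my own choices, not anything from xrootd.

```python
from datetime import datetime

# Log excerpts copied from the message above.
SERVER_LOG = """\
041216 12:49:14 020 XrootdXeq: User logged in as gregor.31754:17@lnx7108
041216 12:51:22 016 XrootdXeq: User logged in as gregor.31754:18@lnx7108
041216 12:53:22 018 XrootdXeq: User logged in as gregor.31754:19@lnx7108
041216 12:53:51 017 XrootdXeq: User logged in as gregor.31764:20@lnx7108
041216 12:55:51 019 XrootdXeq: User logged in as gregor.31764:21@lnx7108
041216 12:55:54 021 XrootdXeq: User logged in as gregor.31769:22@lnx7108
041216 12:55:56 022 XrootdXeq: User logged in as gregor.31773:23@lnx7108
"""

CLIENT_LOG = """\
041216 12:51:22 001 Xrd: ReadPartialAnswer Error reading msg from
connmgr (server [sol199.lns.cornell.edu:1094]).
041216 12:53:22 001 Xrd: ReadPartialAnswer Error reading msg from
connmgr (server [sol199.lns.cornell.edu:1094]).
"""


def stamps(log, needle):
    """Return the timestamps of all lines in `log` that contain `needle`."""
    found = []
    for line in log.splitlines():
        if needle in line:
            # The first 15 characters hold "yymmdd hh:mm:ss".
            found.append(datetime.strptime(line[:15], "%y%m%d %H:%M:%S"))
    return found


def correlate(server_log, client_log, window_s=1):
    """Pair each client read error with server logins within `window_s` seconds."""
    logins = stamps(server_log, "User logged in")
    errors = stamps(client_log, "ReadPartialAnswer")
    return [
        (err, login)
        for err in errors
        for login in logins
        if abs((login - err).total_seconds()) <= window_s
    ]


if __name__ == "__main__":
    for err, login in correlate(SERVER_LOG, CLIENT_LOG):
        print("client error at", err.time(), "matches login at", login.time())
```

A match for every client error would suggest that each failed read is followed by a fresh login on a new connection.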

The other question is why the client is getting the error in the first
place. What are you doing on the client end? Is there more debugging info
you can include?

> On sol199 (solaris 8) there are 24 open files, but none of them
> particularly enlightening to me.
> c---------   1 root     sys       13,  2 Dec 16 11:07 0
> --w-------   1 gregor   cleo           0 Dec 16 11:32 1
Nor to me, since there are no file names present here.

Andy