Hi Lukasz,

I believe Matevz was referring to the redirect counter issue.

The redirect counter issue really nailed us over the weekend - the CERN redirectors got up to 1 kHz of clients logging in / out. We had to turn off cross-region redirection, and I'm not quite sure how we will be able to re-enable it.

Brian

On Sep 3, 2012, at 10:55 AM, Lukasz Janyst <[log in to unmask]> wrote:

> I am ready to build what is in stable now. Waiting for a confirmation from Matevz.
>
> Lukasz
>
> On Monday, September 03, 2012 12:51:30 PM Lukasz Janyst wrote:
> > Hi Matevz,
> >
> > What problem are you referring to? The redirect counter issue or the
> > thread safety issues? The thread safety is already there in 3.2.2 so if you
> > still see the problems then we need to investigate...
> >
> > Cheers,
> > Lukasz
> >
> > On Friday, August 31, 2012 01:07:56 PM Matevz Tadel wrote:
> > > Luckily we just noticed this one show up again:
> > > https://savannah.cern.ch/bugs/?93794
> > >
> > > Please include the fix!
> > >
> > > Cheers,
> > > Matevz
> > >
> > > On 08/31/12 08:08, Brian Bockelman wrote:
> > > > Perfectly fine by me.
> > > >
> > > > So:
> > > > 1) Detailed monitoring fix.
> > > > 2) Sendfile monitoring fix.
> > > >
> > > > I can't think of anything else?
> > > >
> > > > Brian
> > > >
> > > > On Aug 31, 2012, at 9:58 AM, Lukasz Janyst <[log in to unmask]> wrote:
> > > >> Yes, but I won't manage today. Is Monday fine with you?
> > > >>
> > > >> Lukasz
> > > >>
> > > >> On Friday, August 31, 2012 07:34:44 AM Brian Bockelman wrote:
> > > >> > Can we cut a 3.2.3 patch release with these two fixes?
> > > >> >
> > > >> > Brian
> > > >> >
> > > >> > On Aug 31, 2012, at 12:14 AM, "Yang, Wei" <[log in to unmask]> wrote:
> > > >> > > I tested the second. I didn't get a chance to test the 1st before I lost the window of restarting the cluster.
> > > >> > > But I have sendfile() turned off and I do get correct results, so it implicitly confirms the 1st one.
> > > >> > >
> > > >> > > regards,
> > > >> > > Wei Yang | [log in to unmask] | 650-926-3338(O)
> > > >> > >
> > > >> > > On Aug 30, 2012, at 9:29 PM, Wilko Kroeger wrote:
> > > >> > >> Hello Brian
> > > >> > >>
> > > >> > >> Yes, we also noticed that the detailed monitoring is not working in
> > > >> > >> v3.2.2. We built a version on top of v3.2.2 adding the two commits:
> > > >> > >>
> > > >> > >> commit e0ad3459c89a163e600070a15936b8fd5d26ff35
> > > >> > >> Author: Andrew Hanushevsky <[log in to unmask]>
> > > >> > >> Date: Wed Aug 22 18:56:19 2012 -0700
> > > >> > >>
> > > >> > >>     Make sure read statistics are updated for sendfile() and mmap I/O.
> > > >> > >>
> > > >> > >> commit e51db4bb0178a21bbe87ccf7c9349b079c2d7455
> > > >> > >> Author: Andrew Hanushevsky <[log in to unmask]>
> > > >> > >> Date: Mon Jul 30 16:52:56 2012 -0700
> > > >> > >>
> > > >> > >>     Correct monitor initialization test to start monitor under all configs.
> > > >> > >>
> > > >> > >> As far as I can tell the detailed monitoring is now working. Wei might
> > > >> > >> have done more testing.
> > > >> > >>
> > > >> > >> Cheers,
> > > >> > >>
> > > >> > >> Wilko
> > > >> > >>
> > > >> > >> On Thu, 30 Aug 2012, Brian Bockelman wrote:
> > > >> > >>> Hi Andy,
> > > >> > >>>
> > > >> > >>> The core wasn't interesting. However, I tracked it down to this change
> > > >> > >>> (line 334 in XrdXrootdConfig.cc):
> > > >> > >>>
> > > >> > >>> if ((!isRedir || (RQList.Next() != 0 && XrdXrootdMonitor::Redirect())))
> > > >> > >>>
> > > >> > >>> became:
> > > >> > >>>
> > > >> > >>> if ((!isRedir || (RQList.Next() != 0)) && XrdXrootdMonitor::Redirect())
> > > >> > >>>
> > > >> > >>> (in 3.2.2). In master, it is this test:
> > > >> > >>>
> > > >> > >>> if (!isRedir || XrdXrootdMonitor::Redirect())
> > > >> > >>>
> > > >> > >>> Note that XrdXrootdMonitor::Redirect always returns 0 (I suspect the bug
> > > >> > >>> is this).
> > > >> > >>>
> > > >> > >>> So, basically, I think detailed monitoring is broken in the 3.2.2
> > > >> > >>> release. Matevz, take note...
> > > >> > >>>
> > > >> > >>> What's the minimal patch? I can ask OSG to push this out ASAP.
> > > >> > >>>
> > > >> > >>> Brian
> > > >> > >>>
> > > >> > >>> On Aug 28, 2012, at 9:26 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
> > > >> > >>>> Hi Brian,
> > > >> > >>>>
> > > >> > >>>> Best to get a gcore on this one. Seems like the monitoring did not
> > > >> > >>>> initialize correctly as it's trying to send to fd 0.
> > > >> > >>>>
> > > >> > >>>> Andy
> > > >> > >>>>
> > > >> > >>>> -----Original Message-----
> > > >> > >>>> From: Brian Bockelman
> > > >> > >>>> Sent: Tuesday, August 28, 2012 7:15 PM
> > > >> > >>>> To: <[log in to unmask]>
> > > >> > >>>> Subject: Strange detailed monitoring issue
> > > >> > >>>>
> > > >> > >>>> After a power outage locally, Matevz noticed he is not receiving
> > > >> > >>>> monitoring messages.
> > > >> > >>>>
> > > >> > >>>> Sure enough, from strace:
> > > >> > >>>>
> > > >> > >>>> [pid 1705] sendto(0, "t8\5\270\0\0\0\0\340\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\200\0\0\0\302^v#"..., 1464, 0, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = -1 ENOTSOCK (Socket operation on non-socket)
> > > >> > >>>>
> > > >> > >>>> Version info:
> > > >> > >>>>
> > > >> > >>>> [root@red-gridftp3 ~]# rpm -q xrootd-server
> > > >> > >>>> xrootd-server-3.2.2-1.osg.el5.xu
> > > >> > >>>>
> > > >> > >>>> Log startup is below. Config file snippet is:
> > > >> > >>>>
> > > >> > >>>> xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info user xrootd.t2.ucsd.edu:9930
> > > >> > >>>> xrd.report xrootd.t2.ucsd.edu:9931 every 30s all sync
> > > >> > >>>>
> > > >> > >>>> Any ideas? We are at a loss as to what might be happening.
> > > >> > >>>>
> > > >> > >>>> Brian
> > > >> > >>>>
> > > >> > >>>> 120828 21:07:13 1663 Scalla is starting. . .
> > > >> > >>>> Copr. 2010 Stanford University, xrd version v3.2.2
> > > >> > >>>> ++++++ xrootd [log in to unmask] initialization started.
> > > >> > >>>> Config using configuration file /etc/xrootd/xrootd-clustered.cfg
> > > >> > >>>> =====> xrd.port 1094
> > > >> > >>>> =====> xrd.trace conn
> > > >> > >>>> =====> all.adminpath /var/run/xrootd
> > > >> > >>>> =====> xrd.report xrootd.t2.ucsd.edu:9931 every 30s all sync
> > > >> > >>>> Config maximum number of connections restricted to 65536
> > > >> > >>>> Copr. 2007 Stanford University, xrootd version 2.9.7 build v3.2.2
> > > >> > >>>> ++++++ xrootd protocol initialization started.
> > > >> > >>>> =====> all.export / nostage
> > > >> > >>>> =====> xrootd.trace emsg login stall redirect
> > > >> > >>>> =====> xrootd.seclib /usr/lib64/libXrdSec.so
> > > >> > >>>> Config warning: ignoring fslib; libXrdOfs.so is built-in.
> > > >> > >>>> =====> xrootd.fslib /usr/lib64/libXrdOfs.so
> > > >> > >>>> =====> all.pidpath /var/run/xrootd
> > > >> > >>>> =====> xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info user xrootd.t2.ucsd.edu:9930
> > > >> > >>>> Config exporting /
> > > >> > >>>> ++++++ Authentication system initialization started.
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: *** ------------------------------------------------------------ ***
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Mode: server
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Debug: -1
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CA dir: /etc/grid-security/certificates
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CA verification level: 1
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL dir: /etc/grid-security/certificates/
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL extension: .r0
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL check level: 1
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL refresh time: 86400
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Certificate: /etc/grid-security/xrd/xrdcert.pem
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Key: /etc/grid-security/xrd/xrdkey.pem
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Proxy delegation option: 0
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap file: /etc/grid-security/grid-mapfile
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap option: 10
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap cache entries expiration (secs): 0
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization function: libXrdLcmaps.so
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization function parms: --osg,--lcmapscfg,/etc/xrootd/lcmaps.cfg,--loglevel,0|useglobals
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization cache entries expiration (secs): -1
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Client proxy availability in XrdSecEntity.endorsement: 0
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: VOMS option: 1
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: MonInfo option: 0
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Crypto modules: ssl
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Ciphers: aes-128-cbc:bf-cbc:des-ede3-cbc
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: MDigests: sha1:md5
> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: *** ------------------------------------------------------------ ***
> > > >> > >>>> 120828 21:07:13 1663 secgsi_LoadAuthzFun: using 'XrdSecgsiAuthzFun()' from libXrdLcmaps.so
> > > >> > >>>> =====> sec.protocol /usr/lib64 gsi -certdir:/etc/grid-security/certificates -cert:/etc/grid-security/xrd/xrdcert.pem -key:/etc/grid-security/xrd/xrdkey.pem -crl:1 -authzfun:libXrdLcmaps.so -authzfunparms:--osg,--lcmapscfg,/et
> > > >> > >>>> Config 1 authentication directives processed in /etc/xrootd/xrootd-clustered.cfg
> > > >> > >>>> ------ Authentication system initialization completed.
> > > >> > >>>> ++++++ File system initialization started.
> > > >> > >>>> =====> all.role server
> > > >> > >>>> Config warning: ignoring invalid trace option 'none'.
> > > >> > >>>> =====> ofs.trace none
> > > >> > >>>> =====> ofs.authorize
> > > >> > >>>> =====> ofs.osslib /usr/lib64/libXrdHdfs.so
> > > >> > >>>> ++++++ Authorization system initialization started.
> > > >> > >>>> 120828 21:07:13 1663 acc_Config: Authorization system using configuration in /etc/xrootd/xrootd-clustered.cfg
> > > >> > >>>> =====> acc.authdb /etc/xrootd/Authfile
> > > >> > >>>> =====> acc.audit deny grant
> > > >> > >>>> Config 2 authorization directives processed in /etc/xrootd/xrootd-clustered.cfg
> > > >> > >>>> Config 1 auth entries processed in /etc/xrootd/Authfile
> > > >> > >>>> ------ Authorization system initialization completed.
> > > >> > >>>> Copr. 2009, Brian Bockelman, Hdfs Version
> > > >> > >>>> 120828 21:07:13 1663 hdfs_Config: Copr. 2009, Brian Bockelman, Hdfs Version
> > > >> > >>>> 120828 21:07:13 1663 hdfs_Config: Configuring HDFS.
> > > >> > >>>> =====> oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=hadoop
> > > >> > >>>> Copr. 2009 University of Nebraska-Lincoln TFC plugin v 1.0
> > > >> > >>>> Params: file:/etc/xrootd/storage.xml?protocol=hadoop
> > > >> > >>>> Xerces-c has been initialized.
> > > >> > >>>> Connecting to the catalog file:/etc/xrootd/storage.xml?protocol=hadoop
> > > >> > >>>> Using catalog file /etc/xrootd/storage.xml
> > > >> > >>>> ------ HDFS storage system initialization completed.
> > > >> > >>>> 120828 21:07:13 1663 hdfs_HDFS storage system initialization.: completed.
> > > >> > >>>> ++++++ Configuring server role. . .
> > > >> > >>>> =====> all.manager srm.unl.edu:1213
> > > >> > >>>> =====> cms.trace all
> > > >> > >>>> =====> all.adminpath /var/run/xrootd
> > > >> > >>>> 120828 21:07:13 1663 Configure Global System Identification: anon-s 1213srm.unl.edu
> > > >> > >>>> Config effective /etc/xrootd/xrootd-clustered.cfg ofs configuration:
> > > >> > >>>> ofs.role server
> > > >> > >>>> ofs.authorize
> > > >> > >>>> ofs.maxdelay 60
> > > >> > >>>> ofs.osslib /usr/lib64/libXrdHdfs.so
> > > >> > >>>> ofs.persist manual hold 600 logdir /var/run/xrootd/.ofs/posc.log
> > > >> > >>>> ofs.trace 0
> > > >> > >>>>
> > > >> > >>>> ------ File system server initialization completed.
> > > >> > >>>> Config warning: 'xrootd.prepare logdir' not specified; prepare tracking disabled.
> > > >> > >>>> 120828 21:07:13 1675 cms_Finder: Connected to cmsd via /var/run/xrootd/.olb/olbd.admin
> > > >> > >>>> ------ xrootd protocol initialization completed.
> > > >> > >>>> ------ xrootd [log in to unmask]:1094 initialization completed.
########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1