Hi,

On 9/3/12 9:00 AM, Brian Bockelman wrote:
> Hi Lukasz,
>
> I believe Matevz was referring to the redirect counter issue.

Yup, that's what I meant ... this is the commit record referenced from the savannah link I sent originally:

http://xrootd.cern.ch/cgi-bin/cgit.cgi/xrootd/commit/?id=4dd4cfa4340cc3ff61684f943392eeeff793c3a5

Matevz

> The redirect counter issue really nailed us over the weekend - the CERN
> redirectors got up to 1 kHz of clients logging in / out. We had to turn off
> cross-region redirection, and I'm not quite sure how we will be able to
> re-enable it.
>
> Brian
>
> On Sep 3, 2012, at 10:55 AM, Lukasz Janyst <[log in to unmask]> wrote:
>
>> I am ready to build what is in stable now. Waiting for a confirmation from Matevz.
>>
>> Lukasz
>>
>> On Monday, September 03, 2012 12:51:30 PM Lukasz Janyst wrote:
>> > Hi Matevz,
>> >
>> > What problem are you referring to? The redirect counter issue or the
>> > thread safety issues? The thread safety is already there in 3.2.2 so if you
>> > still see the problems then we need to investigate...
>> >
>> > Cheers,
>> > Lukasz
>> >
>> > On Friday, August 31, 2012 01:07:56 PM Matevz Tadel wrote:
>> > > Luckily we just noticed this one show up again:
>> > > https://savannah.cern.ch/bugs/?93794
>> > >
>> > > Please include the fix!
>> > >
>> > > Cheers,
>> > > Matevz
>> > >
>> > > On 08/31/12 08:08, Brian Bockelman wrote:
>> > > > Perfectly fine by me.
>> > > >
>> > > > So:
>> > > > 1) Detailed monitoring fix.
>> > > > 2) Sendfile monitoring fix.
>> > > >
>> > > > I can't think of anything else?
>> > > >
>> > > > Brian
>> > > >
>> > > > On Aug 31, 2012, at 9:58 AM, Lukasz Janyst <[log in to unmask]> wrote:
>> > > >> Yes, but I won't manage today. Is Monday fine with you?
>> > > >>
>> > > >> Lukasz
>> > > >>
>> > > >> On Friday, August 31, 2012 07:34:44 AM Brian Bockelman wrote:
>> > > >> > Can we cut a 3.2.3 patch release with these two fixes?
>> > > >> >
>> > > >> > Brian
>> > > >> >
>> > > >> > On Aug 31, 2012, at 12:14 AM, "Yang, Wei" <[log in to unmask]> wrote:
>> > > >> > > I tested the second. I didn't get a chance to test the 1st before I
>> > > >> > > lost the window of restarting the cluster. But I have sendfile() turned
>> > > >> > > off and I do get correct results, so it implicitly confirms the 1st
>> > > >> > > one.
>> > > >> > >
>> > > >> > > regards,
>> > > >> > > Wei Yang | [log in to unmask] | 650-926-3338(O)
>> > > >> > >
>> > > >> > > On Aug 30, 2012, at 9:29 PM, Wilko Kroeger wrote:
>> > > >> > >> Hello Brian
>> > > >> > >>
>> > > >> > >> Yes, we also noticed that the detailed monitoring is not working in
>> > > >> > >> v3.2.2. We built a version on top of v3.2.2 adding the two commits:
>> > > >> > >>
>> > > >> > >> commit e0ad3459c89a163e600070a15936b8fd5d26ff35
>> > > >> > >> Author: Andrew Hanushevsky <[log in to unmask]>
>> > > >> > >> Date: Wed Aug 22 18:56:19 2012 -0700
>> > > >> > >>
>> > > >> > >>     Make sure read statistics are updated for sendfile() and mmap I/O.
>> > > >> > >>
>> > > >> > >> commit e51db4bb0178a21bbe87ccf7c9349b079c2d7455
>> > > >> > >> Author: Andrew Hanushevsky <[log in to unmask]>
>> > > >> > >> Date: Mon Jul 30 16:52:56 2012 -0700
>> > > >> > >>
>> > > >> > >>     Correct monitor initialization test to start monitor under all configs.
>> > > >> > >>
>> > > >> > >> As far as I can tell the detailed monitoring is now working. Wei
>> > > >> > >> might have done more testing.
>> > > >> > >>
>> > > >> > >> Cheers,
>> > > >> > >>
>> > > >> > >> Wilko
>> > > >> > >>
>> > > >> > >> On Thu, 30 Aug 2012, Brian Bockelman wrote:
>> > > >> > >>> Hi Andy,
>> > > >> > >>>
>> > > >> > >>> The core wasn't interesting. However, I tracked it down to this
>> > > >> > >>> change (line 334 in XrdXrootdConfig.cc):
>> > > >> > >>>
>> > > >> > >>> if ((!isRedir || (RQList.Next() != 0 && XrdXrootdMonitor::Redirect())))
>> > > >> > >>>
>> > > >> > >>> became:
>> > > >> > >>>
>> > > >> > >>> if ((!isRedir || (RQList.Next() != 0)) && XrdXrootdMonitor::Redirect())
>> > > >> > >>>
>> > > >> > >>> (in 3.2.2). In master, it is this test:
>> > > >> > >>>
>> > > >> > >>> if (!isRedir || XrdXrootdMonitor::Redirect())
>> > > >> > >>>
>> > > >> > >>> Note that XrdXrootdMonitor::Redirect always returns 0 (I suspect
>> > > >> > >>> the bug is this).
>> > > >> > >>>
>> > > >> > >>> So, basically, I think detailed monitoring is broken in the 3.2.2
>> > > >> > >>> release. Matevz, take note...
>> > > >> > >>>
>> > > >> > >>> What's the minimal patch? I can ask OSG to push this out ASAP.
>> > > >> > >>>
>> > > >> > >>> Brian
>> > > >> > >>>
>> > > >> > >>> On Aug 28, 2012, at 9:26 PM, Andrew Hanushevsky <[log in to unmask]> wrote:
>> > > >> > >>>> Hi Brian,
>> > > >> > >>>>
>> > > >> > >>>> Best to get a gcore on this one. Seems like the monitoring did
>> > > >> > >>>> not initialize correctly as it's trying to send to fd 0.
>> > > >> > >>>>
>> > > >> > >>>> Andy
>> > > >> > >>>>
>> > > >> > >>>> -----Original Message----- From: Brian Bockelman
>> > > >> > >>>> Sent: Tuesday, August 28, 2012 7:15 PM
>> > > >> > >>>> To: <[log in to unmask]>
>> > > >> > >>>> Subject: Strange detailed monitoring issue
>> > > >> > >>>>
>> > > >> > >>>> After a power outage locally, Matevz noticed he is not receiving
>> > > >> > >>>> monitoring messages.
>> > > >> > >>>>
>> > > >> > >>>> Sure enough, from strace:
>> > > >> > >>>>
>> > > >> > >>>> [pid 1705] sendto(0, "t8\5\270\0\0\0\0\340\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\200\0\0\0\302^v#"..., 1464, 0, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = -1 ENOTSOCK (Socket operation on non-socket)
>> > > >> > >>>>
>> > > >> > >>>> Version info:
>> > > >> > >>>>
>> > > >> > >>>> [root@red-gridftp3 ~]# rpm -q xrootd-server
>> > > >> > >>>> xrootd-server-3.2.2-1.osg.el5.xu
>> > > >> > >>>>
>> > > >> > >>>> Log startup is below. Config file snippet is:
>> > > >> > >>>>
>> > > >> > >>>> xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info user xrootd.t2.ucsd.edu:9930
>> > > >> > >>>> xrd.report xrootd.t2.ucsd.edu:9931 every 30s all sync
>> > > >> > >>>>
>> > > >> > >>>> Any ideas? We are at a loss as to what might be happening.
>> > > >> > >>>>
>> > > >> > >>>> Brian
>> > > >> > >>>>
>> > > >> > >>>> 120828 21:07:13 1663 Scalla is starting. . .
>> > > >> > >>>> Copr. 2010 Stanford University, xrd version v3.2.2
>> > > >> > >>>> ++++++ xrootd [log in to unmask] initialization started.
>> > > >> > >>>> Config using configuration file /etc/xrootd/xrootd-clustered.cfg
>> > > >> > >>>> =====> xrd.port 1094
>> > > >> > >>>> =====> xrd.trace conn
>> > > >> > >>>> =====> all.adminpath /var/run/xrootd
>> > > >> > >>>> =====> xrd.report xrootd.t2.ucsd.edu:9931 every 30s all sync
>> > > >> > >>>> Config maximum number of connections restricted to 65536
>> > > >> > >>>> Copr. 2007 Stanford University, xrootd version 2.9.7 build v3.2.2
>> > > >> > >>>> ++++++ xrootd protocol initialization started.
>> > > >> > >>>> =====> all.export / nostage
>> > > >> > >>>> =====> xrootd.trace emsg login stall redirect
>> > > >> > >>>> =====> xrootd.seclib /usr/lib64/libXrdSec.so
>> > > >> > >>>> Config warning: ignoring fslib; libXrdOfs.so is built-in.
>> > > >> > >>>> =====> xrootd.fslib /usr/lib64/libXrdOfs.so
>> > > >> > >>>> =====> all.pidpath /var/run/xrootd
>> > > >> > >>>> =====> xrootd.monitor all auth flush 30s mbuff 1472 window 5s dest files io info user xrootd.t2.ucsd.edu:9930
>> > > >> > >>>> Config exporting /
>> > > >> > >>>> ++++++ Authentication system initialization started.
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: *** ------------------------------------------------------------ ***
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Mode: server
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Debug: -1
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CA dir: /etc/grid-security/certificates
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CA verification level: 1
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL dir: /etc/grid-security/certificates/
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL extension: .r0
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL check level: 1
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: CRL refresh time: 86400
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Certificate: /etc/grid-security/xrd/xrdcert.pem
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Key: /etc/grid-security/xrd/xrdkey.pem
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Proxy delegation option: 0
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap file: /etc/grid-security/grid-mapfile
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap option: 10
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: GRIDmap cache entries expiration (secs): 0
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization function: libXrdLcmaps.so
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization function parms: --osg,--lcmapscfg,/etc/xrootd/lcmaps.cfg,--loglevel,0|useglobals
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Authorization cache entries expiration (secs): -1
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Client proxy availability in XrdSecEntity.endorsement: 0
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: VOMS option: 1
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: MonInfo option: 0
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Crypto modules: ssl
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: Ciphers: aes-128-cbc:bf-cbc:des-ede3-cbc
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: MDigests: sha1:md5
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_InitOpts: *** ------------------------------------------------------------ ***
>> > > >> > >>>> 120828 21:07:13 1663 secgsi_LoadAuthzFun: using 'XrdSecgsiAuthzFun()' from libXrdLcmaps.so
>> > > >> > >>>> =====> sec.protocol /usr/lib64 gsi -certdir:/etc/grid-security/certificates -cert:/etc/grid-security/xrd/xrdcert.pem -key:/etc/grid-security/xrd/xrdkey.pem -crl:1 -authzfun:libXrdLcmaps.so -authzfunparms:--osg,--lcmapscfg,/et
>> > > >> > >>>> Config 1 authentication directives processed in /etc/xrootd/xrootd-clustered.cfg
>> > > >> > >>>> ------ Authentication system initialization completed.
>> > > >> > >>>> ++++++ File system initialization started.
>> > > >> > >>>> =====> all.role server
>> > > >> > >>>> Config warning: ignoring invalid trace option 'none'.
>> > > >> > >>>> =====> ofs.trace none
>> > > >> > >>>> =====> ofs.authorize
>> > > >> > >>>> =====> ofs.osslib /usr/lib64/libXrdHdfs.so
>> > > >> > >>>> ++++++ Authorization system initialization started.
>> > > >> > >>>> 120828 21:07:13 1663 acc_Config: Authorization system using configuration in /etc/xrootd/xrootd-clustered.cfg
>> > > >> > >>>> =====> acc.authdb /etc/xrootd/Authfile
>> > > >> > >>>> =====> acc.audit deny grant
>> > > >> > >>>> Config 2 authorization directives processed in /etc/xrootd/xrootd-clustered.cfg
>> > > >> > >>>> Config 1 auth entries processed in /etc/xrootd/Authfile
>> > > >> > >>>> ------ Authorization system initialization completed.
>> > > >> > >>>> Copr. 2009, Brian Bockelman, Hdfs Version
>> > > >> > >>>> 120828 21:07:13 1663 hdfs_Config: Copr. 2009, Brian Bockelman, Hdfs Version
>> > > >> > >>>> 120828 21:07:13 1663 hdfs_Config: Configuring HDFS.
>> > > >> > >>>> =====> oss.namelib /usr/lib64/libXrdCmsTfc.so file:/etc/xrootd/storage.xml?protocol=hadoop
>> > > >> > >>>> Copr. 2009 University of Nebraska-Lincoln TFC plugin v 1.0
>> > > >> > >>>> Params: file:/etc/xrootd/storage.xml?protocol=hadoop
>> > > >> > >>>> Xerces-c has been initialized.
>> > > >> > >>>> Connecting to the catalog file:/etc/xrootd/storage.xml?protocol=hadoop
>> > > >> > >>>> Using catalog file /etc/xrootd/storage.xml
>> > > >> > >>>> ------ HDFS storage system initialization completed.
>> > > >> > >>>> 120828 21:07:13 1663 hdfs_HDFS storage system initialization.: completed.
>> > > >> > >>>> ++++++ Configuring server role. . .
>> > > >> > >>>> =====> all.manager srm.unl.edu:1213
>> > > >> > >>>> =====> cms.trace all
>> > > >> > >>>> =====> all.adminpath /var/run/xrootd
>> > > >> > >>>> 120828 21:07:13 1663 Configure Global System Identification: anon-s 1213srm.unl.edu
>> > > >> > >>>> Config effective /etc/xrootd/xrootd-clustered.cfg ofs configuration:
>> > > >> > >>>> ofs.role server
>> > > >> > >>>> ofs.authorize
>> > > >> > >>>> ofs.maxdelay 60
>> > > >> > >>>> ofs.osslib /usr/lib64/libXrdHdfs.so
>> > > >> > >>>> ofs.persist manual hold 600 logdir /var/run/xrootd/.ofs/posc.log
>> > > >> > >>>> ofs.trace 0
>> > > >> > >>>>
>> > > >> > >>>> ------ File system server initialization completed.
>> > > >> > >>>> Config warning: 'xrootd.prepare logdir' not specified; prepare tracking disabled.
>> > > >> > >>>> 120828 21:07:13 1675 cms_Finder: Connected to cmsd via /var/run/xrootd/.olb/olbd.admin
>> > > >> > >>>> ------ xrootd protocol initialization completed.
>> > > >> > >>>> ------ xrootd [log in to unmask]:1094 initialization completed.
>> > > >> > >>>> ########################################################################
>> > > >> > >>>> Use REPLY-ALL to reply to list
>> > > >> > >>>>
>> > > >> > >>>> To unsubscribe from the XROOTD-DEV list, click the following link:
>> > > >> > >>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1