Hi Horst,
Indeed, one should never see a core dump and if one does appear we
definitely want to know about it. When you do see one, here is the first
dump of information that would be helpful before we start digging deeper:
gdb <executable> <corefile>
where
quit
Cut and paste the output into a mail file or posting. We may ask for a
detailed traceback of every thread. I think I'll put that process on the
xroot web page. For now, at least we will know where it went bonkers.
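For anyone who wants to script it, the same steps can be run non-interactively in one command; the binary and core file paths below are placeholders, so substitute the actual xrootd executable and the core file from /var:

```shell
# Non-interactive sketch of the steps above (placeholder paths):
# --batch exits gdb after running the -ex commands, so the whole
# backtrace lands in gdb-where.txt for cut-and-paste into a mail.
gdb --batch -ex where /path/to/xrootd /path/to/corefile > gdb-where.txt 2>&1
```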
Andy
On Sat, 21 Mar 2020, Horst Severini wrote:
> Hi Wei,
>
> yes, we have enough space for a few core dumps in /var/. It's just that there
> were 5 or 6 in the last week or two, and that filled /var/ up completely.
> I'll keep a closer eye on it for now.
>
> Thanks,
>
> Horst
>
> On 3/21/20 4:39 PM, Yang, Wei wrote:
>> Also make sure you have enough space to hold a core dump. It can sometimes
>> be 10GB+.
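On a Linux host, both concerns (where cores land and how much room is left there) can be checked quickly; this is a generic sketch, not specific to any one site's setup:

```shell
# Where does the kernel write core files? The pattern may be a path
# template or a pipe to a handler such as systemd-coredump/abrt.
cat /proc/sys/kernel/core_pattern

# How much space is free on the partition holding the cores (here /var)?
df -h /var
```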
>>
>> --
>> Wei Yang [log in to unmask] | 650-926-3338(O)
>>
>> -----Original Message-----
>> From: <[log in to unmask]> on behalf of Horst
>> Severini <[log in to unmask]>
>> Date: Saturday, March 21, 2020 at 1:30 PM
>> To: xrootd-dev <[log in to unmask]>, "[log in to unmask]"
>> <[log in to unmask]>
>> Subject: Re: XrootD smoke test report for 2020-03-21 10:01:46 GMT
>>
>> Thanks Wei,
>> I'll send you the next one I get! :)
>> Cheers,
>> Horst
>> On 3/21/20 2:55 PM, Yang, Wei wrote:
>> > Indeed a core dump is usually the thing we need.
>> >
>> > --
>> > Wei Yang [log in to unmask] | 650-926-3338(O)
>> >
>> > On 3/21/20, 12:19 PM, "[log in to unmask] on behalf of
>> > Horst Severini" <[log in to unmask] on behalf of [log in to unmask]>
>> > wrote:
>> >
>> > We're running 4.11.1 here at OU.
>> >
>> > Cheers,
>> >
>> > Horst
>> >
>> > Albert Rossi <[log in to unmask]> wrote:
>> >
>> > > Hi Horst,
>> > >
>> > > actually, if you notice, all endpoints failed on today's
>> test. So it was not just OU.
>> > >
>> > > The Stanford developers may want you to run a few commands
>> over the core file from gdb once you have it in hand.
>> > >
>> > > What version of xrootd are you running, just out of
>> curiosity? Is it bleeding-edge, or a stable release?
>> > >
>> > > Cheers, Al
>> > >
>> > > ________________________________________________
>> > > Albert L. Rossi
>> > > Application Developer & Systems Analyst III
>> > > Scientific Computing Division, Data Movement Development
>> > > FCC 229A
>> > > Mail Station 369 (FCC 2W)
>> > > Fermi National Accelerator Laboratory
>> > > Batavia, IL 60510
>> > > (630) 840-3023
>> > > ________________________________
>> > > From: Horst Severini <[log in to unmask]>
>> > > Sent: Saturday, March 21, 2020 1:18 PM
>> > > To: [log in to unmask]
>> > > <[log in to unmask]>; [log in to unmask] <[log in to unmask]>;
>> > > Albert Rossi <[log in to unmask]>
>> > > Subject: Re: XrootD smoke test report for 2020-03-21 10:01:46
>> GMT
>> > >
>> > > Hi Al,
>> > >
>> > > thanks, good idea. I'll save the next core file.
>> > >
>> > > I'm pretty sure the authentication failures simply came
>> because
>> > > that partition was full and no new proxies or what not could
>> be
>> > > created, so I wouldn't worry about that.
>> > >
>> > > Thanks,
>> > >
>> > > Horst
>> > >
>> > > Albert Rossi <[log in to unmask]> wrote:
>> > >
>> > > > Hi Horst,
>> > > >
>> > > > I would definitely report it to
>> > > > [log in to unmask]
>> > > >
>> > > > As for why the massive authentication failure, I've seen
>> this before, it might have to do with CA cert issues.
>> > > >
>> > > > Cheers, Al
>> > > >
>> > > > ________________________________________________
>> > > > Albert L. Rossi
>> > > > Application Developer & Systems Analyst III
>> > > > Scientific Computing Division, Data Movement Development
>> > > > FCC 229A
>> > > > Mail Station 369 (FCC 2W)
>> > > > Fermi National Accelerator Laboratory
>> > > > Batavia, IL 60510
>> > > > (630) 840-3023
>> > > > ________________________________
>> > > > From: Horst Severini <[log in to unmask]>
>> > > > Sent: Saturday, March 21, 2020 11:19 AM
>> > > > To: [log in to unmask] <[log in to unmask]>
>> > > > Subject: Re: XrootD smoke test report for 2020-03-21
>> 10:01:46 GMT
>> > > >
>> > > > Hi all,
>> > > >
>> > > > our /var/ partition had filled up because of too many
>> > > > xrootd core dumps.
>> > > > I cleared those up and restarted xrootd, and things look
>> better again now.
>> > > >
>> > > > Not sure why we keep getting core dumps, though.
>> > > >
>> > > > Cheers,
>> > > >
>> > > > Horst
>> > > >
>> > > > On 3/21/20 5:01 AM, [log in to unmask] wrote:
>> > > > > XROOTD SMOKE TEST SUMMARY
>> > > > > 2020-03-21 10:01:46 GMT
>> > > > >
>> > > > > Client: bogus6.fnal.gov
>> > > > >
>> > > > > XrootD version: v4.11.2
>> > > > >
>> > > > > Reference server: CERN-TRUNK
>> > > > >
>> > > > > Credential delegation: ON
>> > > > >
>> > > > > Checksum: -C adler32
>> > > > >
>> > > > > Total number of round-trip tests: 21
>> > > > >
>> > > > > --------------------------------SOUND ENDPOINTS---------------------------------
>> > > > >
>> > > > > SCORE  ENDPT        TYPE     UP  SRC  DST  DN
>> > > > > --------------------------------------------------------------------------------
>> > > > >
>> > > > > -----------------------------PROBLEMATIC ENDPOINTS------------------------------
>> > > > >
>> > > > > SCORE  ENDPT        TYPE     UP  SRC  DST  DN
>> > > > > --------------------------------------------------------------------------------
>> > > > >    19  BRUSSELS     dCache   F   -    F    F   0/4
>> > > > >    19  CERN-EOS     EOS      F   -    F    F   0/4
>> > > > >    19  CERN-TRUNK   DPM      -   F    -    F   0/2
>> > > > >    19  DESY-PROM    dCache   F   -    F    F   0/4
>> > > > >    19  FNAL         dCache   F   -    F    F   0/4
>> > > > >    19  IN2P3-DOMA   xrootd   F   -    F    F   0/4
>> > > > >    19  PRAGUE       DPM      F   -    F    F   0/4
>> > > > >    19  RAL-CEPH     CEPH     F   -    F    F   0/4
>> > > > >    19  RAL-LCG2     Echo     F   -    F    F   0/4
>> > > > >    19  SLAC         XrootD   F   -    F    F   0/4
>> > > > >    19  TRIUMF       dCache   F   -    F    F   0/4
>> > > > >    19  UKI-BRUNEL   DPM      F   -    F    F   0/4
>> > > > >    19  UKI-LANC     DPM      F   -    F    F   0/4
>> > > > >    19  UKI-MAN1     DPM      F   -    F    F   0/4
>> > > > >    19  UKI-MAN2     DPM      F   -    F    F   0/4
>> > > > >    19  UNI-BONN     CephFS   F   -    F    F   0/4
>> > > > >    18  OU           XrootD   F   -    F    F   0/4
>> > > > >    13  BNL          dCache   F   -    F    F   0/4
>> > > > >     7  AGLT2        dCache   F   -    F    F   0/4
>> > > > >     0  CALTECH      HDFS     F   -    F    F   0/4
>> > > > >     0  TRIUMF-PROD  dCache   F   -    F    F   0/4
>
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the XROOTD-DEV list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=XROOTD-DEV&A=1