LISTSERV mailing list manager LISTSERV 16.5

Help for QSERV-L Archives


QSERV-L Archives

QSERV-L Archives


QSERV-L@LISTSERV.SLAC.STANFORD.EDU


View:

Message:

[

First

|

Previous

|

Next

|

Last

]

By Topic:

[

First

|

Previous

|

Next

|

Last

]

By Author:

[

First

|

Previous

|

Next

|

Last

]

Font:

Proportional Font

LISTSERV Archives

LISTSERV Archives

QSERV-L Home

QSERV-L Home

QSERV-L  November 2015

QSERV-L November 2015

Subject:

Re: docker on the cluster

From:

John Gates <[log in to unmask]>

Reply-To:

General discussion for qserv (LSST prototype baseline catalog)

Date:

Wed, 25 Nov 2015 11:58:31 -0800

Content-Type:

text/plain

Parts/Attachments:

Parts/Attachments

text/plain (308 lines)

Merci beaucoup Fabrice.  I ran a query and it came up 4 chunks short. 
They are a different four chunks then the last time, so it's pointing 
away from a problem with the data. I stuck jobId in most of the comments 
on the czar so it's easier get most of the relevant stuff out of the 
czar log.

The worker looks pretty much the same for the example chunk that did 
work as it does for the chunk that didn't work. xrootd on the czar never 
calls ProcessResponse for the failed queries, but the logs on the 
workers say the data was sent (or at least provided to xrootd.


-John





318d317
< 1282
1963d1961
< 2763
2857d2854
< 3568
5092d5088
< 558

--- Should look like this -------------------------chunk 6283 on ccqserv140
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG qdisp.Executive 
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3567 
payload.len=123 ru=/chk/LSST/6283))
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3567 
desc=job(id=3567 payload.len=123 ru=/chk/LSST/6283)
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3567 
payload.len=123 ru=/chk/LSST/6283) : 2015-11-25T19:01:25+0000, Unknown, 0, }
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3567
2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:57) - jobId=3567 New QueryRequest 
with payload(123)
2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:74) - jobId=3567 Requesting, payload 
size: [123]
2015-11-25T19:01:25.731Z [0x7f6a2cff9700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3567
2015-11-25T19:01:25.771Z [0x7f6a6affd700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:81) - jobId=3567 RelRequestBuffer
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:88) - jobId=3567 ProcessResponse
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:127) - jobId=3567 _importStream 
buffer.size=256
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:130) - jobId=3567 
_importStream->GetResponseData size=256 0x7f6a8c93d030 [[0]=0, [1]=0, 
[2]=0, [3]=0, [4]=0, ..., [251]=0, [252]=0, [253]=0, [254]=0, [255]=0]
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:132) - jobId=3567 Initiated request ok
2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO  root 
(core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with 
buflen=256 (more) jobId=3567
2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO  root 
(core/modules/qdisp/QueryRequest.cc:180) - jobId=3567 
_importStream->GetResponseData size=47 0x7f6a8c93d030 [[0]=40, [1]=13, 
[2]=2, [3]=0, [4]=0, ..., [42]=48, [43]=48, [44]=48, [45]=48, [46]=48]
2015-11-25T19:01:25.777Z [0x7f6a2effd700] INFO  root 
(core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with 
buflen=47 (more) jobId=3567
2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO  root 
(core/modules/qdisp/QueryRequest.cc:267) - Finished() ok. jobId=3567
2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO  root 
(core/modules/qdisp/Executive.cc:318) - Executive (0x25ad040) UNTRACKING 
id=3567






---------------------------------------------------- chunk 3605 on 
ccqserv135
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG qdisp.Executive 
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=1282 
payload.len=123 ru=/chk/LSST/3605))
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=1282 
desc=job(id=1282 payload.len=123 ru=/chk/LSST/3605)
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=1282 
payload.len=123 ru=/chk/LSST/3605) : 2015-11-25T19:01:22+0000, Unknown, 0, }
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=1282
2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:57) - jobId=1282 New QueryRequest 
with payload(123)
2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:74) - jobId=1282 Requesting, payload 
size: [123]
2015-11-25T19:01:22.295Z [0x7f69c1ffb700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=1282
2015-11-25T19:01:22.336Z [0x7f69e97fa700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:81) - jobId=1282 RelRequestBuffer

---------------------------------------------------- chunk 5423 on 
ccqserv128
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG qdisp.Executive 
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=2763 
payload.len=123 ru=/chk/LSST/5423))
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=2763 
desc=job(id=2763 payload.len=123 ru=/chk/LSST/5423)
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=2763 
payload.len=123 ru=/chk/LSST/5423) : 2015-11-25T19:01:24+0000, Unknown, 0, }
2015-11-25T19:01:24.506Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=2763
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:57) - jobId=2763 New QueryRequest 
with payload(123)
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:74) - jobId=2763 Requesting, payload 
size: [123]
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=2763
2015-11-25T19:01:24.550Z [0x7f6a467fc700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:81) - jobId=2763 RelRequestBuffer


---------------------------------------------------- chunk 6284 on 
ccqserv141
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG qdisp.Executive 
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3568 
payload.len=123 ru=/chk/LSST/6284))
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3568 
desc=job(id=3568 payload.len=123 ru=/chk/LSST/6284)
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3568 
payload.len=123 ru=/chk/LSST/6284) : 2015-11-25T19:01:25+0000, Unknown, 0, }
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3568
2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:57) - jobId=3568 New QueryRequest 
with payload(123)
2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:74) - jobId=3568 Requesting, payload 
size: [123]
2015-11-25T19:01:25.736Z [0x7f69e97fa700] DEBUG root 
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3568
2015-11-25T19:01:25.739Z [0x7f699d7fa700] DEBUG root 
(core/modules/qdisp/QueryRequest.cc:81) - jobId=3568 RelRequestBuffer




--------- ccqserv135  worker log chunk 3605 (czar did not record result 
for this)
[2015-11-25T19:01:22.285Z] [0x7f663dec3700] INFO  root 
(core/modules/xrdsvc/SsiService.cc:105) - Got provision call where rName 
is: /chk/LSST/3605
[0x7f663dec3700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:122) - 
returning buffer (256, (more))
...
[0x7f66302cf700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75) - 
BlendScheduler::queCmd tSeq=53
[0x7f66302cf700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
[0x7f66302cf700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=53
[0x7f66302cf700] INFO  root (core/modules/xrdsvc/SsiSession.cc:137) - 
BindRequest took 0.000159 seconds
[0x7f66302cf700] INFO  root (core/modules/xrdsvc/SsiSession.cc:138) - 
Enqueued TaskMsg for Resource(/chk/LSST/3605) in 0.000159 seconds
...
[0x7f6623fff700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd: 
groupReady=1 scanReady=0 lastCmdFromScan=0
[0x7f6623fff700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:112) - 
BlendScheduler::commandStart tSeq=53
[0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) - 
QueryRunner::runQuery() tSeq=53
[0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) - 
Exec in flight for Db = q_f9c2294ddefbc9ee028dfea990ab8a33
[0x7f6623fff700] WARN  Foreman (core/modules/wdb/QueryRunner.cc:115) - 
QueryRunner overriding dbName with LSST
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) - 
_transmit last=1 tSeq=53
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) - 
_transmitHeader
[0x7f6623fff700] INFO  root (core/modules/proto/ProtoHeaderWrap.cc:52) - 
msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ..., [251]=48, 
[252]=48, [253]=48, [254]=48, [255]=48]
[0x7f6623fff700] INFO  root 
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream, 
checking stream 0 len=256 last=0
[0x7f6623fff700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:91) - 
last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47, 
[8]=0, [9]=0, ..., [246]=48,[247]=48, [248]=48, [249]=48, [250]=48, 
[251]=48, [252]=48, [253]=48, [254]=48, [255]=48]
[0x7f6623fff700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:94) -  
trying to append message (flowing)
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) - 
_transmit last=1 tSeq=53 resultString=[[0]=8, [1]=0, [2]=16, [3]=3, 
[4]=26, ..., [42]=56, [43]=56, [44]=53, [45]=16, [46]=0]
[0x7f6623fff700] INFO  root 
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream, 
checking stream 0x7f65e8007c20 len=47 last=1
[0x7f6623fff700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:91) - 
last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27, 
[8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=49, [40]=55, [41]=55, [42]=56, 
[43]=56, [44]=53, [45]=16, [46]=0]
[0x7f6623fff700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:94) -  
trying to append message (flowing)
[0x7f6623fff700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:132) - 
BlendScheduler::commandFinish tSeq=53



-------- ccqserv140 chunk 6283 - did work for this
[2015-11-25T19:01:25.730Z] [0x7f52e84cc700] INFO  root 
(core/modules/xrdsvc/SsiService.cc:105) - Got provision call where rName 
is: /chk/LSST/6283
[0x7f52e84cc700] INFO  root (core/modules/xrdsvc/SsiService.cc:105) - 
Got provision call where rName is: /chk/LSST/6312
...
[0x7f52e85cd700] INFO  root (core/modules/xrdsvc/SsiSession.cc:61) - 
ProcessRequest, service=/chk/LSST/6283
[0x7f52e85cd700] INFO  root (core/modules/xrdsvc/SsiSession.cc:68) - 
GetRequest took 1e-06 seconds
[0x7f52e85cd700] INFO  root (core/modules/xrdsvc/SsiSession.cc:99) - 
Decoding TaskMsg of size 123
[0x7f52e85cd700] DEBUG root (core/modules/wbase/Task.cc:111) - Task(...) 
tSeq=148  :count=149 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 
69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 
104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 
118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 
132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 
146, 147, 148
[0x7f52e85cd700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75) - 
BlendScheduler::queCmd tSeq=148
[0x7f52e85cd700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
[0x7f52e85cd700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=148
[0x7f52e85cd700] INFO  root (core/modules/xrdsvc/SsiSession.cc:137) - 
BindRequest took 0.000157 seconds
[0x7f52e85cd700] INFO  root (core/modules/xrdsvc/SsiSession.cc:138) - 
Enqueued TaskMsg for Resource(/chk/LSST/6283) in 0.000157 seconds
...
[0x7f52e1ffb700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd: 
groupReady=1 scanReady=0 lastCmdFromScan=0
[0x7f52e1ffb700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:112) - 
BlendScheduler::commandStart tSeq=148
[0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) - 
QueryRunner::runQuery() tSeq=148
[0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) - 
Exec in flight for Db = q_7fb0348a55a96afb780281e251f381d4
[0x7f52e1ffb700] WARN  Foreman (core/modules/wdb/QueryRunner.cc:115) - 
QueryRunner overriding dbName with LSST
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) - 
_transmit last=1 tSeq=148
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) - 
_transmitHeader
[0x7f52e1ffb700] INFO  root (core/modules/proto/ProtoHeaderWrap.cc:52) - 
msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ..., [251]=48, 
[252]=48, [253]=48, [254]=48, [255]=48]
[0x7f52e1ffb700] INFO  root 
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream, 
checking stream 0 len=256 last=0
[0x7f52e1ffb700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:91) - 
last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47, 
[8]=0, [9]=0, ..., [246]=48,
[247]=48, [248]=48, [249]=48, [250]=48, [251]=48, [252]=48, [253]=48, 
[254]=48, [255]=48]
[0x7f52e1ffb700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:94) -  
trying to append message (flowing)
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) - 
_transmit last=1 tSeq=148 resultString=[[0]=8, [1]=0, [2]=16, [3]=3, 
[4]=26, ..., [42]=52, [43]=54, [44]=52, [45]=16, [46]=0]
[0x7f52e1ffb700] INFO  root 
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream, 
checking stream 0x7f52b80073a0 len=47 last=1
[0x7f52e1ffb700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:91) - 
last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27, 
[8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=50, [40]=57, [41]=55, [42]=52, 
[43]=54, [44]=52, [45]=16, [46]=0]
[0x7f52e1ffb700] INFO  root (core/modules/xrdsvc/ChannelStream.cc:94) -  
trying to append message (flowing)
[0x7f52e1ffb700] DEBUG BlendSched 
(core/modules/wsched/BlendScheduler.cc:132) - 
BlendScheduler::commandFinish tSeq=148

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1

Top of Message | Previous Page | Permalink

Advanced Options


Options

Log In

Log In

Get Password

Get Password


Search Archives

Search Archives


Subscribe or Unsubscribe

Subscribe or Unsubscribe


Archives

March 2018
February 2018
January 2018
December 2017
August 2017
December 2016
November 2016
October 2016
September 2016
August 2016
July 2016
June 2016
May 2016
April 2016
March 2016
February 2016
January 2016
December 2015
November 2015
October 2015
September 2015
August 2015
July 2015
June 2015
May 2015
April 2015
March 2015
February 2015
January 2015
December 2014
November 2014
October 2014
September 2014
August 2014
July 2014
June 2014
May 2014
April 2014
March 2014
February 2014
January 2014
December 2013
November 2013
October 2013
September 2013
August 2013
July 2013
June 2013
May 2013
April 2013
March 2013
February 2013
January 2013
December 2012

ATOM RSS1 RSS2



LISTSERV.SLAC.STANFORD.EDU

Secured by F-Secure Anti-Virus CataList Email List Search Powered by the LISTSERV Email List Manager

Privacy Notice, Security Notice and Terms of Use