Merci beaucoup Fabrice. I ran a query and it came up 4 chunks short.
They are a different four chunks then the last time, so it's pointing
away from a problem with the data. I stuck jobId in most of the comments
on the czar so it's easier get most of the relevant stuff out of the
czar log.
The worker looks pretty much the same for the example chunk that did
work as it does for the chunk that didn't work. xrootd on the czar never
calls ProcessResponse for the failed queries, but the logs on the
workers say the data was sent (or at least provided to xrootd.
-John
318d317
< 1282
1963d1961
< 2763
2857d2854
< 3568
5092d5088
< 558
--- Should look like this -------------------------chunk 6283 on ccqserv140
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG qdisp.Executive
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3567
payload.len=123 ru=/chk/LSST/6283))
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3567
desc=job(id=3567 payload.len=123 ru=/chk/LSST/6283)
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3567
payload.len=123 ru=/chk/LSST/6283) : 2015-11-25T19:01:25+0000, Unknown, 0, }
2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3567
2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:57) - jobId=3567 New QueryRequest
with payload(123)
2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:74) - jobId=3567 Requesting, payload
size: [123]
2015-11-25T19:01:25.731Z [0x7f6a2cff9700] DEBUG root
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3567
2015-11-25T19:01:25.771Z [0x7f6a6affd700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:81) - jobId=3567 RelRequestBuffer
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:88) - jobId=3567 ProcessResponse
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:127) - jobId=3567 _importStream
buffer.size=256
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:130) - jobId=3567
_importStream->GetResponseData size=256 0x7f6a8c93d030 [[0]=0, [1]=0,
[2]=0, [3]=0, [4]=0, ..., [251]=0, [252]=0, [253]=0, [254]=0, [255]=0]
2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:132) - jobId=3567 Initiated request ok
2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO root
(core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with
buflen=256 (more) jobId=3567
2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO root
(core/modules/qdisp/QueryRequest.cc:180) - jobId=3567
_importStream->GetResponseData size=47 0x7f6a8c93d030 [[0]=40, [1]=13,
[2]=2, [3]=0, [4]=0, ..., [42]=48, [43]=48, [44]=48, [45]=48, [46]=48]
2015-11-25T19:01:25.777Z [0x7f6a2effd700] INFO root
(core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with
buflen=47 (more) jobId=3567
2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO root
(core/modules/qdisp/QueryRequest.cc:267) - Finished() ok. jobId=3567
2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO root
(core/modules/qdisp/Executive.cc:318) - Executive (0x25ad040) UNTRACKING
id=3567
---------------------------------------------------- chunk 3605 on
ccqserv135
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG qdisp.Executive
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=1282
payload.len=123 ru=/chk/LSST/3605))
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=1282
desc=job(id=1282 payload.len=123 ru=/chk/LSST/3605)
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=1282
payload.len=123 ru=/chk/LSST/3605) : 2015-11-25T19:01:22+0000, Unknown, 0, }
2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=1282
2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:57) - jobId=1282 New QueryRequest
with payload(123)
2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:74) - jobId=1282 Requesting, payload
size: [123]
2015-11-25T19:01:22.295Z [0x7f69c1ffb700] DEBUG root
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=1282
2015-11-25T19:01:22.336Z [0x7f69e97fa700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:81) - jobId=1282 RelRequestBuffer
---------------------------------------------------- chunk 5423 on
ccqserv128
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG qdisp.Executive
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=2763
payload.len=123 ru=/chk/LSST/5423))
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=2763
desc=job(id=2763 payload.len=123 ru=/chk/LSST/5423)
2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=2763
payload.len=123 ru=/chk/LSST/5423) : 2015-11-25T19:01:24+0000, Unknown, 0, }
2015-11-25T19:01:24.506Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=2763
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:57) - jobId=2763 New QueryRequest
with payload(123)
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:74) - jobId=2763 Requesting, payload
size: [123]
2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=2763
2015-11-25T19:01:24.550Z [0x7f6a467fc700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:81) - jobId=2763 RelRequestBuffer
---------------------------------------------------- chunk 6284 on
ccqserv141
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG qdisp.Executive
(core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3568
payload.len=123 ru=/chk/LSST/6284))
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3568
desc=job(id=3568 payload.len=123 ru=/chk/LSST/6284)
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3568
payload.len=123 ru=/chk/LSST/6284) : 2015-11-25T19:01:25+0000, Unknown, 0, }
2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
(core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3568
2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:57) - jobId=3568 New QueryRequest
with payload(123)
2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:74) - jobId=3568 Requesting, payload
size: [123]
2015-11-25T19:01:25.736Z [0x7f69e97fa700] DEBUG root
(core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3568
2015-11-25T19:01:25.739Z [0x7f699d7fa700] DEBUG root
(core/modules/qdisp/QueryRequest.cc:81) - jobId=3568 RelRequestBuffer
--------- ccqserv135 worker log chunk 3605 (czar did not record result
for this)
[2015-11-25T19:01:22.285Z] [0x7f663dec3700] INFO root
(core/modules/xrdsvc/SsiService.cc:105) - Got provision call where rName
is: /chk/LSST/3605
[0x7f663dec3700] INFO root (core/modules/xrdsvc/ChannelStream.cc:122) -
returning buffer (256, (more))
...
[0x7f66302cf700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75) -
BlendScheduler::queCmd tSeq=53
[0x7f66302cf700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
[0x7f66302cf700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=53
[0x7f66302cf700] INFO root (core/modules/xrdsvc/SsiSession.cc:137) -
BindRequest took 0.000159 seconds
[0x7f66302cf700] INFO root (core/modules/xrdsvc/SsiSession.cc:138) -
Enqueued TaskMsg for Resource(/chk/LSST/3605) in 0.000159 seconds
...
[0x7f6623fff700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd:
groupReady=1 scanReady=0 lastCmdFromScan=0
[0x7f6623fff700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:112) -
BlendScheduler::commandStart tSeq=53
[0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) -
QueryRunner::runQuery() tSeq=53
[0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) -
Exec in flight for Db = q_f9c2294ddefbc9ee028dfea990ab8a33
[0x7f6623fff700] WARN Foreman (core/modules/wdb/QueryRunner.cc:115) -
QueryRunner overriding dbName with LSST
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) -
_transmit last=1 tSeq=53
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) -
_transmitHeader
[0x7f6623fff700] INFO root (core/modules/proto/ProtoHeaderWrap.cc:52) -
msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ..., [251]=48,
[252]=48, [253]=48, [254]=48, [255]=48]
[0x7f6623fff700] INFO root
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
checking stream 0 len=256 last=0
[0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91) -
last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47,
[8]=0, [9]=0, ..., [246]=48,[247]=48, [248]=48, [249]=48, [250]=48,
[251]=48, [252]=48, [253]=48, [254]=48, [255]=48]
[0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94) -
trying to append message (flowing)
[0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) -
_transmit last=1 tSeq=53 resultString=[[0]=8, [1]=0, [2]=16, [3]=3,
[4]=26, ..., [42]=56, [43]=56, [44]=53, [45]=16, [46]=0]
[0x7f6623fff700] INFO root
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
checking stream 0x7f65e8007c20 len=47 last=1
[0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91) -
last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27,
[8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=49, [40]=55, [41]=55, [42]=56,
[43]=56, [44]=53, [45]=16, [46]=0]
[0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94) -
trying to append message (flowing)
[0x7f6623fff700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:132) -
BlendScheduler::commandFinish tSeq=53
-------- ccqserv140 chunk 6283 - did work for this
[2015-11-25T19:01:25.730Z] [0x7f52e84cc700] INFO root
(core/modules/xrdsvc/SsiService.cc:105) - Got provision call where rName
is: /chk/LSST/6283
[0x7f52e84cc700] INFO root (core/modules/xrdsvc/SsiService.cc:105) -
Got provision call where rName is: /chk/LSST/6312
...
[0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:61) -
ProcessRequest, service=/chk/LSST/6283
[0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:68) -
GetRequest took 1e-06 seconds
[0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:99) -
Decoding TaskMsg of size 123
[0x7f52e85cd700] DEBUG root (core/modules/wbase/Task.cc:111) - Task(...)
tSeq=148 :count=149 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68,
69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103,
104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117,
118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131,
132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145,
146, 147, 148
[0x7f52e85cd700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75) -
BlendScheduler::queCmd tSeq=148
[0x7f52e85cd700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
[0x7f52e85cd700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=148
[0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:137) -
BindRequest took 0.000157 seconds
[0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:138) -
Enqueued TaskMsg for Resource(/chk/LSST/6283) in 0.000157 seconds
...
[0x7f52e1ffb700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd:
groupReady=1 scanReady=0 lastCmdFromScan=0
[0x7f52e1ffb700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:112) -
BlendScheduler::commandStart tSeq=148
[0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) -
QueryRunner::runQuery() tSeq=148
[0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) -
Exec in flight for Db = q_7fb0348a55a96afb780281e251f381d4
[0x7f52e1ffb700] WARN Foreman (core/modules/wdb/QueryRunner.cc:115) -
QueryRunner overriding dbName with LSST
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) -
_transmit last=1 tSeq=148
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) -
_transmitHeader
[0x7f52e1ffb700] INFO root (core/modules/proto/ProtoHeaderWrap.cc:52) -
msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ..., [251]=48,
[252]=48, [253]=48, [254]=48, [255]=48]
[0x7f52e1ffb700] INFO root
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
checking stream 0 len=256 last=0
[0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91) -
last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47,
[8]=0, [9]=0, ..., [246]=48,
[247]=48, [248]=48, [249]=48, [250]=48, [251]=48, [252]=48, [253]=48,
[254]=48, [255]=48]
[0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94) -
trying to append message (flowing)
[0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) -
_transmit last=1 tSeq=148 resultString=[[0]=8, [1]=0, [2]=16, [3]=3,
[4]=26, ..., [42]=52, [43]=54, [44]=52, [45]=16, [46]=0]
[0x7f52e1ffb700] INFO root
(core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
checking stream 0x7f52b80073a0 len=47 last=1
[0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91) -
last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27,
[8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=50, [40]=57, [41]=55, [42]=52,
[43]=54, [44]=52, [45]=16, [46]=0]
[0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94) -
trying to append message (flowing)
[0x7f52e1ffb700] DEBUG BlendSched
(core/modules/wsched/BlendScheduler.cc:132) -
BlendScheduler::commandFinish tSeq=148
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
|