FYI, I tried same query on a multinode setup with same docker image, and
it works fine, of course the data size/chunk number is much more small:
fjammes@clrqserv00:~/src/qserv/admin/tools/docker/shmux
(u/fjammes/DM-3986 *=)$ mysql --host clrqserv00 --port 4040 --user qsmaster
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 979
Server version: 5.1.65 Source distribution
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input
statement.
mysql> use qservTest_case01_qserv;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select count(*) from Object;
+----------------+
| SUM(QS1_COUNT) |
+----------------+
| 87 |
+----------------+
1 row in set (0.49 sec)
mysql> use qservTest_case05_qserv;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select count(*) from Object;
+----------------+
| SUM(QS1_COUNT) |
+----------------+
| 39064 |
+----------------+
1 row in set (0.51 sec)
mysql>
On 11/25/2015 08:58 PM, John Gates wrote:
> Merci beaucoup Fabrice. I ran a query and it came up 4 chunks short.
> They are a different four chunks then the last time, so it's pointing
> away from a problem with the data. I stuck jobId in most of the
> comments on the czar so it's easier get most of the relevant stuff out
> of the czar log.
>
> The worker looks pretty much the same for the example chunk that did
> work as it does for the chunk that didn't work. xrootd on the czar
> never calls ProcessResponse for the failed queries, but the logs on
> the workers say the data was sent (or at least provided to xrootd.
>
>
> -John
>
>
>
>
>
> 318d317
> < 1282
> 1963d1961
> < 2763
> 2857d2854
> < 3568
> 5092d5088
> < 558
>
> --- Should look like this -------------------------chunk 6283 on
> ccqserv140
> 2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG qdisp.Executive
> (core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3567
> payload.len=123 ru=/chk/LSST/6283))
> 2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3567
> desc=job(id=3567 payload.len=123 ru=/chk/LSST/6283)
> 2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3567
> payload.len=123 ru=/chk/LSST/6283) : 2015-11-25T19:01:25+0000,
> Unknown, 0, }
> 2015-11-25T19:01:25.728Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3567
> 2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:57) - jobId=3567 New QueryRequest
> with payload(123)
> 2015-11-25T19:01:25.730Z [0x7f6a2cff9700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:74) - jobId=3567 Requesting,
> payload size: [123]
> 2015-11-25T19:01:25.731Z [0x7f6a2cff9700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3567
> 2015-11-25T19:01:25.771Z [0x7f6a6affd700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:81) - jobId=3567 RelRequestBuffer
> 2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:88) - jobId=3567 ProcessResponse
> 2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:127) - jobId=3567 _importStream
> buffer.size=256
> 2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:130) - jobId=3567
> _importStream->GetResponseData size=256 0x7f6a8c93d030 [[0]=0, [1]=0,
> [2]=0, [3]=0, [4]=0, ..., [251]=0, [252]=0, [253]=0, [254]=0, [255]=0]
> 2015-11-25T19:01:25.775Z [0x7f6a1b7fe700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:132) - jobId=3567 Initiated
> request ok
> 2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO root
> (core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with
> buflen=256 (more) jobId=3567
> 2015-11-25T19:01:25.776Z [0x7f69e9ffb700] INFO root
> (core/modules/qdisp/QueryRequest.cc:180) - jobId=3567
> _importStream->GetResponseData size=47 0x7f6a8c93d030 [[0]=40, [1]=13,
> [2]=2, [3]=0, [4]=0, ..., [42]=48, [43]=48, [44]=48, [45]=48, [46]=48]
> 2015-11-25T19:01:25.777Z [0x7f6a2effd700] INFO root
> (core/modules/qdisp/QueryRequest.cc:155) - ProcessResponseData with
> buflen=47 (more) jobId=3567
> 2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO root
> (core/modules/qdisp/QueryRequest.cc:267) - Finished() ok. jobId=3567
> 2015-11-25T19:01:25.779Z [0x7f6a2effd700] INFO root
> (core/modules/qdisp/Executive.cc:318) - Executive (0x25ad040)
> UNTRACKING id=3567
>
>
>
>
>
>
> ---------------------------------------------------- chunk 3605 on
> ccqserv135
> 2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG qdisp.Executive
> (core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=1282
> payload.len=123 ru=/chk/LSST/3605))
> 2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=1282
> desc=job(id=1282 payload.len=123 ru=/chk/LSST/3605)
> 2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=1282
> payload.len=123 ru=/chk/LSST/3605) : 2015-11-25T19:01:22+0000,
> Unknown, 0, }
> 2015-11-25T19:01:22.258Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=1282
> 2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:57) - jobId=1282 New QueryRequest
> with payload(123)
> 2015-11-25T19:01:22.286Z [0x7f69c1ffb700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:74) - jobId=1282 Requesting,
> payload size: [123]
> 2015-11-25T19:01:22.295Z [0x7f69c1ffb700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=1282
> 2015-11-25T19:01:22.336Z [0x7f69e97fa700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:81) - jobId=1282 RelRequestBuffer
>
> ---------------------------------------------------- chunk 5423 on
> ccqserv128
> 2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG qdisp.Executive
> (core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=2763
> payload.len=123 ru=/chk/LSST/5423))
> 2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=2763
> desc=job(id=2763 payload.len=123 ru=/chk/LSST/5423)
> 2015-11-25T19:01:24.505Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=2763
> payload.len=123 ru=/chk/LSST/5423) : 2015-11-25T19:01:24+0000,
> Unknown, 0, }
> 2015-11-25T19:01:24.506Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=2763
> 2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:57) - jobId=2763 New QueryRequest
> with payload(123)
> 2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:74) - jobId=2763 Requesting,
> payload size: [123]
> 2015-11-25T19:01:24.507Z [0x7f6a6bfff700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=2763
> 2015-11-25T19:01:24.550Z [0x7f6a467fc700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:81) - jobId=2763 RelRequestBuffer
>
>
> ---------------------------------------------------- chunk 6284 on
> ccqserv141
> 2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG qdisp.Executive
> (core/modules/qdisp/Executive.cc:106) - Executive::add(job(id=3568
> payload.len=123 ru=/chk/LSST/6284))
> 2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.h:104) - JobQuery JQ_jobId=3568
> desc=job(id=3568 payload.len=123 ru=/chk/LSST/6284)
> 2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/JobQuery.cc:55) - runJob {job(id=3568
> payload.len=123 ru=/chk/LSST/6284) : 2015-11-25T19:01:25+0000,
> Unknown, 0, }
> 2015-11-25T19:01:25.729Z [0x7f6a90cb3700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:51) - QueryResource JQ_jobId=3568
> 2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:57) - jobId=3568 New QueryRequest
> with payload(123)
> 2015-11-25T19:01:25.735Z [0x7f69e97fa700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:74) - jobId=3568 Requesting,
> payload size: [123]
> 2015-11-25T19:01:25.736Z [0x7f69e97fa700] DEBUG root
> (core/modules/qdisp/QueryResource.cc:55) - ~QueryResource() JQ_jobId=3568
> 2015-11-25T19:01:25.739Z [0x7f699d7fa700] DEBUG root
> (core/modules/qdisp/QueryRequest.cc:81) - jobId=3568 RelRequestBuffer
>
>
>
>
> --------- ccqserv135 worker log chunk 3605 (czar did not record
> result for this)
> [2015-11-25T19:01:22.285Z] [0x7f663dec3700] INFO root
> (core/modules/xrdsvc/SsiService.cc:105) - Got provision call where
> rName is: /chk/LSST/3605
> [0x7f663dec3700] INFO root (core/modules/xrdsvc/ChannelStream.cc:122)
> - returning buffer (256, (more))
> ...
> [0x7f66302cf700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75)
> - BlendScheduler::queCmd tSeq=53
> [0x7f66302cf700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
> [0x7f66302cf700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=53
> [0x7f66302cf700] INFO root (core/modules/xrdsvc/SsiSession.cc:137) -
> BindRequest took 0.000159 seconds
> [0x7f66302cf700] INFO root (core/modules/xrdsvc/SsiSession.cc:138) -
> Enqueued TaskMsg for Resource(/chk/LSST/3605) in 0.000159 seconds
> ...
> [0x7f6623fff700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd:
> groupReady=1 scanReady=0 lastCmdFromScan=0
> [0x7f6623fff700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:112) -
> BlendScheduler::commandStart tSeq=53
> [0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) -
> QueryRunner::runQuery() tSeq=53
> [0x7f6623fff700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) -
> Exec in flight for Db = q_f9c2294ddefbc9ee028dfea990ab8a33
> [0x7f6623fff700] WARN Foreman (core/modules/wdb/QueryRunner.cc:115) -
> QueryRunner overriding dbName with LSST
> [0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) -
> _transmit last=1 tSeq=53
> [0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) -
> _transmitHeader
> [0x7f6623fff700] INFO root (core/modules/proto/ProtoHeaderWrap.cc:52)
> - msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ...,
> [251]=48, [252]=48, [253]=48, [254]=48, [255]=48]
> [0x7f6623fff700] INFO root
> (core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
> checking stream 0 len=256 last=0
> [0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91)
> - last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47,
> [8]=0, [9]=0, ..., [246]=48,[247]=48, [248]=48, [249]=48, [250]=48,
> [251]=48, [252]=48, [253]=48, [254]=48, [255]=48]
> [0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94)
> - trying to append message (flowing)
> [0x7f6623fff700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) -
> _transmit last=1 tSeq=53 resultString=[[0]=8, [1]=0, [2]=16, [3]=3,
> [4]=26, ..., [42]=56, [43]=56, [44]=53, [45]=16, [46]=0]
> [0x7f6623fff700] INFO root
> (core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
> checking stream 0x7f65e8007c20 len=47 last=1
> [0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91)
> - last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27,
> [8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=49, [40]=55, [41]=55,
> [42]=56, [43]=56, [44]=53, [45]=16, [46]=0]
> [0x7f6623fff700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94)
> - trying to append message (flowing)
> [0x7f6623fff700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:132) -
> BlendScheduler::commandFinish tSeq=53
>
>
>
> -------- ccqserv140 chunk 6283 - did work for this
> [2015-11-25T19:01:25.730Z] [0x7f52e84cc700] INFO root
> (core/modules/xrdsvc/SsiService.cc:105) - Got provision call where
> rName is: /chk/LSST/6283
> [0x7f52e84cc700] INFO root (core/modules/xrdsvc/SsiService.cc:105) -
> Got provision call where rName is: /chk/LSST/6312
> ...
> [0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:61) -
> ProcessRequest, service=/chk/LSST/6283
> [0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:68) -
> GetRequest took 1e-06 seconds
> [0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:99) -
> Decoding TaskMsg of size 123
> [0x7f52e85cd700] DEBUG root (core/modules/wbase/Task.cc:111) -
> Task(...) tSeq=148 :count=149 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
> 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
> 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45,
> 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
> 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79,
> 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
> 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110,
> 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124,
> 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138,
> 139, 140, 141, 142, 143, 144, 145, 146, 147, 148
> [0x7f52e85cd700] DEBUG root (core/modules/wsched/BlendScheduler.cc:75)
> - BlendScheduler::queCmd tSeq=148
> [0x7f52e85cd700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:93) - Blend chose group
> [0x7f52e85cd700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:100) - Blend queCmd tSeq=148
> [0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:137) -
> BindRequest took 0.000157 seconds
> [0x7f52e85cd700] INFO root (core/modules/xrdsvc/SsiSession.cc:138) -
> Enqueued TaskMsg for Resource(/chk/LSST/6283) in 0.000157 seconds
> ...
> [0x7f52e1ffb700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:191) - BlendScheduler::getCmd:
> groupReady=1 scanReady=0 lastCmdFromScan=0
> [0x7f52e1ffb700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:112) -
> BlendScheduler::commandStart tSeq=148
> [0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:120) -
> QueryRunner::runQuery() tSeq=148
> [0x7f52e1ffb700] DEBUG Foreman (core/modules/wdb/QueryRunner.cc:137) -
> Exec in flight for Db = q_7fb0348a55a96afb780281e251f381d4
> [0x7f52e1ffb700] WARN Foreman (core/modules/wdb/QueryRunner.cc:115) -
> QueryRunner overriding dbName with LSST
> [0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:242) -
> _transmit last=1 tSeq=148
> [0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:263) -
> _transmitHeader
> [0x7f52e1ffb700] INFO root (core/modules/proto/ProtoHeaderWrap.cc:52)
> - msgBuf size=256 -> [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, ...,
> [251]=48, [252]=48, [253]=48, [254]=48, [255]=48]
> [0x7f52e1ffb700] INFO root
> (core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
> checking stream 0 len=256 last=0
> [0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91)
> - last=0 [[0]=40, [1]=13, [2]=2, [3]=0, [4]=0, [5]=0, [6]=21, [7]=47,
> [8]=0, [9]=0, ..., [246]=48,
> [247]=48, [248]=48, [249]=48, [250]=48, [251]=48, [252]=48, [253]=48,
> [254]=48, [255]=48]
> [0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94)
> - trying to append message (flowing)
> [0x7f52e1ffb700] DEBUG root (core/modules/wdb/QueryRunner.cc:253) -
> _transmit last=1 tSeq=148 resultString=[[0]=8, [1]=0, [2]=16, [3]=3,
> [4]=26, ..., [42]=52, [43]=54, [44]=52, [45]=16, [46]=0]
> [0x7f52e1ffb700] INFO root
> (core/modules/xrdsvc/SsiSession_ReplyChannel.cc:85) - sendStream,
> checking stream 0x7f52b80073a0 len=47 last=1
> [0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:91)
> - last=1 [[0]=8, [1]=0, [2]=16, [3]=3, [4]=26, [5]=29, [6]=10, [7]=27,
> [8]=10, [9]=9, ..., [37]=10, [38]=6, [39]=50, [40]=57, [41]=55,
> [42]=52, [43]=54, [44]=52, [45]=16, [46]=0]
> [0x7f52e1ffb700] INFO root (core/modules/xrdsvc/ChannelStream.cc:94)
> - trying to append message (flowing)
> [0x7f52e1ffb700] DEBUG BlendSched
> (core/modules/wsched/BlendScheduler.cc:132) -
> BlendScheduler::commandFinish tSeq=148
>
> ########################################################################
> Use REPLY-ALL to reply to list
>
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
|