Hi,
The interface between Qserv and the xrdssi API is very interesting, and
both products are very nice.
So I would really like to be involved in fixing this bug.
I think we should look inside the function below on the xrootd client
side, shouldn't we?
https://github.com/lsst/qserv/search?utf8=%E2%9C%93&q=ProcessResponseData
Here, in GetResponseData(), the client constrains the size of the
xrootd result message to buffer.size(), doesn't it?
But in Qserv, it seems the buffer.size() parameter is initialized to the
size of the whole xrootd sv answer (i.e. the whole protobuf query
result message), which means the size of the full SQL query result,
whereas in Andy's simple example it seems to have a fixed size.
On the xrootd server side, it seems the response stream is built in
the append() method of the following class (cf. line 82):
https://github.com/lsst/qserv/blob/ff47e3fd708e3e8dfcef05f59f8395ac3540137f/core/modules/xrdsvc/ChannelStream.cc#L82
but the response buffer seems to be filled with the full SQL query
result in one go. In Andy's example, XrdSsiSession and
XrdSvStream
are aware of the client's constraint on buffer size (i.e. buffer.size)
in order to chunk the answer correctly, aren't they? Shouldn't this also
be the case in Qserv?
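The chunking behaviour asked about above can be sketched as follows. This is a minimal Python illustration, not Qserv or XrdSsi code: `chunk_response` and `max_chunk` are hypothetical names, and the sizes are borrowed from the czar.log excerpt quoted further down in this thread, assuming the 2MB limit means 2 * 1024 * 1024 bytes.

```python
from typing import Iterator, Tuple


def chunk_response(payload: bytes, max_chunk: int) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last) pairs, each chunk at most max_chunk bytes.

    Hypothetical helper: it only mimics a server-side stream that honours
    the buffer size requested by the client. It is not the XrdSsi API.
    """
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    if not payload:
        # An empty answer is still reported as one final (empty) chunk.
        yield b"", True
        return
    for start in range(0, len(payload), max_chunk):
        chunk = payload[start:start + max_chunk]
        yield chunk, start + max_chunk >= len(payload)


# Sizes taken from the czar.log excerpt: a 2378306-byte result and a
# client buffer assumed to be 2 MiB.
payload = bytes(2378306)
chunks = list(chunk_response(payload, 2 * 1024 * 1024))
sizes = [(len(chunk), last) for chunk, last in chunks]
print(sizes)
```

With these sizes the answer no longer fits in one buffer, so the stream has to hand out a second, shorter chunk flagged as the last one.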
I can try to set up high verbosity inside it and then reproduce the
error; would that help us understand?
Thanks
Fabrice
On 11 Feb 2015, 22:08, Andrew Hanushevsky wrote:
OK, let's work on resolving this. I think the real
issue is that getResponseData wants you to piece the buffers
together and we can fix that. Let's get together so I completely
understand what you are doing.
Andy
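A rough sketch of what piecing the buffers together could look like on the client side, in Python for brevity. The class `ResponseAssembler` and its `on_data` method are hypothetical and only mimic the pattern; they are not the XrdSsi or Qserv API, and the size and MD5 checks are assumed from the error messages quoted in this thread.

```python
import hashlib
from typing import List, Optional


class ResponseAssembler:
    """Accumulate response buffers until the one flagged as last arrives.

    Hypothetical class for illustration only; the real merging code in
    Qserv is organised differently.
    """

    def __init__(self, expected_size: int, expected_md5: str) -> None:
        self.expected_size = expected_size
        self.expected_md5 = expected_md5
        self._parts: List[bytes] = []

    def on_data(self, buf: bytes, last: bool) -> Optional[bytes]:
        """Store one buffer; return the whole message once last is True."""
        self._parts.append(buf)
        if not last:
            return None
        message = b"".join(self._parts)
        if len(message) != self.expected_size:
            raise ValueError(
                f"size mismatch: expected {self.expected_size} got {len(message)}"
            )
        if hashlib.md5(message).hexdigest() != self.expected_md5:
            raise ValueError("MD5 mismatch")
        return message


# Usage: a 1024-byte message delivered in two buffers.
message = bytes(range(256)) * 4
asm = ResponseAssembler(len(message), hashlib.md5(message).hexdigest())
first = asm.on_data(message[:600], last=False)   # nothing to return yet
result = asm.on_data(message[600:], last=True)   # full message, verified
```

The point of the sketch is that the size and checksum are verified only after the final buffer has been appended, never against a single buffer.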
On Wed, 11 Feb 2015, Daniel L Wang wrote:
Yes, this is one of the 2MB buffer problems.
(typed on a small touch screen)
On Feb 11, 2015 7:29 PM, Tatiana Goldina wrote:
This is the tail from czar.log - I hope it would help to see
what is wrong.
0211 19:23:21.662 [0x7fcb535fe700] DEBUG root
(build/qproc/TaskMsgFactory2.cc:154) - SELECT * FROM
LSST.Science_Ccd_Exposure AS QST_1_ WHERE
scisql_s2PtInCircle(ra,decl,0.5,1.0,0.5)=1 LIMIT 3000
0211 19:23:21.662 [0x7fcb535fe700] DEBUG root
(build/qdisp/Executive.cc:397) - Executive (0x2bf3d80)
tracking id=1
0211 19:23:21.662 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:173) - Exec add
pth=/chk/LSST/1234567890
0211 19:23:21.662 [0x7fcb535fe700] DEBUG root
(build/qdisp/MessageStore.cc:49) - Msg: 1234567890 1200 Exec
add pth=/chk/LSST/1234567890
Opening xroot://127.0.0.1:1094//chk/LSST/1234567890
0211 19:23:21.662 [0x7fcb535fe700] DEBUG root
(build/qdisp/Executive.cc:338) - Provision was ok
0211 19:23:21.663 [0x7fcb3ebfd700] INFO root
(build/qdisp/QueryResource.cc:56) - Provision done
0211 19:23:21.663 [0x7fcb3ebfd700] INFO root
(build/qdisp/QueryRequest.cc:89) - New QueryRequest with
payload(182)
0211 19:23:21.663 [0x7fcb3ebfd700] DEBUG root
(build/qdisp/QueryRequest.cc:99) - Requesting, payload size:
[182]
Task=0x7fcb28dab640 processing id=0
0211 19:23:21.663 [0x7fcb535fe700] INFO root (app.py:565) -
Query dispatch (7) took 0.003240 seconds
0211 19:23:21.663 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:484) - Still 1 in flight.
Task 0x7fcb28dab640 sess=ok Status = 1 isWrite
Task Handler calling RelBuff.
0211 19:23:21.704 [0x7fcb3f5fe700] DEBUG root
(build/qdisp/QueryRequest.cc:105) - Early release of request
buffer
Task Handler calling trunc.
Task 0x7fcb28dab640 sess=ok Status = 1 isSync
Task Handler responding with stream.
0211 19:23:21.913 [0x7fcb3ebfd700] INFO root
(build/qdisp/QueryRequest.cc:148) - GetResponseData with
buffer of 0
Task 0x7fcb28dab640 SetBuff Async Status=isReady
0211 19:23:21.913 [0x7fcb3ebfd700] INFO root
(build/qdisp/QueryRequest.cc:150) - Initiated request ok
Task 0x7fcb28dab640 sess=ok Status = 1 isReady
Task Handler calling ProcessResponseData.
0211 19:23:21.914 [0x7fcb3f5fe700] INFO root
(build/qdisp/QueryRequest.cc:180) - ProcessResponse[data] with
buflen=1 (more)
Task 0x7fcb28dab640 SetBuff Async Status=isReady
Task 0x7fcb28dab640 sess=ok Status = 1 isReady
Task Handler calling ProcessResponseData.
0211 19:23:21.916 [0x7fcb3ffff700] INFO root
(build/qdisp/QueryRequest.cc:180) - ProcessResponse[data] with
buflen=25 (more)
Task 0x7fcb28dab640 SetBuff Async Status=isReady
Task 0x7fcb28dab640 sess=ok Status = 1 isReady
Task Handler calling ProcessResponseData.
0211 19:23:21.925 [0x7fcb3ebfd700] INFO root
(build/qdisp/QueryRequest.cc:180) - ProcessResponse[data] with
buflen=2097152 (last)
0211 19:23:21.925 [0x7fcb3ebfd700] ERROR root
(build/ccontrol/MergingRequester.cc:70) - MergingRequester
size mismatch: expected 2378306 got 2097152
0211 19:23:26.663 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:432) - Executive (0x2bf3d80) REAPED
id=1
0211 19:23:31.664 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:189) - entry state:0x7fcb400360c0
Resource(/chk/LSST/1234567890): 20150211-19:23:21, Error
merging result, 1420, Result message MD5 mismatch)
0211 19:23:31.664 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:194) - Query exec finish. 1
dispatched.
0211 19:23:31.664 [0x7fcb535fe700] DEBUG root
(build/qdisp/MessageStore.cc:49) - Msg: 1234567890 1215 Error
merging result 1420 (Result message MD5 mismatch) 1423711401
0211 19:23:31.664 [0x7fcb535fe700] INFO root
(build/qdisp/Executive.cc:197) - Query exec error:. 1 != 0
0211 19:23:31.664 [0x7fcb535fe700] INFO root
(build/rproc/InfileMerger.cc:325) - Merged
qservResult.result_4492663602 into
qservResult.result_4492663602
0211 19:23:31.664 [0x7fcb535fe700] ERROR root
(build/ccontrol/UserQuery.cc:221) - Joined everything
(failure!)
0211 19:23:31.665 [0x7fcb535fe700] INFO root (app.py:569) -
Query exec (7) took 10.001105 seconds
0211 19:23:31.665 [0x7fcb535fe700] ERROR root
(build/qdisp/Executive.cc:307) - Ref=1
Resource(/chk/LSST/1234567890): 20150211-19:23:21, Error
merging result, 1420, Result message MD5 mismatch
0211 19:23:31.666 [0x7fcb535fe700] DEBUG root (app.py:389) -
reporting -1 -1 Ref=1 Resource(/chk/LSST/1234567890):
20150211-19:23:21, Error merging result, 1420, Result message
MD5 mismatch
0211 19:23:31.666 [0x7fcb535fe700] ERROR root
(build/qdisp/MessageStore.cc:47) - Msg: -1 -1 Ref=1
Resource(/chk/LSST/1234567890): 20150211-19:23:21, Error
merging result, 1420, Result message MD5 mismatch
0211 19:23:31.667 [0x7fcb535fe700] INFO root (app.py:574) -
Final state of all queries error
0211 19:23:31.668 [0x7fcb521fc700] INFO root
(build/ccontrol/UserQuery.cc:251) - Discarded UserQuery(7)
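The sizes in the error line of the log above can be checked with a few lines of Python. The only assumption is that the 2MB limit means 2 * 1024 * 1024 bytes; the two numbers are copied from the MergingRequester message.

```python
MIB = 1024 * 1024

expected = 2378306  # "expected" value in the MergingRequester error line
received = 2097152  # "got" value, also the buflen of the buffer flagged (last)

# The buffer that was delivered is exactly 2 MiB long.
is_exactly_2mib = received == 2 * MIB

# Bytes of the result message that never reached the merger.
missing = expected - received

# Number of 2 MiB buffers needed to carry the whole message (ceiling division).
buffers_needed = -(-expected // (2 * MIB))

print(is_exactly_2mib, missing, buffers_needed)  # True 281154 2
```

In other words, the message was cut at the buffer boundary and 281154 bytes are missing, which is consistent with the MD5 mismatch reported afterwards.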
On Feb 11, 2015, at 2:36 PM, Fabrice Jammes wrote:
Hi Tatiana,
I think you're right. The 2MB xrootd limit can produce this error
message.
qserv-czar.log provides more accurate information, but I
don't know if you can access it?
Cheers,
Fabrice
On 02/11/2015 02:23 PM, Tatiana Goldina wrote:
I don't think I have seen this error
before, but it may be related to the 2MB xroot limit
(DM-1847). If the limit is lower, I do get the data.
sql> select * from Science_Ccd_Exposure where
scisql_s2PtInCircle(ra, decl, 0.5, 1.0, 0.5)=1 LIMIT 3000
[2015-02-11 13:43:17] [42S02][1051] Unknown table
'result_4414650179'
[2015-02-11 13:43:17] [Proxy][4120] Error during
execution:
-1 Ref=1
Resource(/chk/qservTest_caseSUI_qserv/1234567890):
20150211-13:43:06, Error merging result, 1420, Result
message MD5 mismatch (-1)
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following
link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
########################################################################