LISTSERV mailing list manager LISTSERV 16.5

Help for QSERV-L Archives


QSERV-L Archives

QSERV-L Archives


QSERV-L@LISTSERV.SLAC.STANFORD.EDU


View:

Message:

[

First

|

Previous

|

Next

|

Last

]

By Topic:

[

First

|

Previous

|

Next

|

Last

]

By Author:

[

First

|

Previous

|

Next

|

Last

]

Font:

Proportional Font

LISTSERV Archives

LISTSERV Archives

QSERV-L Home

QSERV-L Home

QSERV-L  September 2013

QSERV-L September 2013

Subject:

Re: more testing...

From:

"Daniel L. Wang" <[log in to unmask]>

Reply-To:

General discussion for qserv (LSST prototype baseline catalog)

Date:

Wed, 4 Sep 2013 17:23:44 -0700

Content-Type:

text/plain

Parts/Attachments:

Parts/Attachments

text/plain (74 lines)

Hello,

On 09/04/2013 04:48 PM, Serge Monkewitz wrote:
>
>> Some other notes:
>> - I added LSST__Source (a copy of LSST__Object) to qservMeta. I 
>> probably could have soft-linked it to LSST__Object.*, but for now I 
>> just made a copy off the objectId index table.
>> - The fact that the qserv master doesn't recognize Source.objectId as 
>> special and look up a chunk for it in the objectId index table seems 
>> like a bug.
>
> So, it turns out that the current qserv master code sub-chunks the 
> above join (Daniel - is that expected?).
The join code thinks that there are always benefits to subchunk a 
subchunked table when it is being joined. It is stupid, but at least it 
provides correct results under some circumstances. I added a TODO to 
think about this, because it doesn't seem like something that we can 
design and define in our heads as we write code. You might consider the 
join predicate (hello, USING() and ON() syntax) and what columns are 
indexed (hello, metadata) or approximate sizes of chunk and subchunk 
tables. Sometimes it looks like query analyzer and optimizer problem. It 
looks easy to get wrong.

The code is only starting to realize that there is more than one table 
that might endure objectId lookups. It only recently allowed for the 
column to be named something other than "objectId". You can call it a 
bug. It is in the same class as "not implemented". The whole idea of 
indexing a database by one column needs some design discussion--for me, 
no details have been hashed out other than the 'we really need this 
master or boss table thing' hallway comment.

> In particular, I see stuff like this in the worker logs on ccqserv27 
> for chunk 12833, which contains  the objectId in question:
>
> Runner running Task: msg: session=19 chunk=12833 db=LSST frag: 
> q=SELECT s.ra,s.decl FROM LSST.Source_12833 AS 
> s,Subchunks_LSST_12833.Object_12833_0 AS o WHERE 
> o.objectId=154062577146538502 AND s.objectId=o.objectId,...
>
> But I also see the same for a bunch of different chunks, and the same 
> query shows up on other workers. This is what I would call a missed 
> opportunity by the master-side query generation logic. But the only 
> ill effect is that execution isn't as fast as it could be.
One of many. Should we gather a list? Could be useful for FDR (list of 
known ways to improve performance beyond our benchmarks).


 > If I understand things correctly, this means that the Object 
sub-chunks created on the fly by qserv workers
 > will always be empty, so that Source/Object join and Object 
near-neighbor queries will always come back
 > with nothing on the IN2P3 cluster. Something must  have gone 
seriously wrong with duplication or
 > partitioning of the Object table.

I think it would be useful to come up with a set of sanity-check queries 
to run on a worker. The sanity check probably involves the qms, and 
would be run as a post-load check (not required by qserv worker query 
execution code).
e.g.
SELECT count(chunkId), count(subChunkId), count(*) from <chunked table>
then make sure all are equal.
SELECT count(*) from <chunked table> where chunkId <> <chunk#>
then make sure count is 0.

-Daniel

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1

Top of Message | Previous Page | Permalink

Advanced Options


Options

Log In

Log In

Get Password

Get Password


Search Archives

Search Archives


Subscribe or Unsubscribe

Subscribe or Unsubscribe


Archives

March 2018
February 2018
January 2018
December 2017
August 2017
December 2016
November 2016
October 2016
September 2016
August 2016
July 2016
June 2016
May 2016
April 2016
March 2016
February 2016
January 2016
December 2015
November 2015
October 2015
September 2015
August 2015
July 2015
June 2015
May 2015
April 2015
March 2015
February 2015
January 2015
December 2014
November 2014
October 2014
September 2014
August 2014
July 2014
June 2014
May 2014
April 2014
March 2014
February 2014
January 2014
December 2013
November 2013
October 2013
September 2013
August 2013
July 2013
June 2013
May 2013
April 2013
March 2013
February 2013
January 2013
December 2012

ATOM RSS1 RSS2



LISTSERV.SLAC.STANFORD.EDU

Secured by F-Secure Anti-Virus CataList Email List Search Powered by the LISTSERV Email List Manager

Privacy Notice, Security Notice and Terms of Use