QSERV-L Archives


QSERV-L@LISTSERV.SLAC.STANFORD.EDU


QSERV-L May 2014

Subject: Doing subchunking with overlap
From: "Daniel L. Wang"
Reply-To: General discussion for qserv (LSST prototype baseline catalog)
Date: Sat, 3 May 2014 10:06:40 -0700

Hello Qserv clan,

On my run, I was thinking about the subchunking and overlap problem, and 
the dec-stripe solution, and I think I've come up with an in-between 
solution.

One of the reasons that we just stored overlap, and then queried by 
joining against subchunk+overlap, was that we didn't want to compute the 
exact overlap region for each query, for each subchunk. How about this:

* Store subchunks, but not overlap.
* Query against subchunk + adjacent subchunks. Effectively, this gives 
us overlap ~= subchunk width.
* Computing the adjacency may have a noticeable cost if the czar has to 
do it (a full scan of 10k chunks, each with 200 subchunks = 2 million 
subchunk adjacency computations), so maybe we can push it to the 
workers, where it can be computed almost for free (actually, we can 
cache it on each worker).
* Subchunks are built on the fly, so we skip computing overlap subchunks 
completely and build half as many temp tables as before.
* Work around subchunks from adjacent chunks by storing them under 
virtual subchunk numbers.
* This requires somewhat more code and complexity, but also eliminates 
the previous overlap management, so the net cost/complexity increase is 
low(?).
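
To make the adjacency idea concrete, here is a minimal sketch of what a worker could cache. Everything in it is an assumption for illustration: the flat ROWS x COLS grid, the row-major numbering, and the padded-grid scheme for virtual subchunk numbers are hypothetical and are not Qserv's actual numbering, and the sketch ignores spherical geometry entirely.

```python
from functools import lru_cache

# Hypothetical layout: each chunk is a flat ROWS x COLS grid of subchunks,
# numbered row-major. 10 x 20 = 200 subchunks, matching the estimate above.
ROWS, COLS = 10, 20
NUM_CHUNKS = 10_000


def virtual_id(row, col):
    """Number a cell on a grid padded by one ring around the chunk.

    Hypothetical scheme: cells with row == -1, row == ROWS, col == -1 or
    col == COLS belong to neighbouring chunks and only get an id here.
    """
    return (row + 1) * (COLS + 2) + (col + 1)


@lru_cache(maxsize=None)  # stands in for the per-worker cache
def neighbours(subchunk_id):
    """Return (local_ids, virtual_ids) for the 8 cells around a subchunk."""
    row, col = divmod(subchunk_id, COLS)
    local, virtual = [], []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the subchunk itself
            r, c = row + dr, col + dc
            if 0 <= r < ROWS and 0 <= c < COLS:
                local.append(r * COLS + c)         # same chunk
            else:
                virtual.append(virtual_id(r, c))   # adjacent chunk
    return tuple(local), tuple(virtual)


# Size of the full-scan computation if the czar did it all up front.
total_adjacency_computations = NUM_CHUNKS * ROWS * COLS
```

In this toy layout an interior subchunk has eight local neighbours and no virtual ones, while a corner subchunk has three local and five virtual. Because the layout is identical for every chunk, one cached table per worker would cover all of its chunks.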


A different idea: we build subchunk tables on the fly because the 
MySQL optimizer is too stupid to use a "subchunkid=X" condition in the 
WHERE clause to its fullest effect. Did we try coaxing it with a 
subquery?

i.e.,

    SELECT o1.blah, o2.blah
    FROM (SELECT ... FROM Object_N WHERE subchunkid = X) AS o1,
         (SELECT ... FROM Object_N WHERE subchunkid = X) AS o2
    WHERE ...

instead of

    SELECT o1.blah, o2.blah
    FROM Object_N AS o1, Object_N AS o2
    WHERE subchunkid = X AND ...;
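
A quick way to sanity-check that the two forms return the same rows is a toy table. The sketch below uses SQLite from Python purely as a stand-in, so it says nothing about how the MySQL optimizer would plan either query. The columns (objectId, ra), the sample rows, and the join condition are made up for illustration, and the flat form qualifies subchunkid on both aliases, which is an assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; only the table name comes from the message.
conn.execute(
    "CREATE TABLE Object_N (objectId INTEGER, subchunkid INTEGER, ra REAL)"
)
conn.executemany(
    "INSERT INTO Object_N VALUES (?, ?, ?)",
    [(1, 7, 10.0), (2, 7, 10.1), (3, 8, 11.0), (4, 7, 10.2)],
)

X = 7

# Subquery form: filter each side down to the subchunk before joining.
subquery_form = """
SELECT o1.objectId, o2.objectId
FROM (SELECT objectId, ra FROM Object_N WHERE subchunkid = :x) AS o1,
     (SELECT objectId, ra FROM Object_N WHERE subchunkid = :x) AS o2
WHERE o1.objectId < o2.objectId
ORDER BY o1.objectId, o2.objectId
"""

# Flat form: self-join the full table and filter in the WHERE clause.
flat_form = """
SELECT o1.objectId, o2.objectId
FROM Object_N AS o1, Object_N AS o2
WHERE o1.subchunkid = :x AND o2.subchunkid = :x
  AND o1.objectId < o2.objectId
ORDER BY o1.objectId, o2.objectId
"""

rows_sub = conn.execute(subquery_form, {"x": X}).fetchall()
rows_flat = conn.execute(flat_form, {"x": X}).fetchall()
```

Both queries return the same pairs here, so the open question is only whether MySQL executes the subquery form more efficiently; comparing EXPLAIN output for the two on a real chunk table would answer that.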

Have a great weekend, and thanks for the great week!
-Daniel
