Print

Print


Can you post table schema(s), the full query text, and query plan?

One concern I have with going down the “optimize near neighbor joins at the chunk level” route, even though I also find it very attractive, is that these approaches invariably involve the optimizer doing the right thing with some kind of index. If for some reason the optimizer decides not to use the plan that we envision, say due to other query constraints, then our queries can turn into full chunk-chunk cross products, which I think is basically equivalent to having infinite run-ttime.

On Jan 8, 2015, at 4:31 PM, Jacek Becla <[log in to unmask]> wrote:

> Let's expose it to others from the group...
> 
> BTW, I haven't seen the french article, Fabrice, please send the link.
> 
> For all who are puzzled what we are talking about:
> Fabrice found an article where the author showed how composite
> index helps in some totally unexpected ways, so we decided to
> give it a try in hope it can beat our existing subChunk-based
> approach.
> 
> Jacek
> 
> 
> 
> -------- Forwarded Message --------
> Subject: Re: comp index
> Date: Thu, 8 Jan 2015 16:01:52 -0800
> From: Daniel L. Wang <[log in to unmask]>
> To: Jacek Becla <[log in to unmask]>, Fabrice Jammes <[log in to unmask]>
> 
> This is weird. comp index does not help select-where-subchunkid, but it
> seems to help nn so that it is almost comparable to creating subchunk
> tables on the fly.
> 
> (on lsst-dev01)
> 
> ('X1', 0.9054880142211914)
> ('X2', 4.438582181930542)
> ('X1', 0.8882651329040527)
> ('X2', 4.423677921295166)
> ('X1', 6.960800886154175)
> ('X2', 4.552653074264526)
> ('X1', 4.225889205932617)
> ('X2', 4.377969980239868)
> 
> first 4: select * from ... where subchunkid=x for X1, X2, X1, X2
> next 2: nn select for X1, X2 (10 iterations)
> next 2: nn select for X1, X2 on on-the-fly subchunk tables (10 iterations)
> 
> The "normal index" does 10 iterations of subchunk select in ~7s, while
> the composite index gets ~4.5, which is close to the on-the-fly subchunk
> version at 4.2s and 4.3s. The on-the-fly-subchunk is a bit faster with
> the single column indexes for some reason. (in-memory temp tables are
> slightly faster, but not by much, at this scale).
> 
> 
> This might be worth posting so that there is an english-language
> equivalent for Fabrice's link.
> 
> Code attached.
> 
> -Daniel
> 
> On 01/08/2015 02:42 PM, Jacek Becla wrote:
>> composite index does not help, as far as I can tell.
>> 
>> Attached is the program I run
>> 
>> Results:
>> 
>> ('X1', 0.4587819576263428)
>> ('X2', 1.813704013824463)
>> ('X1', 0.14171791076660156)
>> ('X2', 0.14273715019226074)
> 
> 
> 
> 
> 
> ########################################################################
> Use REPLY-ALL to reply to list
> 
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
> <compIdx.py>

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1