On 09/04/2013 09:00 PM, Kian-Tat Lim wrote:
> Daniel,
>
>> The join code thinks that there are always benefits to subchunk a
>> subchunked table when it is being joined. It is stupid, but at least
>> it provides correct results under some circumstances. I added a TODO
>> to think about this, because it doesn't seem like something that we
>> can design and define in our heads as we write code. You might
>> consider the join predicate (hello, USING() and ON() syntax) and
>> what columns are indexed (hello, metadata) or approximate sizes of
>> chunk and subchunk tables. Sometimes it looks like query analyzer
>> and optimizer problem. It looks easy to get wrong.
> I don't think we should be trying to outsmart MySQL (or the DBA)
> here.

I agree that we shouldn't try to outsmart MySQL. But we are,
explicitly, for spatial self-joins. Because it is disastrous without
them. Right now, there is no code to detect equi-joins. There is also
no (well-understood) code to support more than a two-way spatial join.

> I believe the only queries for which subchunking is definitively
> better are spatially-limited self-joins, which hopefully are relatively
> easy to recognize. All other queries should just be passed through as
> is (at least until subchunking is proven to help them).

You have to know that you can pass them through. This might be simple,
but it hasn't been thought out. It would also be nice to know when the
query can't be executed accurately. These things haven't been defined
precisely, and perhaps I'm really dense, but the first cut at the logic
is broken and I felt there was hidden, dangerous complexity lurking
underneath. Still, I haven't revisited after the new parse framework,
so maybe it's much simpler now. Dare to dream.

-Daniel
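
For illustration only, the rule debated in this thread (subchunk
spatially-limited self-joins, pass everything else through) might be
sketched roughly as below. This is not Qserv code. The names Plan,
TableRef and choose_plan, the table names in the usage lines, and the
has_spatial_limit flag are all hypothetical, and the sketch assumes the
parser has already extracted the table references and decided whether a
spatial restriction is present. It does not answer the open question of
whether a pass-through query is always executed accurately.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    PASS_THROUGH = "pass-through"  # run against chunk tables unchanged
    SUBCHUNK = "subchunk"          # rewrite against subchunk tables
    UNSUPPORTED = "unsupported"    # no well-understood handling yet


@dataclass(frozen=True)
class TableRef:
    table: str  # underlying table name (hypothetical field)
    alias: str  # alias used in the query (hypothetical field)


def choose_plan(refs, subchunked_tables, has_spatial_limit):
    """Pick a plan from table references alone (a deliberate simplification)."""
    # Count how often each subchunked table is referenced in the query.
    counts = Counter(r.table for r in refs if r.table in subchunked_tables)
    most_refs = max(counts.values(), default=0)

    # Not a self-join on a subchunked table, or no spatial restriction:
    # follow the proposed rule and pass the query through as is.
    if most_refs < 2 or not has_spatial_limit:
        return Plan.PASS_THROUGH

    # More than a two-way spatial self-join: flagged rather than guessed at.
    if most_refs > 2:
        return Plan.UNSUPPORTED

    # Exactly two references to one subchunked table, spatially limited.
    return Plan.SUBCHUNK


# Usage with made-up table names:
pair = [TableRef("Object", "o1"), TableRef("Object", "o2")]
print(choose_plan(pair, {"Object"}, has_spatial_limit=True).value)
```

A real implementation would also have to look at the join predicate
(USING/ON) and detect equi-joins, which this sketch ignores entirely.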