Ok, we're on the same page.

Having a real test case should help a lot in finding good 
optimizations for this interesting problem.
The in2p3 cluster can of course be used to run this kind of experiment, 
and of course I'm interested in contributing ;-)

Cheers,

Fabrice


On 02/26/2015 04:08 PM, Daniel L. Wang wrote:
>> Using the secondary index (built using Object table) to compute the 
>> chunk of a given Source w.r.t. its objectId column would avoid this 
>> join, wouldn't it?
>> For example, if a source i has objectId field equal to j, then we can 
>> query the secondary index on objectId=j to get the chunk of the 
>> source, this should work.
>> Of course we have to build the secondary index prior to this operation.
> This is effectively a join, no? I'm not suggesting sending a SQL join 
> query into the czar's normal pipeline. But looking up chunkId with the 
> secondary index is an index-only join. I think we might still want to 
> create a smaller lookup table for each batch of child table rows, 
> depending on how fast we can make the full index lookups: Is it faster 
> to do 10 million full-index lookups (on disk), or 100k full-index 
> lookups, create a 100k hash table, and 10 million lookups on the 
> in-memory table? I don't know yet. 
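
To make the two options concrete, here is a rough sketch of what could be compared. It is only an illustration: CountingIndex is a hypothetical in-memory stand-in for the on-disk secondary index (objectId -> chunkId), and the function names are made up for this example, not taken from the Qserv code. It counts full-index lookups only; it says nothing about actual disk cost, which is the part that needs measuring.

```python
from typing import Dict, Iterable, List, Tuple


class CountingIndex:
    """Hypothetical stand-in for the secondary index (objectId -> chunkId).

    A real index would live on disk; this one only counts how many
    full-index lookups each strategy performs.
    """

    def __init__(self, mapping: Dict[int, int]) -> None:
        self._mapping = dict(mapping)
        self.lookups = 0

    def chunk_of(self, object_id: int) -> int:
        self.lookups += 1
        return self._mapping[object_id]


def assign_direct(sources: Iterable[Tuple[int, int]],
                  index: CountingIndex) -> List[Tuple[int, int]]:
    """One full-index lookup per child (Source) row."""
    return [(source_id, index.chunk_of(object_id))
            for source_id, object_id in sources]


def assign_batched(sources: Iterable[Tuple[int, int]],
                   index: CountingIndex) -> List[Tuple[int, int]]:
    """One full-index lookup per distinct objectId in the batch,
    then in-memory hash lookups for every child row."""
    rows = list(sources)
    distinct_ids = {object_id for _, object_id in rows}
    # Small per-batch lookup table built from the full index.
    local = {oid: index.chunk_of(oid) for oid in distinct_ids}
    return [(source_id, local[object_id]) for source_id, object_id in rows]
```

Both functions take (sourceId, objectId) pairs and return (sourceId, chunkId) pairs, so they can be swapped in a benchmark. The batched variant only pays off when a batch has many Source rows per Object, as in the 10 million vs. 100k case above.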
