Let’s talk about it on the list.
The issue of scanning older releases is a big one, and it is still unclear. It comes down to balancing cost against what we guess users will want. I have RFC-134 open about it, and I am scheduled to talk to Mario on Monday to understand it better.
The thinking behind keeping the current and next chunk in memory was that we would fetch chunk x+1 while processing chunk x, which is already in memory (processing is pure CPU, no disk I/O). This would help us keep the disks busy.
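The overlap described above is essentially double-buffered prefetch. A minimal sketch, assuming hypothetical `fetch` (disk read) and `process` (CPU-only work) callables standing in for the real scan stages:

```python
import threading
import queue

def prefetch_scan(chunk_ids, fetch, process, depth=1):
    """Process chunk x while chunk x+1 is being fetched from disk.

    At most depth+1 chunks are resident at once: the one being
    processed plus the prefetched ones, matching the "2 chunks per
    node" memory requirement when depth=1.
    """
    fetched = queue.Queue(maxsize=depth)  # bounds in-memory chunks

    def reader():
        for cid in chunk_ids:
            fetched.put(fetch(cid))   # disk I/O runs ahead of processing
        fetched.put(None)             # sentinel: no more chunks

    threading.Thread(target=reader, daemon=True).start()

    results = []
    while (chunk := fetched.get()) is not None:
        results.append(process(chunk))  # pure CPU while reader fetches next
    return results
```

With `depth=1` the reader blocks once it is one chunk ahead, so the disks stay busy without memory growing beyond two chunks per scan.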
BTW, you might want to watch RFC-132 as well; it would get pretty nasty if we had to do efficient joins between data releases. RFC-133 is also interesting: we might need an additional scan for the Dia* tables (level 1 data processed through DRP). It is unclear yet whether that is needed (I am guessing it will be), and whether we would need any joins between Dia* and, say, Object. I'm checking with Mario on that.
I’ll keep you all updated on the input I get from Mario. I do want to sort out the requirements side of things (older DRs, cross-DR joins, etc.) in the next few weeks.
thanks,
Jacek
Hi Jacek,
I was looking through LDM-135 and came across a couple of things.
- one synchronized full table scan of Object_Narrow and
Object_Extra every 8 hours for the latest and previous data
releases.
This is the only scan
that specifically mentions the previous data releases. I'm not
sure why it is special like that. It also looks like we aren't
making any promises at all about other scans for older releases.
- For self-joins, a single shared scan will be sufficient; however,
each node must have sufficient memory to hold 2 chunks at any
given time (the processed chunk and next chunk). Refer to the
sizing model [LDM-141] for
further details on the cost of shared scans.
I'm not sure why a
node needs enough memory to hold the processed chunk and the next
chunk. I would think it only needs to have enough memory to deal
with the chunk it is working with right now, assuming this is per
shared scan.
I also had a couple of thoughts on the sizing model and shared
scans. It looks like we're heading for one cluster per release,
which is simple but means doing joins across releases is not easy.
If, on the other hand, it is decided to put all releases on the
same cluster, with the same chunks from all releases on one node,
the worker scheduler could be modified fairly easily to order
scans by chunkId and then release number. Provided worker nodes
and czars are added with each new release, it should work.
Joins on Object_Extra between releases could require a fair
bit of memory, but if one release is under-utilized, queries
against the others should be faster.
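The scheduler change John describes amounts to a different sort order on the pending scan queue. A minimal sketch, assuming hypothetical scan requests that carry a `chunkId` and a `release` tag (the field names are illustrative, not from the actual worker code):

```python
# Hypothetical scan requests queued on one worker node that hosts
# the same chunks for every data release.
pending = [
    {"release": "DR2", "chunkId": 7},
    {"release": "DR1", "chunkId": 7},
    {"release": "DR1", "chunkId": 3},
    {"release": "DR2", "chunkId": 3},
]

# Order by chunkId first, then release, so every release's copy of a
# chunk is scanned back to back while that chunk's data is hot.
ordered = sorted(pending, key=lambda s: (s["chunkId"], s["release"]))
```

This groups all releases' scans of chunk 3 before any scan of chunk 7, which is what lets cross-release work share a single pass over each chunk.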
-John