On 9/24/13 14:54, Kian-Tat Lim wrote:
> Mario,
>
>>> Here's something for you and the team to think about in November: how
>>> would you modify qserv to download and cache chunks on-demand?
>>>
>>> Imagine the following scenario: scientist M at university H has access
>>> to 200+ nodes w. petabytes of storage.
>
> The same exact number of nodes and amount of storage as we have?
> Or something smaller?
>
Larger, the same, or smaller; it shouldn't matter. If smaller, throw an
exception when the disk fills up.
> Are you anticipating that many users will *not* do full-table
> scans? Once they do, they have everything (for that table).
>
> Chunks are expected to be multiple terabytes in size, which
> means that downloads are hours long.
>
Maybe.
Or, by ~2025, maybe not, if we have our chunks mirrored on various CDNs
and everyone is using terabit networks. Or maybe this drives us to
rethink how we vertically (and horizontally) partition our tables.
>>> It may also solve our issue with the number of replicas needed to
>>> guard against failure, since we could configure our Archive center
>>> database to fetch any chunks that it doesn't have (e.g., because the
>>> nodes have failed) from the Chilean or French site.
>
> It's not clear that it's faster to get from France or Chile than
> from a local backup. In any case, copying from anywhere else still
> means that we're down.
>
Or you can think of it as being in degraded mode -- when a shared scan
hits a chunk that needs to be downloaded (remotely, or from tape), it
will block until the chunk is in place.
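To make the idea concrete, here's a minimal sketch of such an on-demand
chunk cache. Everything here (the `ChunkCache` class, its `get`/`fetch`
interface) is hypothetical, not actual qserv code: a scan worker calls
`get()`, blocks while a missing chunk is fetched from wherever it lives
(remote site, tape, CDN), and gets an exception if the local disk is full.

```python
import threading

class ChunkCache:
    """Hypothetical on-demand chunk cache (not qserv's actual design).

    get() returns a local chunk immediately; otherwise the caller blocks
    in degraded mode while the chunk is fetched, and an IOError is raised
    if caching it would exceed the local disk budget.
    """

    def __init__(self, capacity_bytes, fetch):
        self.capacity = capacity_bytes  # local disk budget
        self.used = 0
        self.chunks = {}                # chunk_id -> bytes (stand-in for on-disk files)
        self.fetch = fetch              # callable: chunk_id -> bytes (remote site, tape, CDN)
        self.lock = threading.Lock()

    def get(self, chunk_id):
        with self.lock:
            if chunk_id in self.chunks:
                return self.chunks[chunk_id]
        data = self.fetch(chunk_id)     # degraded mode: the scan blocks here
        with self.lock:
            if self.used + len(data) > self.capacity:
                raise IOError("disk full: cannot cache chunk %r" % chunk_id)
            self.chunks[chunk_id] = data
            self.used += len(data)
            return data
```

A smaller site would size `capacity_bytes` to its hardware and hit the
disk-full exception sooner; the Archive center could use the same path to
re-fetch chunks lost to node failures instead of keeping extra replicas.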
Just throwing out ideas...
--
Mario Juric,
Data Mgmt. Project Scientist, Large Synoptic Survey Telescope
Web : http://www.cfa.harvard.edu/~mjuric/
Phone : +1 617 744 9003 PGP: ~mjuric/crypto/public.key
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1