Dear all,
I find this application case very appealing. I expect that astronomical users keep asking the same kinds of queries over and over with small variants… I also expect that the hit rate depends strongly on which columns are accessed. And I know for sure that for many applications we want a reduced version of the data, containing only the information relevant to that application.
Do you have access to the SDSS database query logs, so this could be checked against a real case?
There is an old piece of astronomer lore which says that a zipped text file is more powerful than any DB. I believe the modern way of saying this is caching. When working on my science, I will want direct access to a reduced dataset in order to optimize performance. And we could imagine science collaborations and/or local universities running their own cache for this, especially if it lowered the load on the central system.
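To make the idea concrete, here is a rough sketch of what such a local cache could look like: chunks are staged on demand from the central permanent store, and only the most recently used ones are kept locally. The names, the URL layout and the eviction limit are purely illustrative assumptions on my part, not any actual Qserv interface.

    # Sketch of an on-demand local chunk cache (hypothetical names only:
    # the central store URL, chunk layout and file naming are illustrative).
    import os
    import shutil
    import urllib.request
    from collections import OrderedDict

    class LocalChunkCache:
        """Keep the most recently used chunks on local disk, fetching
        misses from a central permanent store (e.g. the one at CC)."""

        def __init__(self, central_url, cache_dir, max_chunks=100):
            self.central_url = central_url   # base URL of the permanent store
            self.cache_dir = cache_dir       # local directory holding chunk files
            self.max_chunks = max_chunks     # crude eviction limit
            self.lru = OrderedDict()         # chunk_id -> local path, in LRU order
            os.makedirs(cache_dir, exist_ok=True)

        def get(self, chunk_id):
            """Return a local path for the chunk, downloading it on a miss."""
            if chunk_id in self.lru:
                self.lru.move_to_end(chunk_id)   # cache hit: refresh LRU order
                return self.lru[chunk_id]
            local_path = os.path.join(self.cache_dir, f"chunk_{chunk_id}.db")
            remote = f"{self.central_url}/chunks/{chunk_id}"  # hypothetical layout
            with urllib.request.urlopen(remote) as resp, open(local_path, "wb") as out:
                shutil.copyfileobj(resp, out)    # cache miss: stage from central
            self.lru[chunk_id] = local_path
            if len(self.lru) > self.max_chunks:
                _, evicted = self.lru.popitem(last=False)  # drop least recently used
                os.remove(evicted)
            return local_path

    # Example: a first query touching chunks 1042 and 1043 stages them once;
    # later queries with small variants are served from the local copies.
    # cache = LocalChunkCache("https://central-store.example", "/scratch/chunk_cache")
    # path = cache.get(1042)

Only the first access to a chunk pays the transfer cost; everything after that stays local, which is exactly where lowering the load on the central system would come from.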
In high energy physics, we used to rank data centers in tiers for LHC production: Tier 1s were national data centers such as CC, and Tier 2s were major local facilities like the one at LPC. One of the reasons for this architecture is that it was possible to get money for Tier 2s from regional funding that we would not have obtained for Tier 1s. Leveraging local funding for LSST computing could be of general benefit, if we find the appropriate data model.
If you come out of your brainstorming with a design proposal, we would be very interested in running tests with our local development platform serving as a local cache in front of a permanent system, for instance at CC.
Emmanuel
On Sep 25, 2013, at 01:32, Jacek Becla wrote:
> BTW, I added that use case to
>
> https://dev.lsstcorp.org/trac/wiki/db/DataDistribution
>
> Feel free to add your comments.
>
> It is linked off the db page in Trac
>
> Jacek
>
>
> On 9/24/2013 4:27 PM, Wang, Daniel Liwei wrote:
>> Hello,
>>
>>> Here's something for you and the team to think about in November: how
>>> would you modify qserv to download and cache chunks on-demand?
>>>
>> Without considering whether this would be a good idea or not, it's as
>> easy as implementing auto-staging, and integrating with the replication
>> management system (which we haven't written).
>>
>> Auto-staging doesn't sound that bad--when we recover from a failed node,
>> we have to retrieve and add chunks to existing nodes, so doing remote
>> retrieval is probably just incremental fragility and flexibility.
>>
>> -Daniel
>>
>
########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1