QSERV-L Archives


QSERV-L@LISTSERV.SLAC.STANFORD.EDU



QSERV-L  January 2014

Subject: Re: About DC2013 data-loading procedure
From: Serge Monkewitz <[log in to unmask]>
Reply-To: General discussion for qserv (LSST prototype baseline catalog)
Date: Thu, 16 Jan 2014 17:54:15 -0800


Hi Fabrice,

OK, I read through that page. I just want to point out that you will have to be careful to partition based on the deep source position, not the deep forced source positions. This is because duplicate forced sources are not guaranteed to have identical (ra, decl) coordinates. They will, however, be associated with the same deep source (have identical deepSourceId values). Thus, to run the deduplication procedure from the trac page on chunks, you'll need to ensure that duplicates always end up in the same chunk.
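
To illustrate why the partitioning position matters, here is a minimal Python sketch. It is not the qserv partitioner: `toy_chunk_id`, the 1-degree chunk width, and all coordinates are made up for illustration. It only shows that two duplicates with slightly different positions can straddle a chunk boundary, while partitioning on the shared deep source position keeps them together.

```python
import math

CHUNK_WIDTH_DEG = 1.0  # hypothetical chunk size, for illustration only


def toy_chunk_id(ra, decl, width=CHUNK_WIDTH_DEG):
    """Toy stand-in for a spatial partitioner: bin (ra, decl) into a grid cell."""
    cells_per_stripe = int(360 / width)
    stripe = int(math.floor((decl + 90.0) / width))
    cell = int(math.floor((ra % 360.0) / width))
    return stripe * cells_per_stripe + cell


# Deep source positions keyed by deepSourceId (made-up values).
deep_sources = {42: (10.0001, 5.0)}

# Two duplicate forced sources of deep source 42. Their own positions
# differ slightly and straddle the toy chunk boundary at ra = 10.0.
forced_sources = [
    {"id": 1, "deepSourceId": 42, "ra": 9.99995, "decl": 5.0},
    {"id": 2, "deepSourceId": 42, "ra": 10.00005, "decl": 5.0},
]

# Partitioning on each forced source's own position splits the pair ...
by_own_position = {
    fs["id"]: toy_chunk_id(fs["ra"], fs["decl"]) for fs in forced_sources
}

# ... while partitioning on the deep source position keeps it together.
by_deep_position = {
    fs["id"]: toy_chunk_id(*deep_sources[fs["deepSourceId"]])
    for fs in forced_sources
}
```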

I’m not sure whether the deep forced source data includes the (ra, decl) of the deep source it was derived from.

If it does not, things will get “interesting”. It should be the case that duplicates have positions that are extremely close to one another. So the first thing to do would be to go ahead and partition with the deep forced source position anyway. To check that no duplicates were split across chunks, you’ll want to set up the partitioner such that each chunk contains exactly one sub-chunk, and such that the overlap radius is non-zero but small (let’s say an arc-minute). This way, the partitioner will split input into chunks, and, for each chunk, provide nearby rows (the overlap). If the two deep forced sources in a duplicate pair are assigned to different chunks, then one will be in the overlap of the chunk for the other, and vice-versa.
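
The overlap argument can be sketched in one dimension. This is a simplified illustration, not how the partitioner computes overlap: it uses right ascension only, ignores wrap-around at 0/360 degrees and the cos(decl) factor, and the names `chunk_of` and `overlap_chunks` as well as the chunk width are assumptions.

```python
import math

CHUNK_WIDTH_DEG = 1.0     # hypothetical chunk width
OVERLAP_DEG = 1.0 / 60.0  # one arc-minute overlap radius


def chunk_of(ra):
    """Chunk that owns a row at this right ascension (toy 1-D binning)."""
    return int(math.floor(ra / CHUNK_WIDTH_DEG))


def overlap_chunks(ra):
    """Chunks, other than the owner, whose overlap region contains this row."""
    own = chunk_of(ra)
    near = {chunk_of(ra - OVERLAP_DEG), chunk_of(ra + OVERLAP_DEG)}
    near.discard(own)
    return near


# A duplicate pair whose members sit on either side of the boundary at ra = 10.
a_ra, b_ra = 9.99995, 10.00005

# Each member is owned by one chunk and appears in the overlap of the other.
a_in_b_overlap = chunk_of(b_ra) in overlap_chunks(a_ra)
b_in_a_overlap = chunk_of(a_ra) in overlap_chunks(b_ra)
```

As long as the separation between duplicates is smaller than the overlap radius, a split pair always shows up this way, which is what makes the check in the next step possible.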

So, load all chunk and chunk overlap tables, and check for the existence of split duplicate pairs by testing whether equi-joining a chunk and its overlap on deep source ID yields any rows.
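
As a rough sketch of that check, the following uses Python's built-in sqlite3 module as a stand-in for the real database. The table names (`ForcedSource_9`, `ForcedSourceOverlap_9`), the column set, and the helper `split_pairs` are hypothetical; only the equi-join on deepSourceId comes from the description above.

```python
import sqlite3


def split_pairs(conn, chunk_table, overlap_table):
    """Return (deepSourceId, chunk row id, overlap row id) for split pairs.

    Table names cannot be bound as SQL parameters, so they are interpolated
    here and must come from a trusted source.
    """
    query = (
        f"SELECT c.deepSourceId, c.id, o.id "
        f"FROM {chunk_table} AS c "
        f"JOIN {overlap_table} AS o ON c.deepSourceId = o.deepSourceId"
    )
    return conn.execute(query).fetchall()


# Tiny in-memory example with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ForcedSource_9 (id INTEGER, deepSourceId INTEGER);
    CREATE TABLE ForcedSourceOverlap_9 (id INTEGER, deepSourceId INTEGER);
    INSERT INTO ForcedSource_9 VALUES (1, 42), (3, 7);
    INSERT INTO ForcedSourceOverlap_9 VALUES (2, 42), (4, 99);
    """
)

# Deep source 42 has one row in the chunk and one in its overlap,
# so the pair (1, 2) is reported as split.
rows = split_pairs(conn, "ForcedSource_9", "ForcedSourceOverlap_9")
```

An empty result for every chunk means no duplicate pair was split.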

Hopefully you will not encounter any cases where this actually happens, in which case you can just drop all the overlap tables. But if you are unlucky, you’ll need to deal with the annoyance of picking a chunk for each split duplicate pair (I would just assign the pair to the chunk with the smaller ID), and adding/removing rows from chunks to reflect your decisions.
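
One possible way to apply the "smaller chunk ID wins" rule is sketched below on in-memory data. The function `resolve_split_pairs` and the row layout are hypothetical; in practice the same moves would be done with inserts and deletes on the chunk tables.

```python
def resolve_split_pairs(chunks):
    """Move every row of a split deep source to the smallest chunk ID.

    chunks maps chunk ID -> list of rows, where each row is a dict with
    at least the keys "id" and "deepSourceId". Returns a new mapping and
    leaves the input untouched.
    """
    # First pass: find the smallest chunk ID in which each deep source occurs.
    home = {}
    for chunk_id, rows in chunks.items():
        for row in rows:
            key = row["deepSourceId"]
            home[key] = min(home.get(key, chunk_id), chunk_id)

    # Second pass: place each row in the home chunk of its deep source.
    resolved = {chunk_id: [] for chunk_id in chunks}
    for chunk_id in sorted(chunks):
        for row in chunks[chunk_id]:
            resolved[home[row["deepSourceId"]]].append(row)
    return resolved


# Made-up example: deep source 42 is split across chunks 9 and 10.
chunks = {
    10: [{"id": 2, "deepSourceId": 42}],
    9: [{"id": 1, "deepSourceId": 42}, {"id": 3, "deepSourceId": 7}],
}
resolved = resolve_split_pairs(chunks)
ids_in_chunk_9 = sorted(row["id"] for row in resolved[9])
```

After this step both rows of deep source 42 live in chunk 9, and the overlap tables are no longer needed.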

I’m happy to help if you run into any problems!

Cheers,
Serge

On Jan 16, 2014, at 2:23 PM, Fabrice Jammes <[log in to unmask]> wrote:

> Hello Serge,
> 
> Interesting information is available here:
> https://dev.lsstcorp.org/trac/wiki/Summer2013/ConfigAndStackTestingPlans/DedupeForcedSources
> 
> Thanks for your offer to help with the partitioner; I may contact you soon.
> 
> Have a nice day,
> 
> Fabrice
> 
> ########################################################################
> Use REPLY-ALL to reply to list
> 
> To unsubscribe from the QSERV-L list, click the following link:
> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1

