Hi Serge,

Thanks a lot, we will try it and will send feedback.

Dominique
On 13/01/2014 12:10, Serge Monkewitz wrote:
Dominique, Fabrice,

    Just in case it’s useful, there are some instructions on running the new partitioner here:
https://dev.lsstcorp.org/trac/wiki/db/Qserv/Partitioner

Feedback is most welcome - I don’t think anyone has really exercised the new partitioner yet.

Serge

On Jan 13, 2014, at 12:01 PM, Dominique Boutigny <[log in to unmask]> wrote:

Jacek,

Thanks for your answer. I will let Fabrice and Osman deal with this. I guess that Fabrice knows how to run the partitioner, but he may have some specific questions.

Dominique
On 13/01/2014 11:54, Jacek Becla wrote:
Dominique

I think the best would be to break the largest tables into
a reasonably small number of smaller tables (say 16; that
should put the table sizes back into the regime where things
work reasonably fast). I think writing a trivial
Python script to do that would be easier than going through
the partitioner. But indeed, if it is experimental, we could
run it through the partitioner. Would you like the instructions
on how to use the partitioner, or will you try dividing
the largest tables with a simple Python script first?

Jacek
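
The "trivial python script" mentioned above might look roughly like the sketch below. It only generates the SQL statements and does not connect to a database. The table name `Source`, the integer key column `sourceId`, and the choice of splitting by key modulo are assumptions made for illustration; they are not taken from the thread.

```python
def split_table_statements(table, key_column, n_parts=16):
    """Return SQL statements that copy `table` into n_parts smaller tables.

    Rows are assigned by MOD(key_column, n_parts), so key_column is assumed
    to be a non-negative integer key (hypothetical; adapt to the real schema).
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    statements = []
    for i in range(n_parts):
        part = f"{table}_{i:02d}"
        # CREATE TABLE ... LIKE copies the column and index definitions.
        statements.append(f"CREATE TABLE `{part}` LIKE `{table}`;")
        statements.append(
            f"INSERT INTO `{part}` SELECT * FROM `{table}` "
            f"WHERE MOD(`{key_column}`, {n_parts}) = {i};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table and key names, used only as an example.
    for stmt in split_table_statements("Source", "sourceId"):
        print(stmt)
```

The printed statements could be reviewed and then piped into the mysql client. Splitting on a spatial column instead of the key would keep neighbouring objects together, at the cost of less even table sizes.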





On 01/13/2014 10:36 AM, Dominique Boutigny wrote:
Hi Jacek,

As far as I understand, there are no unsolved issues with the standard
(non-Qserv) procedure; it is just awfully long, at the limit of
usability. The idea here was to try Qserv partitioning to see whether
it improves performance for the ingestion step as well as for
user access.
There are no issues with the Data Challenge, as IN2P3 has already fulfilled
its commitment to LSST; the goal is to explore this partitioning
functionality on a single MySQL node.

Dominique

On 13/01/2014 10:22, Jacek Becla wrote:
Fabrice

It'd be better to not pull Qserv into any Data Challenges yet.

Is there an issue with using the merge engine? If we use the
partitioner just to break data into separate tables, we can
do that easily without the partitioner, and if we start using the
advanced features like overlap, it will quickly get out of hand.

Can Osman send more specific information about what is not working?
As I said earlier, I have not heard about any issues since I
exchanged emails with him on Dec 11 (which I just forwarded
to you).

Jacek
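
For reference, the merge engine approach mentioned above could be sketched as follows. MySQL's MERGE storage engine presents several identically structured MyISAM tables as one table. The script only builds the SQL text; the table names are hypothetical, and the underlying tables are assumed to already exist as MyISAM tables with identical definitions.

```python
def merge_table_statements(merged, parts, insert_method="LAST"):
    """Return SQL statements defining a MERGE table over existing tables.

    `parts` must name identically structured MyISAM tables (an assumption
    this script cannot check). `insert_method` selects which underlying
    table receives new rows: NO, FIRST, or LAST.
    """
    if not parts:
        raise ValueError("at least one underlying table is required")
    if insert_method not in ("NO", "FIRST", "LAST"):
        raise ValueError("insert_method must be NO, FIRST, or LAST")
    union = ", ".join(f"`{p}`" for p in parts)
    return [
        # Copy the column and index definitions from the first part.
        f"CREATE TABLE `{merged}` LIKE `{parts[0]}`;",
        # Turn the empty copy into a MERGE table over all parts.
        f"ALTER TABLE `{merged}` ENGINE=MERGE UNION=({union}) "
        f"INSERT_METHOD={insert_method};",
    ]


if __name__ == "__main__":
    # Hypothetical names: 16 smaller tables behind one logical table.
    parts = [f"Source_{i:02d}" for i in range(16)]
    for stmt in merge_table_statements("Source_all", parts):
        print(stmt)
```

Queries against the merged table then read from all underlying tables, while loading and post-processing can still be done one small table at a time.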



On 01/13/2014 09:53 AM, Fabrice Jammes wrote:
Hello,

Osman Aidel, a CC-IN2P3 expert in database administration, is
currently trying to load into MySQL the 3 TB dataset produced during the
last data challenge.
Osman and Dominique Boutigny succeeded in loading the whole dataset into
MySQL, but some post-processing steps on this dataset (such as removal of
duplicates) take a seemingly infinite time.

Please note that these issues are good news for Qserv, as they validate
its distributed data model ;-).

Christian Arnault, French manager for LSST computing, thinks that some
of the tools developed by the Qserv team could help Osman and CC-IN2P3.
Indeed, the partitioning algorithm developed by Serge could be used to
partition the DC dataset into a collection of chunks.
Osman could then load a contiguous subset of these chunks
into a single-node MySQL server.

Do you think this proposal could be achieved soon? French
physicists are interested in studying a representative sample of the DC
dataset, and this solution would help them a lot.

Thanks,

Fabrice


######################################################################## 
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1



-- 
Dominique Boutigny  -  CNRS / CC-IN2P3
Now at SLAC National Accelerator Laboratory

Mail     : [log in to unmask] -  [log in to unmask]
Office   : +1 650-926-5759   -  Cellular : +1 774-232-0912


