Hi Serge,

Thanks a lot, we will try it and will send feedback.

Dominique
On 13/01/2014 12:10, Serge Monkewitz wrote:
> Dominique, Fabrice,
>
>     Just in case it’s useful, there are some instructions on running 
> the new partitioner here:
> https://dev.lsstcorp.org/trac/wiki/db/Qserv/Partitioner
>
> Feedback is most welcome - I don’t think anyone has really exercised 
> the new partitioner yet.
>
> Serge
>
> On Jan 13, 2014, at 12:01 PM, Dominique Boutigny <[log in to unmask]> wrote:
>
>> Jacek,
>>
>> Thanks for your answer. I will let Fabrice and Osman deal with 
>> this. I guess that Fabrice knows how to run the partitioner, but he 
>> may have some specific questions.
>>
>> Dominique
>> On 13/01/2014 11:54, Jacek Becla wrote:
>>> Dominique
>>>
>>> I think the best approach would be to break the largest tables
>>> into a reasonably small number of smaller tables (say 16); that
>>> should put the table sizes back into the regime where things
>>> work reasonably fast. I think writing a trivial
>>> python script to do that would be easier than going through
>>> the partitioner. But indeed, if it is experimental, we could
>>> run it through the partitioner. Would you like the instructions
>>> on how to use the partitioner, or will you try dividing
>>> the largest tables through a simple python script first?
>>>
>>> Jacek
>>>
>>>
>>>
>>>
>>>
>>> On 01/13/2014 10:36 AM, Dominique Boutigny wrote:
>>>> Hi Jacek,
>>>>
>>>> As far as I understand, there are no unsolved issues with the standard
>>>> (non-qserv) procedure; it is just awfully long, at the limit of
>>>> usability. The idea here was to try qserv partitioning to see whether
>>>> it improves performance for the ingestion step as well as for
>>>> user access.
>>>> There are no issues with the Data Challenge, as IN2P3 has already fulfilled
>>>> its commitment to LSST; the goal is to explore this partitioning
>>>> functionality on a single MySQL node.
>>>>
>>>> Dominique
>>>>
>>>> On 13/01/2014 10:22, Jacek Becla wrote:
>>>>> Fabrice
>>>>>
>>>>> It'd be better to not pull Qserv into any Data Challenges yet.
>>>>>
>>>>> Is there an issue with using the merge engine? If we use
>>>>> partitioner to just break data into separate tables, we can
>>>>> do that easily without partitioner, and if we start using the
>>>>> advanced features like overlap, it will quickly get out of hand.
>>>>>
>>>>> Can Osman send more specific information about what is not working?
>>>>> As I said earlier, I have not heard about any issues since I
>>>>> exchanged emails with him on Dec 11 (which I just forwarded
>>>>> to you).
>>>>>
>>>>> Jacek
>>>>>
>>>>>
>>>>>
>>>>> On 01/13/2014 09:53 AM, Fabrice Jammes wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Osman Aidel, a CC-IN2P3 expert in database administration, is
>>>>>> currently trying to load into MySQL the 3 TB dataset produced during
>>>>>> the last data challenge.
>>>>>> Osman and Dominique Boutigny succeeded in loading the whole dataset
>>>>>> into MySQL, but some post-processing steps on this dataset (like
>>>>>> removal of duplicates) take a seemingly infinite amount of time.
>>>>>>
>>>>>> Please note that these issues are good news for Qserv, as they
>>>>>> validate its distributed data model ;-).
>>>>>>
>>>>>> Christian Arnault, French manager for LSST computing, thinks that
>>>>>> some of the tools developed by the Qserv team could help Osman and
>>>>>> CC-IN2P3. Indeed, the partitioning algorithm developed by Serge could
>>>>>> be used to partition the DC dataset into a collection of chunks.
>>>>>> Osman could then load a set of contiguous chunks from this collection
>>>>>> into a single-node MySQL server.
>>>>>>
>>>>>> Do you think this proposal could be achieved soon? French
>>>>>> physicists are interested in studying a representative sample of
>>>>>> the DC dataset, and this solution would help them a lot.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Fabrice
>>>>>>
>>>>>
>>>>> ########################################################################
>>>>> Use REPLY-ALL to reply to list
>>>>>
>>>>> To unsubscribe from the QSERV-L list, click the following link:
>>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
>>>>
>>>
>>
>> --
>> Dominique Boutigny  -  CNRS / CC-IN2P3
>> Now at SLAC National Accelerator Laboratory
>>
>> Mail     : [log in to unmask] -  [log in to unmask]
>> Office   : +1 650-926-5759   -  Cellular : +1 774-232-0912
>>
>

-- 
Dominique Boutigny  -  CNRS / CC-IN2P3
Now at SLAC National Accelerator Laboratory

Mail     : [log in to unmask] -  [log in to unmask]
Office   : +1 650-926-5759   -  Cellular : +1 774-232-0912


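
Editor's illustration: Jacek's suggestion in the thread above, breaking a very large table into about 16 smaller ones with a trivial python script, could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the script anyone in the thread actually used: the table name `Object`, the key column `objectId`, and the helper `split_statements` are hypothetical, the key is assumed to be a non-negative integer, and the script only generates MySQL statements rather than connecting to a server.

```python
def split_statements(table, key, n_parts=16):
    """Return MySQL statements that split `table` into `n_parts` smaller tables.

    Rows are assigned to a part by `key` modulo `n_parts`, so `key` is assumed
    to be a non-negative integer column (hypothetical; adapt to the real schema).
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    statements = []
    for i in range(n_parts):
        part = f"{table}_{i:02d}"
        # Create an empty table with the same columns and indexes as the source.
        statements.append(f"CREATE TABLE `{part}` LIKE `{table}`;")
        # Copy only the rows whose key falls into this part.
        statements.append(
            f"INSERT INTO `{part}` SELECT * FROM `{table}` "
            f"WHERE MOD(`{key}`, {n_parts}) = {i};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table and key names, for illustration only.
    for stmt in split_statements("Object", "objectId"):
        print(stmt)
```

The printed statements could be reviewed and then piped into the `mysql` client. Splitting by key modulo keeps the parts roughly equal in size but ignores sky position, so unlike the partitioner it produces no spatially contiguous chunks and no overlap.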