Jacek,

Thanks for your answer. I will let Fabrice and Osman deal with this. I guess that Fabrice knows how to run the partitioner, but he may have some specific questions.

Dominique

On 13/01/2014 11:54, Jacek Becla wrote:
> Dominique
>
> I think the best would be to break the largest tables into
> a reasonably small number of smaller tables (say 16); that
> should put the table sizes back into the regime where things
> work reasonably fast. I think writing a trivial
> python script to do that would be easier than going through
> the partitioner. But indeed, if it is experimental, we could
> run it by the partitioner. Would you like instructions
> on how to use the partitioner, or will you try dividing
> the largest tables with a simple python script first?
>
> Jacek
>
> On 01/13/2014 10:36 AM, Dominique Boutigny wrote:
>> Hi Jacek,
>>
>> As far as I understand, there are no unsolved issues with the standard
>> (non-Qserv) procedure; it is just awfully (and at the limit of
>> usability) long. The idea here was to try Qserv partitioning to see
>> whether it improves performance for the ingestion step as well as for
>> user access.
>> There are no issues with the Data Challenge, as IN2P3 has already
>> fulfilled its commitment to LSST; the goal is to explore this
>> partitioning functionality on a single MySQL node.
>>
>> Dominique
>>
>> On 13/01/2014 10:22, Jacek Becla wrote:
>>> Fabrice
>>>
>>> It'd be better not to pull Qserv into any Data Challenges yet.
>>>
>>> Is there an issue with using the MERGE engine? If we use the
>>> partitioner to just break data into separate tables, we can
>>> do that easily without the partitioner, and if we start using the
>>> advanced features like overlap, it will quickly get out of hand.
>>>
>>> Can Osman send more specific information on what is not working?
>>> As I said earlier, I have not heard about any issues since I
>>> exchanged emails with him on Dec 11 (which I just forwarded
>>> to you).
>>>
>>> Jacek
>>>
>>> On 01/13/2014 09:53 AM, Fabrice Jammes wrote:
>>>> Hello,
>>>>
>>>> Osman Aidel, a CC-IN2P3 expert in database administration, is
>>>> currently trying to load into MySQL the 3TB dataset produced during
>>>> the last data challenge.
>>>> Osman and Dominique Boutigny succeeded in loading the whole dataset
>>>> into MySQL, but some post-processing steps on this dataset (like
>>>> removal of duplicates) take a seemingly infinite time.
>>>>
>>>> Please note that these issues are good news for Qserv, as they
>>>> validate its distributed data model ;-).
>>>>
>>>> Christian Arnault, French manager for LSST computing, thinks that
>>>> some of the tools developed by the Qserv team could help Osman and
>>>> CC-IN2P3. Indeed, the partitioning algorithm developed by Serge
>>>> could be used to partition the DC dataset into a collection of
>>>> chunks. Osman could then load a set of contiguous chunks from this
>>>> collection into a single-node MySQL server.
>>>>
>>>> Do you think this could be achievable soon? Indeed, French
>>>> physicists are interested in studying a representative sample of
>>>> the DC dataset, and this solution would help them a lot.
>>>>
>>>> Thanks,
>>>>
>>>> Fabrice
>>>>
>>>
>>> ########################################################################
>>>
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the QSERV-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
>>
>

--
Dominique Boutigny - CNRS / CC-IN2P3
Now at SLAC National Accelerator Laboratory
Mail : [log in to unmask] - [log in to unmask]
Office : +1 650-926-5759 - Cellular : +1 774-232-0912
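[Editor's note] The "trivial python script" Jacek suggests above might look like the sketch below: it generates one `CREATE TABLE ... SELECT` statement per hash bucket so that a single large table is copied into 16 smaller ones. The table name `Object` and key column `objectId` are assumptions for illustration, not details taken from this thread; the generated SQL would simply be fed to the `mysql` client.

```python
# Sketch only: split one large table into N smaller tables by hashing
# an integer primary key. "Object" and "objectId" are hypothetical
# names; substitute the real table and key column.

N_SUBTABLES = 16

def split_statements(table="Object", key="objectId", n=N_SUBTABLES):
    """Return the SQL statements that copy each hash bucket of `table`
    into its own sub-table (Object_00 .. Object_15)."""
    stmts = []
    for i in range(n):
        sub = f"{table}_{i:02d}"
        stmts.append(
            f"CREATE TABLE {sub} AS "
            f"SELECT * FROM {table} WHERE MOD({key}, {n}) = {i};"
        )
    return stmts

# Print the statements; each keeps its sub-table roughly 1/16 the size
# of the original, which is the regime Jacek describes as fast enough
# for post-processing steps like duplicate removal.
for s in split_statements():
    print(s)
```

Duplicate removal can then be run per sub-table, and the pieces could later be exposed as one logical table (for example via the MERGE engine mentioned earlier in the thread, which requires MyISAM sub-tables).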