Dear Georges en Teng,

Sorry, I was travelling and I missed your previous email.
We're now ingesting ~ 100TB dataset inside Qserv and some progress should have been done related to data-loading.
Igor Gaponenko and John Gates are responsible for data ingestion and should have information about current state of the art. Would you like to get in touch with them?

FYI, I do not think we add additional indexes on our chunked tables (only PK), for space reason.

Regards,

Fabrice


De: "George Beckett" <[log in to unmask]>
À: "Fabrice Jammes" <[log in to unmask]>
Cc: "LI Teng" <[log in to unmask]>
Envoyé: Jeudi 7 Décembre 2017 11:37:21
Objet: RE: Qserv ingest/ indexing performance, questions about

Dear Fabrice,

 

Have you had a chance to consider my previous email? Teng has made some further progress w/ MySQL and Qserv, for ingesting and indexing a ~100GB test table, using ‘myisamchk’ as follows:

 

-        Setting a larger sort_buffer_size produces significant performance improvements, up to around 16GB.

-        The choice of indexing mode (recover vs. safe-recover) is significant and, given we have plenty of spare disk space on our server, the ‘recover’ mode also produces good speed-up.

 

We’re also considering using multithreading (parallel-recover), though note this is classed as ‘Beta’, so wonder if that will lead to problems. Teng is best placed to explain further (and correct anything above that I may have misunderstood).

 

Thanks again,

George.

 

Cc: Teng.

 

--

Dr George Beckett,

University of Edinburgh

 

Email: [log in to unmask]

Tel: +44 131 651 3577 (EPCC) /

      +44 131 6688 255 (Astronomy)

Web: http://www.epcc.ed.ac.uk/

 

The University of Edinburgh is a charitable body, registered in

Scotland, with registration number SC005336.

 

From: BECKETT George
Sent: 16 November 2017 16:40
To: [log in to unmask] <[log in to unmask]>
Cc: LI Teng <[log in to unmask]>
Subject: Qserv ingest/ indexing performance, questions about

 

Dear Fabrice,

 

I hope this email finds you well. I’d like to introduce you to a colleague of mine, Teng Li (cc’ed), who has recently joined the LSST:UK team here in Edinburgh.

 

Teng is working on an evaluation of Qserv, which has a couple of different angles. First, we are creating a suite of benchmark queries with which to test performance of Qserv for what will be crucial use cases for UK astronomers. Second, we are investigating the ingest mechanism for Qserv, given that we expect to need to quickly ingest large tables from LSST (and other surveys) during the Commissioning and Operations phases.

 

Teng has set up a couple of Qserv and MySQL instances into which he is ingesting copies of the UKIDSS and SDSS datasets. However, he has found the process to be very slow for the largest tables (especially indexing). A look through the LSST webpages led us to a report that you prepared some time ago on Qserv ingest/ indexing:

 

https://dev.lsstcorp.org/trac/wiki/mysqlLargeTablesAtIn2p3

 

and

 

https://dev.lsstcorp.org/trac/wiki/mysqlLargeTables.

 

Based on this, we have some ideas of how to improve ingest – e.g. looking at influence of sort_buffer_size and other indexing parameters. However, before we go too far with this investigation, we thought it worthwhile to check with you if there is any more up-to-date results, or if the situation has changed in the last four years.

 

Thanks in advance for any advice. I’ll leave it with Teng to answer any more technical questions you may have on our planned experiments, or our setup.

 

Speak to you soon,

George.

 

Cc: Teng.

 

--

Dr George Beckett,

University of Edinburgh

 

Email: [log in to unmask]

Tel: +44 131 651 3577 (EPCC) /

      +44 131 6688 255 (Astronomy)

Web: http://www.epcc.ed.ac.uk/

 

The University of Edinburgh is a charitable body, registered in

Scotland, with registration number SC005336.

 




Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1