Last week, we got memman working with mmap and mlock and saw a huge speed boost (queries finished in less than half the time). The downside was that the mlock call took several seconds and broke worker scheduling: jobs were only being taken from the highest-priority scheduler, and interactive jobs saw a significant delay (2 or 3 minutes) before a worker would get to them. I've been looking into the problem and it looks like I've got it fixed, though I don't entirely understand what is happening.

If things run in the following sequence, the queries run very fast; the main slowdown is waiting for mlock to finish. For example:

[2016-06-10T21:28:34.651Z] ... ScanScheduler::commandStart QI=290477:6487;
[2016-06-10T21:28:34.651Z] ... QueryRunner::runQuery() QI=290477:6487;
[2016-06-10T21:28:34.651Z] ... QI=290477:6487; waitForMemMan begin

Waiting for mlock here - we may have had to wait for a few mlock calls. I suspect mlock for this query took about 3 seconds.

[2016-06-10T21:28:49.579Z] ... QI=290477:6487; waitForMemMan end

At this point the SQL query is sent to mysql, and in < 0.2 seconds it is ready to transmit results:

[2016-06-10T21:28:49.707Z] ... _transmit last=1 QI=290477:6487;
[2016-06-10T21:28:49.707Z] ... _transmit last=1 QI=290477:6487;
[2016-06-10T21:28:49.708Z] ... QI=290477:6487; processing sec=15
[2016-06-10T21:28:49.770Z] ... BlendScheduler::commandFinish QI=290477:6487;

and 0.3 seconds later the query is done.

So, if we go through the trouble of using mmap and mlock, the queries run fast, and most of the time (80-90%) is spent in mlock. These are very simple queries, but that seems strange, as I didn't expect either mmap or mlock to read the file into memory, which is what I thought would be the expensive part. This works just as well for a single query as it does for a group of queries on the same chunk. There's this strange mlock bottleneck, but if you pay the price, everything else is much faster.
It's also worth noting that the system load is much lower when using mlock: without mlock the load would be around 120, but with it the load holds fairly steady around 40.

For the mlock speedup to work, the mlock call must complete before the query is passed to mysql, or the speedup vanishes. The speedup also vanishes if more than one mlock call is running at the same time. And to get the scheduler to work properly, waiting on the mlock call must happen outside the scheduler. I've got code in tickets/DM-6518 that appears to work but needs cleanup and a bit more testing. With it, SELECT COUNT(*) FROM Object; takes about 30 sec and SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2; takes about 30 min, which is much better than the earlier 3.8 min and 1 hr 15 min respectively.

-John

Test results:

Before mlock, 'SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;' would take 1 hr 15 min.

DM-5709 - older, without the fix to let the scheduler work properly, but using mlock.

group
  SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;              3539300     27 min 51.44 sec
  SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;    321080583   31 min 11.31 sec
  select count(*) FROM Object WHERE u_apFluxSigma between 0 and 2.27e-30;   475244843   31 min 43.64 sec
  SELECT COUNT(*) FROM Object;                                                          3 min 50 sec

-----------------------------------------------------------------------------------
DM-6518 - latest - has the fix for the scheduler and uses mlock.
Solo queries
  SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;              27 min 39.02 sec
  SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;     5 min 13.92 sec
  SELECT COUNT(*) FROM Object;                                              31.61 sec

group
  SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;              30 min 56.24 sec
  SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;    17 min 9.31 sec
  select count(*) FROM Object WHERE u_apFluxSigma between 0 and 2.27e-30;   17 min 9.00 sec
  SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 2 AND 3;              30 min 53.80 sec
  SELECT COUNT(*) FROM Object;                                              55.64 sec and 34.50 sec

########################################################################
Use REPLY-ALL to reply to list
To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1