QSERV-L@LISTSERV.SLAC.STANFORD.EDU


Subject: worker scheduling with mmap and mlock
From: John Gates
Reply-To: General discussion for qserv (LSST prototype baseline catalog)
Date: Fri, 10 Jun 2016 15:49:17 -0700


Last week, we got memman working with mmap and mlock and there was a 
huge speed boost (done in less than half the time). The downside was the 
mlock call took several seconds and broke the worker scheduling. Jobs 
were only being taken from the highest priority scheduler and 
interactive jobs had a significant delay before the worker would get to 
them (2 or 3 minutes). So I've been looking into the problem and it 
looks like I've got it fixed. I don't entirely understand what is 
happening, though.

If things run in the following sequence, the queries run very fast. The 
main slowdown is waiting for mlock to finish. For example:

[2016-06-10T21:28:34.651Z] ... ScanScheduler::commandStart QI=290477:6487;
[2016-06-10T21:28:34.651Z] ... QueryRunner::runQuery() QI=290477:6487;
[2016-06-10T21:28:34.651Z] ... QI=290477:6487; waitForMemMan begin
        Waiting for mlock here - we may have had to wait for a few mlock 
calls. I suspect mlock for this function took about 3 seconds.
[2016-06-10T21:28:49.579Z] ... QI=290477:6487; waitForMemMan end
        At this point, the SQL query is sent to mysql, and in < 0.2 
seconds it's ready to transmit results.
[2016-06-10T21:28:49.707Z] ... _transmit last=1 QI=290477:6487;
[2016-06-10T21:28:49.707Z] ... _transmit last=1 QI=290477:6487;
[2016-06-10T21:28:49.708Z] ... QI=290477:6487; processing sec=15
[2016-06-10T21:28:49.770Z] ... BlendScheduler::commandFinish QI=290477:6487;
        And about 0.2 seconds after the wait ended, the query is done.
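
The timestamps in the excerpt above can be checked with a few lines of Python. This is only a sketch for reading the log, not anything from the qserv code; the event labels in the dictionary are shorthand chosen here, and the timestamps are copied from the log lines.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; the keys are informal labels.
EVENTS = {
    "commandStart": "2016-06-10T21:28:34.651Z",
    "waitForMemMan begin": "2016-06-10T21:28:34.651Z",
    "waitForMemMan end": "2016-06-10T21:28:49.579Z",
    "first _transmit": "2016-06-10T21:28:49.707Z",
    "commandFinish": "2016-06-10T21:28:49.770Z",
}


def parse(ts):
    """Parse a log timestamp such as 2016-06-10T21:28:34.651Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def seconds_between(start, end):
    """Elapsed seconds between two labelled events."""
    return (parse(EVENTS[end]) - parse(EVENTS[start])).total_seconds()


def summarize():
    wait = seconds_between("waitForMemMan begin", "waitForMemMan end")
    total = seconds_between("commandStart", "commandFinish")
    return {
        "wait": wait,
        "to_transmit": seconds_between("waitForMemMan end", "first _transmit"),
        "after_wait": seconds_between("waitForMemMan end", "commandFinish"),
        "total": total,
        "wait_fraction": wait / total,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name:>14}: {value:.3f}")
```

For this one task, the wait accounts for 14.928 s of the 15.119 s total, and the query finishes about 0.19 s after the wait ends.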

So, if we go through the trouble of using mmap and mlock, the queries 
run fast and most of the time (80-90%) is spent in mlock. These are very 
simple queries, but that still seems strange, as neither mmap nor mlock 
is expected to read the file into memory, which is what I thought was 
the expensive part. This works just as well for a single query as it 
does for a group of queries on the same chunk. There's this strange 
mlock bottleneck, but if you pay the price, everything else is much 
faster. It's also worth noting that the system load is really low when 
using mlock: without mlock the system load would be around 120, but 
with it the load would be a fairly steady 40.
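
For readers who have not used the two calls together, here is a minimal Python sketch of "mmap a file, then mlock the mapping". It is not the memman code: map_and_lock and demo are hypothetical names, the libc call goes through ctypes, and it assumes a Linux-like system. One detail that may bear on the timing above: a successful mlock guarantees the locked pages are resident in RAM when it returns, so the call itself can be where the file data gets pulled in.

```python
import ctypes
import ctypes.util
import mmap
import os
import tempfile


def map_and_lock(path):
    """Map `path` into memory and try to mlock the whole mapping.

    Returns (mapping, locked). `locked` is False if mlock failed, for
    example because RLIMIT_MEMLOCK is too small for the file.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int

    # Opened read/write only so ctypes can take the mapping's address;
    # a C or C++ caller could map the file read-only instead.
    with open(path, "r+b") as f:
        size = os.fstat(f.fileno()).st_size
        mapping = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_WRITE)

    view = (ctypes.c_char * size).from_buffer(mapping)
    try:
        rc = libc.mlock(ctypes.addressof(view), size)
    finally:
        del view  # release the buffer export so mapping.close() works later
    return mapping, rc == 0


def demo():
    """Write one page of data to a temp file, then map and lock it."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * mmap.PAGESIZE)
    try:
        mapping, locked = map_and_lock(tmp.name)
        first = mapping[:4]
        size = len(mapping)
        mapping.close()  # unmapping also drops the lock
    finally:
        os.unlink(tmp.name)
    return size, first, locked


if __name__ == "__main__":
    print(demo())
```

Timing the mlock call separately from the first read of the mapping is one way to see where the cost lands on a given machine.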

The mlock call must complete before the query is passed to mysql, or 
the speedup vanishes. The speedup also vanishes if more than one mlock 
call is running at the same time. To get the scheduler to work properly, 
waiting for the mlock call must happen outside the scheduler.
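
The three constraints above (mlock finishes before the query goes to mysql, only one mlock runs at a time, and the wait happens outside the scheduler) could be arranged roughly as follows. This is a minimal Python sketch of the idea, not the code in tickets/DM-6518: MemLockQueue, run_task, fake_mlock and fake_query are hypothetical names, and the short sleep stands in for the real mlock call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class MemLockQueue:
    """Runs lock requests one at a time on a single dedicated thread."""

    def __init__(self, lock_fn):
        self._lock_fn = lock_fn
        self._pool = ThreadPoolExecutor(max_workers=1)

    def request(self, task_id):
        # Returns immediately with a future, so the caller never blocks here.
        return self._pool.submit(self._lock_fn, task_id)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def run_task(lock_future, run_query, task_id):
    # The wait happens in the task's own thread, not in the scheduler loop.
    lock_future.result()
    run_query(task_id)


def demo(num_tasks=4):
    events = []  # (kind, task_id) in the order things happened
    state = {"active": 0, "max_active": 0}
    guard = threading.Lock()

    def fake_mlock(task_id):
        with guard:
            state["active"] += 1
            state["max_active"] = max(state["max_active"], state["active"])
        time.sleep(0.01)  # stand-in for the multi-second mlock call
        with guard:
            state["active"] -= 1
            events.append(("locked", task_id))

    def fake_query(task_id):
        with guard:
            events.append(("query", task_id))

    locker = MemLockQueue(fake_mlock)
    threads = []
    for task_id in range(num_tasks):  # plays the role of the scheduler loop
        future = locker.request(task_id)
        t = threading.Thread(target=run_task, args=(future, fake_query, task_id))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    locker.shutdown()
    return events, state["max_active"]


if __name__ == "__main__":
    print(demo())
```

The single-worker pool is just one way to serialize the lock calls; a mutex around the call would give the same one-at-a-time behavior.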

I've got code in tickets/DM-6518 that appears to work but needs cleanup 
and a bit more testing. Even so, SELECT COUNT(*) FROM Object; takes 
about 30 sec and SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 
AND 2; takes about 30 min, which is much better than 3.8 min and 
1 hr 15 min, respectively.
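
To put numbers on that comparison, here is a quick sketch of the arithmetic using the rounded times quoted above (the speedup helper and the TIMES table are just for illustration):

```python
def speedup(before_sec, after_sec):
    """How many times faster the new run is than the old one."""
    return before_sec / after_sec


# Rounded figures quoted in the text, converted to seconds: (before, after).
TIMES = {
    "SELECT COUNT(*) FROM Object;": (3.8 * 60, 30),
    "SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;": (75 * 60, 30 * 60),
}

if __name__ == "__main__":
    for query, (before, after) in TIMES.items():
        print(f"{speedup(before, after):.1f}x  {query}")
```

That works out to roughly 7.6x for the Object count and 2.5x for the Source filter.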


-John


And test results:

Before mlock, 'SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 
2;' would take 1 hr 15 min.

DM-5709 - older, without the fix to let the scheduler work properly, but 
using mlock.
Group queries (query, result, elapsed time)
SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;
    3539300      27 min 51.44 sec
SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;
    321080583    31 min 11.31 sec
select count(*) FROM Object WHERE u_apFluxSigma between 0 and 2.27e-30;
    475244843    31 min 43.64 sec
SELECT COUNT(*) FROM Object;
    3 min 50 sec

-----------------------------------------------------------------------------------
DM-6518 - latest - has fix for scheduler and uses mlock.
Solo queries
SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;
    27 min 39.02 sec
SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;
    5 min 13.92 sec
SELECT COUNT(*) FROM Object;
    31.61 sec

Group queries
SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 1 AND 2;
    30 min 56.24 sec
SELECT count(*) from Object WHERE u_apFluxSigma between 0 and 2.2e-30;
    17 min 9.31 sec
select count(*) FROM Object WHERE u_apFluxSigma between 0 and 2.27e-30;
    17 min 9.00 sec
SELECT COUNT(*) FROM Source WHERE flux_sinc BETWEEN 2 AND 3;
    30 min 53.80 sec
SELECT COUNT(*) FROM Object;
    55.64 sec and 34.50 sec
