LISTSERV mailing list manager LISTSERV 16.5

Help for QSERV-L Archives


QSERV-L Archives

QSERV-L Archives


QSERV-L@LISTSERV.SLAC.STANFORD.EDU


View:

Message:

[

First

|

Previous

|

Next

|

Last

]

By Topic:

[

First

|

Previous

|

Next

|

Last

]

By Author:

[

First

|

Previous

|

Next

|

Last

]

Font:

Proportional Font

LISTSERV Archives

LISTSERV Archives

QSERV-L Home

QSERV-L Home

QSERV-L  February 2016

QSERV-L February 2016

Subject:

Re: About secondary index size

From:

Serge Monkewitz <[log in to unmask]>

Reply-To:

General discussion for qserv (LSST prototype baseline catalog)

Date:

Thu, 18 Feb 2016 13:12:50 -0800

Content-Type:

text/plain

Parts/Attachments:

Parts/Attachments

text/plain (80 lines)

On Feb 18, 2016, at 12:13 PM, Fabrice Jammes <[log in to unmask]> wrote:
> 
> Hi Serge,
> 
> Fabio might get some SSD or SSD+NVM-express storage to test secondary index performance.
> That's why I'm trying to better understand how secondary index size is computed.
> So, in my current understanding, having chunkID on 4 bytes and objectId on 8 bytes seems reasonable and would lead to a InnoDB table file of 456GB*3 = 1.5TB.
> If this is correct, a 2TB SSD machine would be fine for our tests.

> Serge, would you agree with this proposal?

Assuming the following:

create table mapping (
	objectId bigint not null,
	chunkId int not null,
	primary key (objectId, chunkId)
) engine=innodb row_format=compact;

I get a per record size of (8 + 4 + 5 + 6 + 7) = 30 bytes. Assuming pages are generally only half full, my guess at worst case size is 2.28 TB (removing 2 bytes doesn’t bring it under 2TB). I defer to others on size estimates though, as I have very little InnoDB experience.

Still, my 2cents are that 2TB is already quite nice. Even if my worst case estimate isn’t overly pessimistic, we can test at almost full scale. And we can try out row_format=compressed (or page_compressed=1 with say lz4 compression) and probably get all the way to full scale in 2TB.

Just FYI though - there are significantly bigger NVMe SSDs available. The Intel P3608 is available at 3.2 and 4TB, and Samsung’s PM1725 goes up to 6.4TB in the HHHL form factor. They are unsurprisingly very expensive relative to 2TB offerings.

Cheers,
Serge

> Regards,
> 
> On 02/18/2016 11:50 AM, Jacek Becla wrote:
>> also, smallint/mediumint types are mysql specific and we were
>> trying to avoid using them.
>> 
>> 
>> On 02/18/2016 11:00 AM, Serge Monkewitz wrote:
>>> Chunk ids probably fit into 2 bytes with the current baseline 20k chunks, but chunk IDs are not contiguous integers so I’m not 100% sure. It’s obviously not my call, but personally I would like to have the flexibility to try > 65536 chunks, or to encode spatial binning at finer-than-chunk granularity in the secondary index. Is the storage overhead of an extra 2 bytes per row a deal breaker somehow?
>>> 
>>>> On Feb 18, 2016, at 10:50 AM, Jacek Becla <[log in to unmask]> wrote:
>>>> 
>>>> Fabrice, I think we planned for regular 4 byte int for chunkid.
>>>> 
>>>> But check with others from the group, I am way too busy with the
>>>> overall dm type things right now...
>>>> 
>>>> 
>>>> 
>>>> On 02/18/2016 09:45 AM, Fabrice Jammes wrote:
>>>>> Sorry for nitpicking, but it seems that 8 bytes (a SMALLINT for chunkID
>>>>> and a BIGINT for objectId) would be enough to store one raw:
>>>>> http://dev.mysql.com/doc/refman/5.5/en/integer-types.html
>>>>> 
>>>>> This would lead to an index of:
>>>>> 38 * 10^9 objects * (2+8 bytes) = 380 GB
>>>>> 
>>>>> And, with overhead, 1.2TB.
>>>>> 
>>>>> Can I send these number to Fabio please?
>>>>> 
>>>>> Cheers,
>>>> 
>>>> ######################################################################## 
>>>> Use REPLY-ALL to reply to list
>>>> 
>>>> To unsubscribe from the QSERV-L list, click the following link:
>>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
>>> 
>> 
>> ########################################################################
>> Use REPLY-ALL to reply to list
>> 
>> To unsubscribe from the QSERV-L list, click the following link:
>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1

########################################################################
Use REPLY-ALL to reply to list

To unsubscribe from the QSERV-L list, click the following link:
https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1

Top of Message | Previous Page | Permalink

Advanced Options


Options

Log In

Log In

Get Password

Get Password


Search Archives

Search Archives


Subscribe or Unsubscribe

Subscribe or Unsubscribe


Archives

March 2018
February 2018
January 2018
December 2017
August 2017
December 2016
November 2016
October 2016
September 2016
August 2016
July 2016
June 2016
May 2016
April 2016
March 2016
February 2016
January 2016
December 2015
November 2015
October 2015
September 2015
August 2015
July 2015
June 2015
May 2015
April 2015
March 2015
February 2015
January 2015
December 2014
November 2014
October 2014
September 2014
August 2014
July 2014
June 2014
May 2014
April 2014
March 2014
February 2014
January 2014
December 2013
November 2013
October 2013
September 2013
August 2013
July 2013
June 2013
May 2013
April 2013
March 2013
February 2013
January 2013
December 2012

ATOM RSS1 RSS2



LISTSERV.SLAC.STANFORD.EDU

Secured by F-Secure Anti-Virus CataList Email List Search Powered by the LISTSERV Email List Manager

Privacy Notice, Security Notice and Terms of Use