Given the schedule, less complexity is always better. That alone sort of
tips it to RAID with LVM. It won't be particularly speedy, since more
disks participate in each I/O operation, but it would probably be OK
for qserv purposes. As for reliability, it will be better in the
first 4 years or so; after that the probability of multiple disk failures
increases (you can get better data on that from Lance). My two cents'
worth.

Andy

On Fri, 16 Jan 2015, Salnikov, Andrei A. wrote:

> Hi Jacek,
>
> we could certainly make our tools smarter to use several disks
> for data storage, but it would need some effort. I think even
> without RAID we could still combine them all into one large
> space, either at the controller level or with OS-level tools
> (LVM or ZFS). I have a slight preference for RAID though: all
> disks fail, and we do not want to spend too much time recovering
> data when it happens; having sysadmins rebuild an array is
> certainly less of a distraction (for us). So RAID6 if we want
> more reliability, or RAID5 if we need more space. I'd probably go
> for RAID6; with 8 drives in an array the risk of a second
> disk failure is higher. We could also do RAID6 with all 10 disks and
> then LVM on top of that for system/temp/data.
>
> Cheers,
> Andy
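
The concern about a second disk failure can be made concrete with a small binomial model. This is only a sketch: it assumes disks fail independently and with the same probability over some window (for example a rebuild period or a year), and the value of `p` below is a made-up placeholder, not a measured failure rate for these drives.

```python
from math import comb


def prob_at_least(k, n_disks, p):
    """Probability that at least k of n_disks fail, each independently with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Sum the binomial probabilities of 0 .. k-1 failures, then take the complement.
    fewer_than_k = sum(
        comb(n_disks, i) * p**i * (1 - p) ** (n_disks - i) for i in range(k)
    )
    return 1.0 - fewer_than_k


if __name__ == "__main__":
    p = 0.03  # hypothetical per-disk failure probability, for illustration only
    for n in (8, 10):
        # RAID5 tolerates one failed disk, RAID6 tolerates two.
        print(n, "disks:",
              "P(>=2 failures) =", round(prob_at_least(2, n, p), 5),
              "P(>=3 failures) =", round(prob_at_least(3, n, p), 5))
```

Under these assumptions the chance of at least two failures grows with the number of drives in the array, while the chance of at least three stays well below it, which is the trade-off between RAID5 and RAID6 discussed above.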
>
> Jacek Becla wrote on 2015-01-16:
>> Any opinions about RAID vs JBOD?
>>
>> Options I can think of:
>>
>> 1) LVM. 8 disks, RAIDed, for data. 1 disk for mysql_temp,
>>     1 disk for system and Qserv software etc
>> 2) JBOD. Problem: we need to deal with distributing data
>>     across disks. AndyS: how difficult would it be to extend
>>     the loader so that it shards chunks (say round robin)
>>     across multiple disks on a machine (and do symlink for
>>     each table in mysql_data directory)? Unless Andy can
>>     easily do it in loader, I think I'd rather not go there
>>     now, because it will add another level of complexity and
>>     delay testing.
>> 3) JBOD. We use mysql partitioning to distribute data
>>     across 8 disks. Probably easier than "2", but still,
>>     it is an extra level of complexity and will delay tests.
>> I am tempted to just go with "1" unless Andy can magically
>> enhance the loader easily.
>>
>> Opinions?
>>
>> Jacek
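
A minimal sketch of what option 2 could look like: chunk table files are assigned to disks round robin, and a symlink for each one is placed in the MySQL data directory. This is not the Qserv loader; the function names, file names, and paths are all hypothetical and only illustrate the idea.

```python
from pathlib import Path


def plan_round_robin(chunk_files, disks, data_dir):
    """Return (link, target) pairs, cycling through the disks in order.

    chunk_files: table file names; disks: mount points; data_dir: the
    directory where MySQL expects to find the files.
    """
    if not disks:
        raise ValueError("need at least one disk")
    plan = []
    for i, name in enumerate(sorted(chunk_files)):
        target = Path(disks[i % len(disks)]) / name  # real location of the file
        link = Path(data_dir) / name                 # symlink seen by MySQL
        plan.append((link, target))
    return plan


def apply_plan(plan):
    """Create each symlink in the data directory, replacing any stale link."""
    for link, target in plan:
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            link.unlink()
        link.symlink_to(target)


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    files = [f"Object_{c}.MYD" for c in (100, 101, 102, 103)]
    disks = [f"/data{d}" for d in range(8)]
    for link, target in plan_round_robin(files, disks, "/qserv/mysql_data/LSST"):
        print(link, "->", target)
```

Separating the plan from the step that touches the filesystem keeps the assignment easy to inspect before any links are created.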
>>
>>
>>
>> I am leaning towards JBOD, 8 disks for data
>>
>>
>> On 01/16/2015 04:34 AM, Yvan Calas wrote:
>>> Hi,
>>>
>>>> On 15 Jan 2015, at 00:16, Fabrice Jammes wrote:
>>>>
>>>> Here's what Qserv team would like to have on the cluster, in addition
>>>> to what we have defined previously:
>>>>
>>>> - Scientific Linux on all nodes, in order to get C++11 support
>>>
>>> Do you need SL7 only because of C++11, or is there any other reason? Is
>>> it possible to have full C++11 support on SL6 nodes actually?
>>>
>>> As I already told you, it might take time to install SL7 on servers at
>>> CC-IN2P3 (probably 2 months or more).
>>>
>>>> - 10TB shared storage available for all nodes and able to support a
>>>>   large amount of I/O during data loading
>>>
>>> The features of the new Dell machines are as follows:
>>>
>>> - DELL PowerEdge R620
>>>    + 2 x Intel Xeon E5-2603v2 processors, 1.80 GHz, 4 cores, 10 MB cache,
>>>      6.4 GT/s, 80 W
>>>    + RAM: 16 GB DDR-3 1600 MHz (2 x 8 GB)
>>>    + 10 x 1 TB Nearline SAS disks, 6 Gbps, 7200 rpm, 2.5", hot-plug
>>>    + 1 x RAID card H710p with 1 GB NVRAM
>>>    + 1 x 1 GbE Broadcom® 5720 Base-T card with 4 ports
>>>    + 1 x iDRAC 7 Enterprise card
>>>    + redundant power supply
>>>
>>>> and some questions:
>>>>
>>>> - what will be the disk architecture (which kind of RAID, or something
>>>>   else, or nothing?)
>>>
>>> Since there are 10 TB on each server, we plan to configure them in RAID-6
>>> (2 parity disks - 7.4 TB available), or in RAID-5 (one parity disk - 8.4 TB
>>> available) if you need more space on each node. If you are thinking of
>>> a better RAID configuration for qserv, please let us know ;)
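
The arithmetic behind the RAID-5 versus RAID-6 space trade-off can be sketched as follows. The function computes nominal capacity only (data disks times disk size); the 7.4 TB and 8.4 TB figures quoted above are lower than the nominal 8 TB and 9 TB, and this sketch does not try to reproduce them, since the thread does not say how they were derived. The function name is hypothetical.

```python
def usable_tb(n_disks, disk_tb, parity_disks):
    """Nominal usable capacity of a parity RAID array, in TB."""
    if parity_disks >= n_disks:
        raise ValueError("array needs more disks than parity disks")
    return (n_disks - parity_disks) * disk_tb


if __name__ == "__main__":
    # 10 x 1 TB is the full machine; 8 x 1 TB is the data-only array
    # discussed elsewhere in this thread.
    for label, n in (("all 10 disks", 10), ("8 data disks", 8)):
        print(label,
              "RAID-5:", usable_tb(n, 1.0, 1), "TB,",
              "RAID-6:", usable_tb(n, 1.0, 2), "TB")
```

In either layout, moving from RAID-5 to RAID-6 costs exactly one more disk of nominal capacity in exchange for tolerating a second failure.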
>>>
>>> Note that we plan to install the first 25 machines in the computing
>>> center at the beginning of week 5 (26-27/01/2015).
>>>
>>>> - we don't know a lot about Puppet and would like to know which kind of
>>>>   features it offers (system monitoring, service restart, ...)?
>>>
>>> qserv admins at CC-IN2P3 (mainly myself) will write a puppet module in
>>> order to:
>>>
>>> - deploy the qserv software automatically,
>>> - tune the OS and qserv parameters.
>>>
>>> Moreover, since the software will run as the qserv user (as was the case last
>>> year on the 300+ nodes), I guess that you will be able to restart the
>>> service, change qserv configuration files if needed, etc. using sudo.
>>>
>>> The monitoring will be based mainly on Nagios (probes to define and write) and
>>> collectd/smurf. However, plots generated by smurf will only be
>>> accessible from inside CC-IN2P3. If some extra monitoring is needed, we
>>> will deploy it.
>>>
>>>
>>>> Would it be possible to talk to a Puppet/monitoring expert when these
>>>> kinds of questions occur?
>>>
>>> I am not a Puppet expert, but I can try to answer all of your
>>> questions ;) If I don't know the answer, I will ask Mattieu, and if
>>> really needed you can contact him directly.
>>>
>>> Cheers,
>>>
>>> Yvan
>>>
>>>
>>> ---
>>> Yvan Calas
>>> CC-IN2P3 -- Storage Group
>>> 21 Avenue Pierre de Coubertin
>>> CS70202
>>> F-69627 Villeurbanne Cedex
>>> Tel: +33 4 72 69 41 73
>>>
>>>
>>> ########################################################################
>>> Use REPLY-ALL to reply to list
>>>
>>> To unsubscribe from the QSERV-L list, click the following link:
>>> https://listserv.slac.stanford.edu/cgi-bin/wa?SUBED1=QSERV-L&A=1
>>>