Hello,

At JRES2015, I saw a presentation given by Nicolas Muller and Jerome Petazzoni (a Docker core developer):
https://conf-ng.jres.org/2015/planning.html#article_180

I talked at length with Nicolas Muller, and he gave me some interesting information:

1. Kernel Panic:
    * Nicolas confirms that Docker is used on many production infrastructures on bare-metal clusters without problems.
    * Nicolas thinks we should upgrade to a newer kernel (the link from Andy H. also points in this direction: https://conf-ng.jres.org/2015/planning.html#article_180)
    * Nicolas also told me that Docker works better on Debian/Ubuntu (because of the AUFS file system support)

    => Ahmed and Yvan, could we try switching to the latest EPEL kernel on the IN2P3 cluster (on ccqserv125...149 first)? Can I do it myself, or does it require some Puppet magic?
    If this doesn't solve the "Kernel Panic" problem, then I propose implementing Ahmed's proposal:
        i.e. put the containers in VMs that mount the data on local block devices, so that we can restart the VMs ourselves. Could we please install Debian in these VMs?

2. Docker image management on large clusters:
    * Docker Swarm should be enough for our needs as a shmux replacement,
    * Mesos is very powerful and easy to install, but some features are commercial,
    * Docker UCP (https://www.docker.com/universal-control-plane) is Docker's commercial solution for managing clusters of containers; if needed, we could get it cheaply because Docker is looking for its first customers,
    * Kubernetes is a complex solution, and Nicolas doesn't recommend it for simple requirements like ours.

3. Configuration management
    Nicolas recommends creating data (read-only?) containers to store/deploy the Qserv configuration.
    He thinks data containers shouldn't be used for our real data, which we should keep mounting as we do now.
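
    To make the idea concrete, here is a minimal sketch (not a tested setup) of the two docker command lines this would involve: one creates a data container holding the configuration, the other starts Qserv with that volume attached read-only while the real data stays a plain host mount. The image names, paths and the "qserv-config" container name are hypothetical, and the sketch assumes the configuration image ships /bin/true. The code only builds the command lines; it does not run docker.

```python
import shlex


def config_container_commands(config_image, config_path, qserv_image,
                              data_host_dir, data_container_dir,
                              name="qserv-config"):
    """Build (without running) the docker commands for a config data container.

    All arguments are illustrative placeholders, not actual Qserv settings.
    """
    # Data container: declares config_path as a volume and is never started.
    create_cmd = [
        "docker", "create", "--name", name,
        "-v", config_path,
        config_image, "/bin/true",
    ]
    # Qserv container: configuration volume attached read-only (":ro"),
    # real data bind-mounted from the host as it is today.
    run_cmd = [
        "docker", "run", "-d",
        "--volumes-from", f"{name}:ro",
        "-v", f"{data_host_dir}:{data_container_dir}",
        qserv_image,
    ]
    return create_cmd, run_cmd


if __name__ == "__main__":
    # Hypothetical image names and paths, for illustration only.
    for cmd in config_container_commands(
            "example/qserv-config", "/qserv/etc",
            "example/qserv", "/data/qserv", "/qserv/data"):
        print(shlex.join(cmd))
```

    Keeping the configuration in its own container would let us redeploy it without touching the Qserv image or the data mount.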

4. Log management
    Nicolas recommends keeping the logs outside the containers (this is what John has done on the IN2P3 cluster)
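
    One common way to do this is to bind-mount a host directory over the container's log directory. The sketch below assumes one host sub-directory per container; the function name, directory layout, image name and paths are all hypothetical, and I have not checked how John's setup actually does it. It only builds the docker command line and does not run it.

```python
import posixpath
import shlex


def run_with_external_logs(image, container_name, host_log_root,
                           container_log_dir):
    """Build a `docker run` command that writes logs to a host directory.

    The per-container layout host_log_root/<container_name> is an
    assumption made for this example.
    """
    for path in (host_log_root, container_log_dir):
        # Bind mounts need absolute paths on both sides.
        if not posixpath.isabs(path):
            raise ValueError(f"expected an absolute path, got {path!r}")
    host_dir = posixpath.join(host_log_root, container_name)
    return [
        "docker", "run", "-d", "--name", container_name,
        "-v", f"{host_dir}:{container_log_dir}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    print(shlex.join(run_with_external_logs(
        "example/qserv", "qserv-worker", "/var/log/qserv", "/qserv/run/var/log")))
```

    With this layout the logs survive when a container is removed and can be read directly on the host.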

All of this will of course require some additional JIRA tickets.

Regards,
