Hello,
At JRES2015, I attended a presentation given by Nicolas Muller and
Jerome Petazzoni (a Docker core developer):
https://conf-ng.jres.org/2015/planning.html#article_180
I talked at length with Nicolas Muller, and he gave me some
interesting information:
1. Kernel Panic:
* Nicolas confirms that Docker is used on numerous production
infrastructures on bare-metal clusters without problems.
* Nicolas thinks we should upgrade to a newer kernel (the link from Andy H. also points in this
direction: https://conf-ng.jres.org/2015/planning.html#article_180)
* Nicolas also told me that Docker works better on Debian/Ubuntu
(because of the AUFS file system support)
=> Ahmed and Yvan, could we try switching to the latest EPEL
kernel on the IN2P3 cluster (for ccqserv125...149 first)? Can I do
it myself, or does it require some Puppet magic?
If this doesn't solve the "Kernel Panic" problem, then I propose
implementing Ahmed's proposal:
i.e. put the containers in VMs, which mount the data on local block
devices, so that we can restart the VMs ourselves. Could we please
install Debian in these VMs?
2. Docker image management on large clusters:
* Docker Swarm should be enough for our needs as a shmux
replacement,
* Mesos is very powerful and easy to install, but some features
are commercial,
* Docker UCP (https://www.docker.com/universal-control-plane)
is Docker's commercial solution for managing clusters of containers; if
needed, we could get it at a low price because Docker is looking
for its first customers,
* Kubernetes is a complex solution, and Nicolas doesn't recommend
using it for simple requirements like ours.
3. Configuration management
Nicolas recommends creating data (read-only?) containers to
store/deploy the Qserv configuration.
He thinks data containers shouldn't be used for our real data and
that we should keep mounting it as we do now.
4. Log management
Nicolas recommends externalizing the logs outside the
containers (this is what John
has done on the IN2P3 cluster)
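To make points 3 and 4 a bit more concrete, here is a minimal sketch of what the corresponding docker commands could look like, written as a small Python helper that only builds the command lines (it does not run them). Everything specific in it is an assumption made for illustration: the image name, the container name, all paths, and the presence of /bin/true in the image are hypothetical and are not our actual setup. It relies only on standard docker CLI options (docker create, -v, --name, --volumes-from with the :ro suffix).

```python
import shlex

# All names and paths below are hypothetical placeholders, not our real setup.
IMAGE = "example/qserv:latest"
CONFIG_CONTAINER = "qserv-config"
CONFIG_DIR = "/qserv/etc"
HOST_LOG_DIR = "/var/log/qserv"
CONTAINER_LOG_DIR = "/qserv/run/var/log"
HOST_DATA_DIR = "/data/qserv"
CONTAINER_DATA_DIR = "/qserv/data"


def config_container_cmd(name, image, config_dir):
    """Build the argv that creates a data container exposing config_dir
    as a volume. The container never needs to run; it only holds the
    configuration. Assumes the image provides /bin/true."""
    return ["docker", "create", "--name", name, "-v", config_dir,
            image, "/bin/true"]


def qserv_run_cmd(image, config_container, log_mount, data_mount):
    """Build the argv that starts a container which:
    - gets its configuration read-only from the data container (point 3),
    - writes its logs to a host directory (point 4),
    - keeps the real data on a plain host mount, as we do now.
    log_mount and data_mount are (host_dir, container_dir) pairs."""
    return [
        "docker", "run", "-d",
        "--volumes-from", config_container + ":ro",
        "-v", "{}:{}".format(*log_mount),
        "-v", "{}:{}".format(*data_mount),
        image,
    ]


if __name__ == "__main__":
    create = config_container_cmd(CONFIG_CONTAINER, IMAGE, CONFIG_DIR)
    run = qserv_run_cmd(IMAGE, CONFIG_CONTAINER,
                        (HOST_LOG_DIR, CONTAINER_LOG_DIR),
                        (HOST_DATA_DIR, CONTAINER_DATA_DIR))
    # Print shell-quoted commands so they can be reviewed before use.
    print(" ".join(shlex.quote(arg) for arg in create))
    print(" ".join(shlex.quote(arg) for arg in run))
```

The helper prints the commands instead of executing them, so they can be reviewed (or pasted into a ticket) before anything is tried on the cluster.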
All of this will of course require some additional JIRA tickets.
Regards,