CoreOS and Nuxeo: How We Built nuxeo.io


Mon 07 April 2014 By Damien Metzler

If you follow the Nuxeo blogs, you may have seen that we have been working with Docker for a few months now. Not because it's very trendy, but because we will be using it in the infrastructure of nuxeo.io. In this post I will explain how we see our global infrastructure.

A Platform Designed for Failure


One of the things we learned when reading about building a cloud platform is that the system must be resilient to failure. A host may go down or a process may halt, and the platform must live with that. If a node of the cluster goes down, the service shouldn't be affected, and the system must react automatically, and very quickly, by rescheduling the missing services on other nodes.

This is where Docker comes in. Since all the containers share the host kernel, a service starts very quickly: there is no need to provision a virtual machine, for instance. Starting a new Nuxeo container takes only about 30 seconds.

Cluster Management


When you need to start a lot of containers, you have to find a way to manage them. This means knowing whether they are started, where they are located, and so on. To do that, we found etcd, a registry distributed across a cluster. When we set a key in that registry from one host, it is then accessible from all other hosts in the cluster. Moreover, we can watch for key modifications and set a TTL on a key. This last feature allows us to set up a heartbeat mechanism very quickly.
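To make the TTL-based heartbeat idea concrete, here is a minimal sketch in Python. It does not talk to etcd: `TTLRegistry`, `FakeClock`, `heartbeat` and the `/services/nuxeo-1` key are hypothetical names invented for this illustration, and the registry is a plain in-memory model of "a key that disappears unless it is rewritten in time".

```python
import time


class TTLRegistry:
    """In-memory stand-in for a key-value registry with per-key TTLs.

    This only models the behaviour described above; it is not etcd's API.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def set(self, key, value, ttl):
        """Store a value that disappears `ttl` seconds from now."""
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the value, or None if the key is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return value


class FakeClock:
    """Manually advanced clock so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def heartbeat(registry, key, value, ttl):
    """One heartbeat: rewrite the key so that its TTL starts over."""
    registry.set(key, value, ttl)


clock = FakeClock()
registry = TTLRegistry(clock)

heartbeat(registry, "/services/nuxeo-1", "alive", ttl=10)
clock.advance(6)
alive_before_refresh = registry.get("/services/nuxeo-1")   # still within the TTL
heartbeat(registry, "/services/nuxeo-1", "alive", ttl=10)  # refresh the key
clock.advance(6)
alive_after_refresh = registry.get("/services/nuxeo-1")    # within the renewed TTL
clock.advance(6)                                           # no heartbeat this time
alive_after_silence = registry.get("/services/nuxeo-1")    # the key has expired
```

A service that keeps calling `heartbeat` stays visible, while one that stops simply drops out of the registry once its TTL elapses, which is the signal a supervisor can react to.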

As etcd is part of CoreOS, we looked at that distribution, a tiny Linux distribution that embeds Docker, etcd and systemd. Systemd allows us to start Docker containers, but we don't use it directly. We use fleet, which can be seen as "systemd over a cluster" and is also bundled with CoreOS.

With all of this, we have a synchronized running cluster on which we can run services. Each time we start a service, we register it in etcd with its attributes (IP, port). For instance, we have a Docker registry service that holds our private container images.
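As an illustration of what registering a service could look like, the sketch below builds (without sending) an HTTP request against etcd's v2 keys API, assuming an etcd node listening on its default client port of the time, 4001. The `/services/<name>` key layout, the JSON value format, the example address and the `build_register_request` helper are assumptions made for this example, not a description of the actual nuxeo.io code.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ETCD_URL = "http://127.0.0.1:4001"  # assumed local etcd endpoint


def build_register_request(name, ip, port, ttl):
    """Build (but do not send) a PUT that stores a service's attributes.

    The key layout and the JSON value are hypothetical choices.
    """
    value = json.dumps({"ip": ip, "port": port}, sort_keys=True)
    # The etcd v2 keys API takes form-encoded `value` and `ttl` fields.
    body = urlencode({"value": value, "ttl": ttl}).encode("ascii")
    return Request(
        url=f"{ETCD_URL}/v2/keys/services/{name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_register_request("docker-registry", "10.0.0.12", 5000, ttl=30)
# Passing `request` to urllib.request.urlopen would send it to a running etcd.
```

Giving the key a TTL means the registration doubles as a heartbeat: the service has to re-register periodically, or its entry disappears.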

Data Free Runtime Containers


Remember that we want to be able to deal with failures. If a Nuxeo server goes down, freezes or otherwise misbehaves, we destroy it and restart it with fleet. That means we lose all the data that may be held in the container. We could restart the instance, analyze things and get the data back, but that takes too long and requires an administrator to step in.

In our cluster configuration, every Docker container in the stopped state is destroyed. We treat a container as an execution unit of the platform: it can run anywhere in the cluster. This means that the database and the binary manager must be externalized, and the logs as well.

Dynamic Virtual Hosting and Load Balancing


The other part of the cluster is the public facing part. In order to achieve that we need to route and balance the requests for a given host to the proper container serving the request.

A dynamic proxy doesn't route requests based on a configuration file, but on a route database that can be altered without having to restart the process.

In our cluster we have such a database - it is etcd. Each domain we want to serve will have its own key, and each key will reference an environment id (i.e. a running Nuxeo instance).
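The lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual proxy implementation: a plain dictionary stands in for etcd, and the key layout (`/domains/...`, `/envs/.../backends`), the addresses and the `resolve` function are all hypothetical. Round-robin is used here as one simple way to balance requests.

```python
# Hypothetical snapshot of the route database; a dict stands in for etcd.
routes = {
    "/domains/acme.example.com": "env-42",
    "/envs/env-42/backends": ["10.0.0.21:8080", "10.0.0.22:8080"],
}

_next_index = {}  # environment id -> number of requests routed so far


def resolve(host, routes):
    """Map a request's Host header to one backend address, or None."""
    env_id = routes.get(f"/domains/{host}")
    if env_id is None:
        return None  # unknown domain
    backends = routes.get(f"/envs/{env_id}/backends", [])
    if not backends:
        return None  # environment registered but nothing is running
    # Round-robin over whatever backends are registered right now.
    index = _next_index.get(env_id, 0)
    _next_index[env_id] = index + 1
    return backends[index % len(backends)]


first = resolve("acme.example.com", routes)
second = resolve("acme.example.com", routes)

# The route database changes while the "proxy" keeps running: no restart needed.
routes["/envs/env-42/backends"] = ["10.0.0.30:8080"]
third = resolve("acme.example.com", routes)

unknown = resolve("other.example.com", routes)
```

Because every request reads the current state of the route database, a container that is destroyed and restarted elsewhere becomes reachable again as soon as its new address is registered.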

There are already some dynamic proxy implementations, but none of them fit our needs. We therefore decided to develop our own reverse proxy, called Gogeta, reusing some basic ideas from several tools we looked at.

Big Infrastructure Picture


In the following picture we can follow how a request is issued to a container:


At the bottom, fleet helps us start new containers in the cluster.

Conclusion


Docker is a good piece of software for building a fault-tolerant infrastructure. CoreOS integrates it nicely and provides the tools we were missing to manage the cluster.

nuxeo.io will soon be open-sourced (some parts of it already are), so you can play with it and give us some feedback. We will announce it when we think it is ready for testing. Stay tuned!


Category: Product & Development
Tagged: CoreOS, Docker, Insight