If you follow the Nuxeo blogs, you may have seen that we have been working with Docker for a few months now. Not because it's very trendy, but because we will be using it in the infrastructure of nuxeo.io. In this post I will explain how we see our global infrastructure.
A Platform Designed for Failure
One of the things we learned when reading about building a cloud platform is that the system must be resilient to failure. A host may go down, a process may halt, and so on, but your platform must live with that. If a node of your cluster goes down, the service shouldn't be affected, and the system must react automatically and very quickly to restart the missing services on other nodes.
This is where Docker comes in. Since the kernel is shared by all the containers, a service starts very quickly - there is no need to provision a virtual machine, for instance. Starting a new Nuxeo container only takes about 30 seconds.
When you need to start a lot of containers, you have to find a way to manage them. This means knowing whether they are started, where they are located, and so on. To do that, we found etcd, a key-value registry distributed across a cluster. When we set a key in that registry from one host, it is then accessible from all other hosts in the cluster. Moreover, we can watch for key modifications and set a TTL on a key. This last feature allows us to set up a heartbeat mechanism very quickly: a service keeps refreshing its key, and if it stops doing so the key expires.
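The TTL-based heartbeat can be illustrated with a minimal sketch. It does not talk to etcd: `TTLStore` is a hypothetical in-memory stand-in, and the key name `/services/nuxeo-1/alive` and the 10-second TTL are assumptions made for the example. Time is passed in explicitly so the behaviour is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is a value together with the moment it expires.
type entry struct {
	value   string
	expires time.Time
}

// TTLStore is an in-memory stand-in for a TTL-capable key-value registry.
// It is NOT the etcd API, only an illustration of the idea.
type TTLStore struct {
	mu   sync.Mutex
	data map[string]entry
}

func NewTTLStore() *TTLStore {
	return &TTLStore{data: make(map[string]entry)}
}

// Set writes key with a time-to-live counted from now.
func (s *TTLStore) Set(key, value string, ttl time.Duration, now time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = entry{value: value, expires: now.Add(ttl)}
}

// Get returns the value only if the key exists and has not expired at now.
// Expired keys are dropped on access.
func (s *TTLStore) Get(key string, now time.Time) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.data[key]
	if !ok || !now.Before(e.expires) {
		delete(s.data, key)
		return "", false
	}
	return e.value, true
}

// heartbeatDemo refreshes a key once, then lets it expire.
func heartbeatDemo() (aliveAt12s, aliveAt20s bool) {
	store := NewTTLStore()
	start := time.Now()
	const ttl = 10 * time.Second
	const key = "/services/nuxeo-1/alive" // hypothetical key name

	// The service announces itself, then sends one heartbeat at t+5s.
	store.Set(key, "ok", ttl, start)
	store.Set(key, "ok", ttl, start.Add(5*time.Second))

	// At t+12s the refreshed key (valid until t+15s) is still present.
	_, aliveAt12s = store.Get(key, start.Add(12*time.Second))
	// No further heartbeat: at t+20s the key has expired.
	_, aliveAt20s = store.Get(key, start.Add(20*time.Second))
	return aliveAt12s, aliveAt20s
}

func main() {
	a, b := heartbeatDemo()
	fmt.Println("alive at t+12s:", a)
	fmt.Println("alive at t+20s:", b)
}
```

The point of the design is that a crashed service needs no cleanup step: it simply stops refreshing, and the registry forgets it once the TTL runs out.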
As etcd is part of CoreOS, we looked at that project: a tiny Linux distribution that embeds Docker, etcd and systemd. Systemd allows us to start Docker containers, but we don't use it directly. We use fleet, which can be seen as a "systemd over a cluster" and is also bundled with CoreOS.
With all of this in place, we have a running cluster that is synchronized, and on which we can run services. Each time we start a service, we register it in etcd with its attributes (IP, port). For instance, we have a Docker registry service that holds our private container images.
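As a rough sketch of what registering a service with its attributes could look like, the example below stores an IP and a port under one key per service. The `Registry` map is a hypothetical stand-in for etcd, and the `/services/<name>` key layout, the JSON encoding, and the address `10.0.0.12:5000` are assumptions for illustration, not the layout nuxeo.io actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Service holds the attributes mentioned in the text: an IP and a port.
type Service struct {
	IP   string `json:"ip"`
	Port int    `json:"port"`
}

// Registry is an in-memory stand-in for the cluster-wide key-value store.
type Registry map[string]string

// Register stores the service attributes as JSON under /services/<name>.
// The key layout is an assumption made for this sketch.
func (r Registry) Register(name string, svc Service) error {
	raw, err := json.Marshal(svc)
	if err != nil {
		return err
	}
	r["/services/"+name] = string(raw)
	return nil
}

// Lookup reads the attributes back; ok is false if the service is unknown
// or its stored value cannot be decoded.
func (r Registry) Lookup(name string) (svc Service, ok bool) {
	raw, found := r["/services/"+name]
	if !found {
		return Service{}, false
	}
	if err := json.Unmarshal([]byte(raw), &svc); err != nil {
		return Service{}, false
	}
	return svc, true
}

func main() {
	reg := Registry{}
	err := reg.Register("docker-registry", Service{IP: "10.0.0.12", Port: 5000})
	if err != nil {
		fmt.Println("register failed:", err)
		return
	}
	svc, ok := reg.Lookup("docker-registry")
	fmt.Println(ok, svc.IP, svc.Port)
}
```

Any host in the cluster that can read the registry can then find the service without a static configuration file.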
Data Free Runtime Containers
Remember that we want to be able to deal with failures. If a Nuxeo server goes down, freezes or otherwise misbehaves, we destroy it and restart it with fleet. That means we lose all the data that may be held in the container. We could restart the instance, analyze things and recover the data, but that takes too long and requires an administrator to step in.
In our cluster configuration, every Docker container in the stopped state is destroyed. We treat a container as an execution part of the platform - it can run anywhere in the cluster. This means that the database and the binary manager must be externalized, and the logs as well.
Dynamic Virtual Hosting and Load Balancing
The other part of the cluster is the public-facing part. To serve it, we need to route and balance the requests for a given host to the container that serves that host.
A dynamic proxy doesn't route requests based on a configuration file, but on a route database that can be altered without having to restart the process.
In our cluster we have such a database - it is etcd. Each domain we want to serve will have its own key, and each key will reference an environment id (i.e. a running Nuxeo instance).
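To make the idea of a route database that changes without a restart more concrete, here is a minimal sketch of an in-memory route table kept up to date by change events. The `Event` type, its `"set"`/`"delete"` actions and the `/domains/<host>` key layout are assumptions for this example; a real proxy would receive such events from an etcd watch rather than from direct calls.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Event mimics a change notification from a watched registry.
// Its shape is an assumption for this sketch, not etcd's wire format.
type Event struct {
	Action string // "set" or "delete"
	Key    string // e.g. "/domains/mydomain.nuxeo.io"
	Value  string // environment id, e.g. "NXIO-0001"
}

// RouteTable maps a domain to an environment id and can be updated
// while requests are being resolved concurrently.
type RouteTable struct {
	mu     sync.RWMutex
	routes map[string]string
}

func NewRouteTable() *RouteTable {
	return &RouteTable{routes: make(map[string]string)}
}

// Apply updates the table from one change event; unknown actions are ignored.
func (t *RouteTable) Apply(ev Event) {
	domain := strings.TrimPrefix(ev.Key, "/domains/")
	t.mu.Lock()
	defer t.mu.Unlock()
	switch ev.Action {
	case "set":
		t.routes[domain] = ev.Value
	case "delete":
		delete(t.routes, domain)
	}
}

// Resolve returns the environment id currently serving the domain.
func (t *RouteTable) Resolve(domain string) (string, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	id, ok := t.routes[domain]
	return id, ok
}

func main() {
	table := NewRouteTable()
	key := "/domains/mydomain.nuxeo.io"

	table.Apply(Event{Action: "set", Key: key, Value: "NXIO-0001"})
	id, _ := table.Resolve("mydomain.nuxeo.io")
	fmt.Println("routed to:", id)

	// The route changes while the process keeps running.
	table.Apply(Event{Action: "set", Key: key, Value: "NXIO-0002"})
	id, _ = table.Resolve("mydomain.nuxeo.io")
	fmt.Println("routed to:", id)

	table.Apply(Event{Action: "delete", Key: key})
	_, ok := table.Resolve("mydomain.nuxeo.io")
	fmt.Println("still routed:", ok)
}
```

The read-write lock lets many requests resolve routes in parallel while an occasional update takes the exclusive lock.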
There are already some dynamic proxy implementations, but none of them fit our needs.
- Hipache: stores its data in Redis,
- Strowger: uses some Flynn primitives that we don't want to depend on,
- Active-proxy: was just a POC and hasn't evolved in a long time (meaning several months, which is very long in the Docker ecosystem ;-) ), and
- Boxcars: is based on a configuration file and covers some use cases we don't need.
As none of them met our needs, we decided to develop our own reverse proxy called Gogeta, reusing some basic ideas from several tools we looked at:
- Written in Go: generates a native executable and has some basic primitives to run a proxy server,
- Gets and watches its configuration from etcd, and
- Keep it small and simple (KISS)
Big Infrastructure Picture
In the following picture we can follow how a request reaches a container:
- The user enters http://mydomain.nuxeo.io/.
- It lands on the front load balancer, which randomly sends the request to the Gogeta endpoint on one of the CoreOS hosts.
- Gogeta reads the /domains keys in etcd and learns that it must proxy to the NXIO-0001 container.
- Gogeta gets the properties of NXIO-0001 in etcd, checks that the status is okay and proxies the request.
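The two lookups and the status check in the steps above can be sketched as follows. This is not Gogeta's code: `Snapshot` is a hypothetical in-memory copy of what a proxy might read from etcd, and the status value `"started"`, the address `10.0.0.21:8080` and the error names are assumptions. Only `httputil.NewSingleHostReverseProxy` from the Go standard library does the actual proxying.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httputil"
	"net/url"
)

var (
	ErrUnknownDomain = errors.New("unknown domain")
	ErrUnavailable   = errors.New("environment unavailable")
)

// Env holds the attributes assumed to be stored for each environment.
type Env struct {
	Addr   string // host:port of the container
	Status string // hypothetical values: "started", "stopped"
}

// Snapshot is an in-memory stand-in for the data read from the registry.
type Snapshot struct {
	Domains map[string]string // domain -> environment id
	Envs    map[string]Env    // environment id -> attributes
}

// Resolve performs the two lookups: domain to environment id, then
// environment id to address, and checks the status.
func (s Snapshot) Resolve(host string) (*url.URL, error) {
	envID, ok := s.Domains[host]
	if !ok {
		return nil, ErrUnknownDomain
	}
	env, ok := s.Envs[envID]
	if !ok || env.Status != "started" {
		return nil, ErrUnavailable
	}
	return &url.URL{Scheme: "http", Host: env.Addr}, nil
}

// ServeHTTP proxies the request to the resolved container.
// Note: r.Host may carry a port, which this sketch does not strip.
func (s Snapshot) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	target, err := s.Resolve(r.Host)
	switch {
	case errors.Is(err, ErrUnknownDomain):
		http.Error(w, err.Error(), http.StatusNotFound)
	case err != nil:
		http.Error(w, err.Error(), http.StatusServiceUnavailable)
	default:
		httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
	}
}

func main() {
	snap := Snapshot{
		Domains: map[string]string{"mydomain.nuxeo.io": "NXIO-0001"},
		Envs:    map[string]Env{"NXIO-0001": {Addr: "10.0.0.21:8080", Status: "started"}},
	}
	target, err := snap.Resolve("mydomain.nuxeo.io")
	fmt.Println(target, err)

	_, err = snap.Resolve("other.nuxeo.io")
	fmt.Println(err)
}
```

Because `Snapshot` implements `http.Handler`, it could be served with `http.ListenAndServe(":8080", snap)`; the example only exercises the resolution step so that it runs without any network setup.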
At the bottom of the picture, fleet helps us to start new containers in the cluster.
Docker is a good piece of software for building a fault-tolerant infrastructure. CoreOS integrates it nicely and gives us the missing tools to manage the cluster.
nuxeo.io will soon be open-sourced (some parts of it already are), so you can play with it and give us some feedback. When we think it is ready for testing, we will announce it. Stay tuned!