Monitoring Nuxeo Docker Container Logs with Logstash, Elasticsearch and Kibana


Wed 02 April 2014 By Laurent Doguin

A while ago I wrote a blog post about CoreOS and monitoring the Nuxeo Platform. It covered gathering the metrics of a Nuxeo Platform server, the associated PostgreSQL database and the host system. The particularity of that setup was that everything ran in Docker containers on a CoreOS host.

Today I will cover another aspect of monitoring: the log files. The goal is to store all the log entries from Nuxeo, Apache and PostgreSQL in Elasticsearch, where they are then easy to browse with Kibana. To forward the logs to Elasticsearch, I will use Logstash.

The first step was to set up Docker containers for Logstash, Elasticsearch and Kibana. This is easy, as a lot of images already exist in the Docker index.

Elasticsearch


Elasticsearch is an open source distributed search engine. This is where all my log entries will be stored. It even has its own official image, so we can use it directly:

docker run -name elasticsearch -h elasticsearch -d -P dockerfile/elasticsearch

This command pulls the image directly from the public Docker index and runs it as a daemon. Since we have not declared any explicit port mapping, the -P option binds the exposed ports to random host ports, and you need to inspect the running container to find out which ones. The docker ps command is enough for that. You should see something like:

CONTAINER ID   IMAGE                             COMMAND                CREATED         STATUS         PORTS                                              NAMES
92ea2c4fdb69   dockerfile/elasticsearch:latest   /opt/elasticsearch/b   6 seconds ago   Up 5 seconds   0.0.0.0:49153->9200/tcp, 0.0.0.0:49154->9300/tcp   elasticsearch

This means that on my Docker host, the Elasticsearch server's port 9200 is reachable through port 49153, and port 9300 through port 49154.
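If you only need one of the mappings, docker port returns it directly, and a quick curl from the Docker host confirms that Elasticsearch answers (the port here matches my setup above):

docker port elasticsearch 9200
curl http://localhost:49153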

Kibana


Now about Kibana. It's an AngularJS application backed by the Elasticsearch team that lets you query and display results from an Elasticsearch instance. It's particularly well suited to log aggregators like Logstash or Flume, among others.

I found a cool Dockerfile and its related blog post, and updated it a little to use the latest Kibana version.

Basically, it installs nginx and the latest Kibana release, and sets them up. Here's the Dockerfile:

# Kibana image started from arcus-io/docker-kibana
FROM base
MAINTAINER ldoguin

# Install nginx and the tools needed to fetch Kibana
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y wget nginx-full unzip

# Download the latest Kibana release and copy it into the nginx web root
RUN (cd /tmp && wget --no-check-certificate http://download.elasticsearch.org/kibana/kibana/kibana-latest.zip -O pkg.zip && unzip pkg.zip && cd kibana-* && cp -rf ./* /usr/share/nginx/www/)

# Keep nginx in the foreground so it remains the container's main process
RUN echo "daemon off;" >> /etc/nginx/nginx.conf

ADD run.sh /usr/local/bin/run
RUN chmod +x /usr/local/bin/run
RUN rm -rf /tmp/*

EXPOSE 80
CMD ["/usr/local/bin/run"]
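Assuming this Dockerfile and the run script below sit in the same directory, the nuxeo/kibana image used later can be built with:

docker build -t nuxeo/kibana .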

This is the run script. It reads two environment variables, ES_HOST and ES_PORT. As you probably guessed, they should contain the Elasticsearch container's host name and the host port mapped to port 9200 (49153 in my current setup). They are used to rewrite the Kibana configuration file with the address Kibana uses to query Elasticsearch. Since the queries are run from your browser, you can't use the private Docker address and port you would usually retrieve by linking containers.

#!/bin/bash
# Point Kibana at the Elasticsearch host and port given by the environment.
# Using | as the sed delimiter avoids escaping the slashes in the URL.
sed -i "s|elasticsearch: \"http://\"+window.location.hostname+\":9200\",|elasticsearch: \"http://$ES_HOST:$ES_PORT\",|g" /usr/share/nginx/www/config.js
nginx -c /etc/nginx/nginx.conf
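After the substitution, with ES_HOST=docker and ES_PORT=49153 for instance, the relevant line of config.js would look like this:

elasticsearch: "http://docker:49153",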

This is the command I use to run the Kibana container:

docker run -d -P -h kibana -name kibana -e ES_HOST=docker -e ES_PORT=49430 nuxeo/kibana

Environment variables are given to the container using the -e option. Now you should have a running Kibana.
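As with Elasticsearch, you can ask Docker which host port nginx's port 80 was mapped to, and point your browser there:

docker port kibana 80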

Logstash


Logstash is also backed by the Elasticsearch team. As their site puts it:

Logstash helps you take logs and other event data from your systems and store them in a central place.

Again, I started from the arcus-io Dockerfile and changed it slightly to use OpenJDK and the latest Logstash version.

# Logstash image started from https://github.com/arcus-io/docker-logstash
FROM ubuntu:quantal
MAINTAINER ldoguin "[email protected]"

RUN echo "deb http://archive.ubuntu.com/ubuntu quantal main universe multiverse" > /etc/apt/sources.list
RUN apt-get update

# Small trick to install fuse despite a container permission issue:
# the first install fails in its postinst script, so we remove that
# script and install again
RUN apt-get -y install fuse || true
RUN rm -rf /var/lib/dpkg/info/fuse.postinst
RUN apt-get -y install fuse

RUN apt-get install -y wget openjdk-7-jdk
RUN apt-get install -y curl

# Fetch Logstash and unpack it under /opt/logstash
RUN curl -O https://download.elasticsearch.org/logstash/logstash/logstash-1.4.0.beta2.tar.gz
RUN tar zxvf logstash-1.4.0.beta2.tar.gz
RUN mv logstash-1.4.0.beta2 /opt/logstash

ADD run.sh /usr/local/bin/run
ADD logstash.conf /opt/logstash/logstash.conf

# 514 for the syslog input; 9292 for the Logstash web UI;
# 9200 and 9300 for the embedded Elasticsearch
EXPOSE 514
EXPOSE 9200
EXPOSE 9292
EXPOSE 9300
CMD ["/usr/local/bin/run"]

And this is the associated run.sh file:

#!/bin/bash
ES_HOST=${ES_HOST:-127.0.0.1}
ES_PORT=${ES_PORT:-9300}

# Point the elasticsearch output at the host and port given by the environment
sed -i "s/elasticsearch { host => docker }/elasticsearch { host => \"$ES_HOST\" port => $ES_PORT }/g" /opt/logstash/logstash.conf

/opt/logstash/bin/logstash -f /opt/logstash/logstash.conf --verbose
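With ES_HOST=docker and ES_PORT=49431, for instance, the output section of logstash.conf ends up as:

elasticsearch { host => "docker" port => 49431 }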

I set up my own Logstash configuration to handle log entries coming from PostgreSQL, the Nuxeo Platform and Apache.
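I won't reproduce the exact file here, but a minimal sketch could look like the following. The file paths match the volumes shared by the logsData container described later in this post, while the grok and multiline patterns are illustrative assumptions rather than my exact filters:

input {
  # These paths live in the volumes shared by the logsData container
  file { path => "/var/log/apache2/access.log" type => "apache" }
  file { path => "/var/log/postgresql/*.log" type => "postgresql" }
  file { path => "/var/log/nuxeo/*.log" type => "nuxeo" }
}

filter {
  # Apache access logs follow a well-known format grok already knows
  if [type] == "apache" {
    grok { match => [ "message", "%{COMBINEDAPACHELOG}" ] }
  }
  # Attach Java stack trace lines to the log entry that precedes them
  if [type] == "nuxeo" {
    multiline { pattern => "^%{TIMESTAMP_ISO8601}" negate => true what => "previous" }
  }
}

output {
  # Placeholder rewritten by run.sh at container startup
  elasticsearch { host => docker }
}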

With the Elasticsearch settings taken care of by run.sh, I can run the container like this:

docker run -d -P -volumes-from nuxeoserver -e ES_PORT=49431 -e ES_HOST=docker -name logstash nuxeo/logstash

The -volumes-from option tells our container to use the volumes defined in the nuxeoserver container. This implies I already have a nuxeoserver container running with some defined volumes: the folders containing the logs I am interested in.

This is the command I used to run my nuxeoserver container:

docker run -h nuxeoserver -P -d -volumes-from logsData -name nuxeoserver nuxeo/nuxeo

The Nuxeo container references the volumes of another container called logsData. Here's how I run it:

docker run -v /var/log/apache2 -v /var/log/postgresql/ -v /var/log/nuxeo/ -name logsData busybox true

This is a data-only container: all it does is run with predefined volumes and exit immediately. You can still use it thanks to the -volumes-from option, which makes it easy to share volumes between containers or to export the data stored in the data-only container. In a production environment, for instance, I would probably have added the Nuxeo and PostgreSQL data directories as well.
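For the export case, a hypothetical one-shot container sharing the same volumes could archive the logs to the current directory on the host:

docker run -rm -volumes-from logsData -v $(pwd):/backup busybox tar cvf /backup/logs.tar /var/log/nuxeo /var/log/apache2 /var/log/postgresql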

If you want to learn more about data-only containers, I suggest you read this great post from Michael Crosby.

Anyway, now everything is set up. If you go to Kibana, you should see the default welcome page, with a link to their default Logstash dashboard. Click on it and you'll see your log events piling up.

[Screenshot: the default Logstash dashboard in Kibana]


Category: Product & Development
Tagged: Docker, Elasticsearch, How to