Getting Started with Graylog on MacOS

Graylog is an amazing open source tool for recording, viewing and analysing application logs. The performance and full-text indexing of the underlying Elasticsearch database mean you probably won’t need any other logging tools across your organisation.

Application developers writing to Graylog may need a local instance to retrieve diagnostic information during development. This post gives some quick instructions for setting up a local logging server using Docker.

Docker manages lightweight operating system containers with isolated process and network namespaces, so each container can have its own process with PID 1 and its own listener on TCP port 80. Unfortunately, Mac OS X’s kernel doesn’t contain the necessary APIs to run a Linux-based container, so we need to use a virtual machine running Linux itself. Boot2docker, installed via Homebrew, comes to our rescue to make the whole process as simple as possible:

brew install Caskroom/cask/virtualbox
brew install Caskroom/cask/boot2docker
brew install docker

Now that all the tools are installed, we need to download and start up the Linux VM:

boot2docker download
boot2docker init
boot2docker start

Another complexity of the docker host being in a virtual machine is that the docker command line interface needs to connect across a virtual network interface. Boot2docker helps us here by telling us the required settings. I put the ‘export’ lines in my ~/.profile so that the docker command will work without any special setup in the future.

boot2docker shellinit
# outputs:
Writing /Users/djb/.boot2docker/certs/boot2docker-vm/ca.pem
Writing /Users/djb/.boot2docker/certs/boot2docker-vm/cert.pem
Writing /Users/djb/.boot2docker/certs/boot2docker-vm/key.pem
    export DOCKER_HOST=tcp://
    export DOCKER_CERT_PATH=/Users/djb/.boot2docker/certs/boot2docker-vm
    export DOCKER_TLS_VERIFY=1

To test your new docker installation you can run the hello-world container:

docker run hello-world

Assuming that went okay, we can now get started with Graylog! Run the following command to download the image from which we’ll create containers. This may take some time as there are a few hundred MB of Linux distribution and Java libraries to fetch.

docker pull graylog2/allinone

You can now create a new container. Each -p argument publishes a TCP port from the container: 9000 for the web interface and 12201 for the input we’ll add shortly.

docker run -t -p 9000:9000 -p 12201:12201 graylog2/allinone

Once the flurry of activity in your console comes to a stop, the container is fully up and running. You can reach the admin page at http://ip-of-server:9000. The IP address is that of your boot2docker virtual machine, which you found earlier using boot2docker shellinit. The default username and password for the web interface are admin and admin.

Once you’re inside the web interface, Graylog will warn you that there are no inputs running. To fix this, browse to http://ip-of-server:9000/system/inputs and add an input of type GELF HTTP running on port 12201. The name of the input isn’t important unless you have plans for hundreds of them, so I’ve unimaginatively called mine GELF HTTP.
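
If you'd rather not copy the IP address by hand, it can be derived from the DOCKER_HOST value that boot2docker shellinit exports. The following is only a minimal sketch: the function name graylog_urls is made up for illustration, the sample address is a placeholder rather than your VM's real one, and the /gelf path is an assumption about where the GELF HTTP input accepts POST requests.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def graylog_urls(docker_host, web_port=9000, gelf_port=12201):
    """Derive the Graylog web and GELF HTTP URLs from a DOCKER_HOST value."""
    # DOCKER_HOST looks like tcp://<vm-ip>:<port>; we only need the host part.
    vm_ip = urlparse(docker_host).hostname
    if not vm_ip:
        raise ValueError("no host found in DOCKER_HOST value: %r" % docker_host)
    web_url = "http://%s:%d" % (vm_ip, web_port)
    # Assumption: the GELF HTTP input accepts POSTs on the /gelf path.
    gelf_url = "http://%s:%d/gelf" % (vm_ip, gelf_port)
    return web_url, gelf_url


# 192.0.2.10 is a documentation placeholder, not a real boot2docker address.
print(graylog_urls("tcp://192.0.2.10:2376"))
```

In practice you would pass in os.environ["DOCKER_HOST"] instead of the placeholder string.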

Now we’ve got a running server configured enough for testing, but no logs! A real application will produce logs of its own but for the purpose of demonstration we can write a script to read in your local computer’s system log and send the messages to Graylog for indexing. Note that you’d never do this in production as Graylog is perfectly capable of using the syslog UDP protocol, which would avoid the need to write any code.

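#!/usr/bin/env python
# Python 2 script: forwards each line of the system log to Graylog as a GELF message.
import urllib2
import json

path = "/var/log/system.log"
# Set this to the address of the GELF HTTP input created above (port 12201 on the boot2docker VM).
url = ''

with open(path) as f:
    lines = f.readlines()

for line in lines:
    message = {}
    message['version'] = '1.1'
    message['host'] = 'localhost'
    message['short_message'] = line.rstrip('\n')
    req = urllib2.Request(url)
    req.add_header('Content-Type', 'application/json')
    # Passing a data argument makes urlopen issue a POST request.
    response = urllib2.urlopen(req, json.dumps(message))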
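
One way to check the message-building step without a running server is to separate it from the network call. The helper below is a hedged sketch along those lines. The name build_gelf_payloads is my own, not something Graylog provides, and skipping blank lines is a choice I've made for the example rather than a Graylog requirement.

```python
import json


def build_gelf_payloads(lines, host="localhost"):
    """Turn raw log lines into JSON-encoded GELF 1.1 messages."""
    payloads = []
    for line in lines:
        text = line.strip()
        if not text:
            # A blank line carries no information, so don't send it.
            continue
        message = {
            "version": "1.1",
            "host": host,
            "short_message": text,
        }
        payloads.append(json.dumps(message))
    return payloads


sample = ["kernel: hello\n", "\n", "sshd: bye\n"]
for payload in build_gelf_payloads(sample):
    print(payload)
```

Each returned string can be passed as the request body in place of the json.dumps(message) call in the script.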

Once your logs are in the server you can start searching them. Try querying for kernel to find all the messages logged by the Mac OS X kernel. A full-text index of our logs is useful, but it doesn’t take advantage of Elasticsearch’s structured storage. A more useful implementation would log application-specific information as fields in the GELF message. For example, with the following code a complete history of logs for a particular client could be retrieved with a single click:

# GELF additional fields are prefixed with an underscore
message['_client_ip'] = ipFromApplicationVariable
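
To make that a little more concrete, here is a small sketch of attaching application-specific fields to a message. It assumes the GELF convention that additional fields are sent with a leading underscore and that _id is reserved. The helper name with_fields and the sample IP address are invented for illustration.

```python
import json


def with_fields(message, **fields):
    """Return a copy of a GELF message with extra fields attached."""
    enriched = dict(message)
    for name, value in fields.items():
        # Add the underscore prefix unless the caller already supplied it.
        key = name if name.startswith("_") else "_" + name
        if key == "_id":
            raise ValueError("'_id' is reserved and cannot be used as an additional field")
        enriched[key] = value
    return enriched


base = {"version": "1.1", "host": "localhost", "short_message": "login ok"}
# 203.0.113.7 is a placeholder address from a documentation range.
message = with_fields(base, client_ip="203.0.113.7")
print(json.dumps(message, sort_keys=True))
```

Once messages like this are indexed, a field query along the lines of client_ip:203.0.113.7 should narrow the search to that one client.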