Visualizing POCSAG telegrams with collectd, Logstash, InfluxDB and Grafana

As you may know, our SMS alerting solution ZABOS captures POCSAG telegrams and notifies the assigned fire departments of new missions by sending plain old SMS notifications. Alerting by SMS is only an additional service; the firefighters still use their proprietary hardware. To visualize incoming POCSAG telegrams and outgoing SMS notifications, we have been using the combination of InfluxDB and Grafana for about five months with great success. We have already discussed the basic installation and configuration of InfluxDB and Grafana elsewhere.

In this blog post we want to share with you how we parse the log messages and how they are visualized in Grafana.

Architectural overview

The ZABOS hardware appliances are located near the broadcasting towers so that they receive a clear signal. ZABOS collects incoming POCSAG messages using a DVB-T stick. Because of regulations, a direct connection to the cell tower is prohibited. A more detailed description can be found in a previous blog post (German only). Due to technical constraints, we only receive the appliances' logs during the nightly backups, when they are securely transferred to a central server.

After the backup has been copied, we forward the log file to a Logstash instance using logstash-forwarder. The Logstash instance then analyzes the data, extracts the metrics and forwards them to InfluxDB through an output plugin.

Bulk importing statistical data

In our original setup we used statsd to push the analyzed data into InfluxDB. While this makes perfect sense for live metrics, it has a serious drawback for bulk imports: the statsd protocol has no way to set a custom timestamp on a metric. Every metric pushed from statsd to InfluxDB is stamped with the current time.
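To make the limitation concrete, here is a minimal Python sketch of how a statsd counter line is put together. The metric name zabos.sms.sent is made up for illustration and is not taken from our setup.

```python
def statsd_line(name: str, value: int, metric_type: str = "c") -> str:
    """Format a metric in the statsd line format: <name>:<value>|<type>."""
    # The line carries a name, a value and a type, but no timestamp field,
    # so the receiving side can only stamp the metric with its arrival time.
    return f"{name}:{value}|{metric_type}"


# Hypothetical metric name, for illustration only.
print(statsd_line("zabos.sms.sent", 12))
```

Whatever moment the log entry describes, the line above has no place to carry it.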

As a result, all metrics from a bulk import covering the last 24 hours were compressed into a timeframe of a few minutes. Instead of showing the POCSAG and SMS messages distributed over the whole day, Grafana displayed them in a tiny window shortly after the backup had run.
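The effect can be reproduced with a small Python sketch. The sample data is invented: one alarm every three hours on a single day, imported within a few seconds after an assumed backup time of 02:00 on the following night.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def per_hour(timestamps):
    """Count events per one-hour bucket."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


# Invented sample: one alarm every 3 hours during a single day.
day = datetime(2015, 1, 1, tzinfo=timezone.utc)
event_times = [day + timedelta(hours=h) for h in range(0, 24, 3)]

# Assumed bulk import: all events arrive within seconds after 02:00.
import_start = day + timedelta(days=1, hours=2)
arrival_times = [
    import_start + timedelta(seconds=i) for i in range(len(event_times))
]

by_event_time = per_hour(event_times)      # what we want to see
by_arrival_time = per_hour(arrival_times)  # what a timestamp-less protocol gives

print(len(by_event_time), "buckets when grouped by event time")
print(len(by_arrival_time), "bucket(s) when grouped by arrival time")
```

Grouped by event time, the alarms spread over the day; grouped by arrival time, they all collapse into the hour of the import.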

After checking the options, we decided to switch from statsd to collectd, which allows you to provide a custom timestamp. Our Logstash input filter assigns the timestamp that is present in each log message. A nice side effect of using collectd is InfluxDB’s native support: we no longer need the statsd node.js wrapper. Logstash can send the metrics directly to InfluxDB’s collectd backend through the graphite output filter.
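As a rough illustration of what a timestamped metric looks like on the wire, the following Python sketch builds a line in the Graphite plaintext format (path, value, Unix timestamp). The log timestamp format, the assumption that it is in UTC, and the metric name are all hypothetical.

```python
from datetime import datetime, timezone


def to_epoch(log_timestamp: str) -> int:
    """Convert a log timestamp to Unix seconds.

    Assumes the (hypothetical) format 'YYYY-MM-DD HH:MM:SS' in UTC.
    """
    parsed = datetime.strptime(log_timestamp, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def graphite_line(path: str, value: int, epoch: int) -> str:
    """Format a metric as a Graphite plaintext line: '<path> <value> <epoch>'."""
    return f"{path} {value} {epoch}\n"


# Hypothetical metric name and log timestamp, for illustration only.
line = graphite_line("zabos.sms.sent", 12, to_epoch("2015-01-01 00:00:00"))
print(line, end="")
```

Unlike the statsd line format, the third field lets the sender state when the event actually happened.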

Parsing the log files with Logstash

The log file we receive during the backup contains a lot of information that is required for support cases. It also contains the Radio Identification Code of each incoming POCSAG telegram and the number of outgoing SMS messages:

  1. An incoming POCSAG telegram is logged like
  2. A batch of outgoing SMS looks like

Both log message formats can be easily converted into structured data with the help of a custom Logstash Grok filter:
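A Grok pattern is essentially a regular expression with named captures, so the idea can be sketched in Python. Note that the two log line formats below are invented purely for this sketch and are not the real ZABOS log format; the field names timestamp, ric and count are assumptions as well.

```python
import re
from typing import Optional

# Hypothetical log formats, invented for this sketch only:
#   2015-01-01 12:00:00 POCSAG received RIC=1234567
#   2015-01-01 12:00:05 SMS sent count=12
TIMESTAMP = r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
POCSAG_RE = re.compile(rf"^{TIMESTAMP} POCSAG received RIC=(?P<ric>\d+)$")
SMS_RE = re.compile(rf"^{TIMESTAMP} SMS sent count=(?P<count>\d+)$")


def parse_line(line: str) -> Optional[dict]:
    """Turn one log line into structured data, or None if nothing matches."""
    for alarm_type, pattern in (("pocsag", POCSAG_RE), ("sms", SMS_RE)):
        match = pattern.match(line.strip())
        if match:
            # Named groups become fields, much like Grok captures.
            return {"alarm_type": alarm_type, **match.groupdict()}
    return None


print(parse_line("2015-01-01 12:00:00 POCSAG received RIC=1234567"))
print(parse_line("2015-01-01 12:00:05 SMS sent count=12"))
```

Lines that match neither pattern are simply skipped, which suits a log file that mostly contains unrelated support information.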

As mentioned previously, we use the graphite output to forward the data to InfluxDB. Depending on the alarm_type, we put the metric into a different metric slot:
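The routing idea can be sketched in a few lines of Python. Only the alarm_type field comes from our setup; its values (pocsag, sms), the metric paths and the other event fields are assumptions made for this example.

```python
from typing import Optional

# Hypothetical mapping from alarm_type to a Graphite-style metric path.
METRIC_PATHS = {
    "pocsag": "zabos.pocsag.incoming",
    "sms": "zabos.sms.outgoing",
}


def route(event: dict) -> Optional[str]:
    """Build one '<path> <value> <epoch>' line for a parsed event.

    Returns None for events whose alarm_type has no metric slot.
    """
    path = METRIC_PATHS.get(event["alarm_type"])
    if path is None:
        return None
    return f"{path} {event['count']} {event['epoch']}"


# Invented events, for illustration only.
print(route({"alarm_type": "sms", "count": 12, "epoch": 1420070400}))
print(route({"alarm_type": "pocsag", "count": 1, "epoch": 1420070400}))
```

Keeping the mapping in one table makes it easy to add another alarm type later without touching the routing logic.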

Visualizing the activity with Grafana

Thanks to Grafana’s auto-detection of metrics, it was easy to build the graphs for outgoing SMS and incoming POCSAG telegrams.

Outgoing SMS messages during fire practice alarms

Incoming POCSAG telegrams during fire practice alarms

Incoming POCSAG alarms distributed over the whole day. The longer bars are POCSAG beacons, which are sent every 2 minutes.

TL;DR

In this blog post we described our approach to collecting metrics from a bulk-imported log file and how we visualize them using InfluxDB and Grafana.