In an earlier post, we covered basics of setting up CollectD along with InfluxDB and Grafana to monitor system metrics. Real world though has more complex use cases and we are going to tackle one today. In the process we are going to touch on configuring custom plugin, custom types in CollectD DB and then pushing this data to InfluxDB for Grafana to consume. It is highly recommended that you digest earlier post and keep setup ready to play along but brave souls can continue at their own risk
Use case:
Let’s pick up a sample hypothetical application which manages the restaurant orders. Each order is entered with a unique order number and total order amount in a DB table. The requirement is that we need to plot total orders placed per minute. So we should eventually see a trend of order amount over period of a time.
Setup
I am going to use one of collected agent boxes from setup outlined in previous blog post. Let’s first install MySQL there (Available as MariaDB package in CentOS 7)
sudo yum install mariadb-server
For connecting to MySQL you can use usual command. Password is going to be blank since this is a fresh installation, so just hit enter when prompted for password:
Let’s create a Database and create a table to store the data we need:
create database hotel_app;
use hotel_app;
create table orders (order_id bigint auto_increment primary key, order_amount bigint, time TIMESTAMP);
So our table stores order amount and time when it was processed and maps it to a order ID. Now let’s setup a simple cron which will keep adding random entries to this table which we can consume later.
* * * * * sudo mysql -u root < /home/vagrant/test.sql
I added following SQL line in test.sql 5 times so that I get more random entries, you can add as per your taste:
INSERT into hotel_app.orders set order_amount = FLOOR(RAND() * 4001) + 100;
Configuration: CollectD
We have now setup our source of data so let’s get on to querying it and passing it via CollectD. We want to query for order amounts for last one minute and pass them on, so the SQL query looks like:
select order_id, order_amount from orders where time >= DATE_SUB(NOW(), INTERVAL 60 SECOND);
Before we configure the database query in CollectD plugin, do verify that logfile plugin with following configuration is in place. This plugin will enable us to get log data in case we run into any errors. This config will go in /opt/collectd/etc/collectd.conf
LoadPlugin logfile
<Plugin "logfile">
LogLevel "notice"
File "/var/log/collectd.log"
Timestamp true
</Plugin>
In order for CollectD to connect to a database and query we need dbi plugin - which is a short form for Database Interface. Database interface allows us to connect to database in a neutral manner and gets the data. Let’s quickly look at config and run through details:
<Plugin dbi>
<Query "amount_per_order">
Statement "select order_id, order_amount from orders where time >= DATE_SUB(NOW(), INTERVAL 60 SECOND);"
# Use with MySQL 5.0.0 or later
MinVersion 50000
<Result>
Type "AmountTrend"
InstancePrefix "amount_per_order"
InstancesFrom "order_id"
ValuesFrom "order_amount"
</Result>
</Query>
<Database "hotel_app">
Driver "mysql"
DriverOption "host" "localhost"
DriverOption "username" "root"
DriverOption "password" ""
DriverOption "dbname" "hotel_app"
SelectDB "hotel_app"
Query "amount_per_order"
</Database>
</Plugin>
<LoadPlugin dbi>
Interval 60
</LoadPlugin>
So some important points from above config:
* The query block configures the SQL query and results. We could have configured multiple result blocks from same query - but for now let's keep it simple
* The query block's name must match "Query" config in Database block.
* The result block reads the results and creates data instances based on column configured in "InstancesFrom" - and must be only one per result set. There can be multiple "ValuesFrom" which are associated with a given data instance. So let's say one order ID can have other details/values like order total, order details etc.
* Finally a small quirk - the Query block must be before Database block - as the data is interpreted from top to bottom.
In result block above we have mentioned “Type” of result as “AmountTrend” - which is a type that CollectD must know the schema/definition for. It by default supports 4 standard (ABSOLUTE, COUNTER, DERIVE, GUAGE know more about Data Sources here) types and some more custom types defined in types.db. For our use case we are going to define a new type and add following line to /opt/collectd/share/collectd/types.db:
AmountTrend value:ABSOLUTE:0:U
All this is saying is that we have one value of type ABSOLUTE and the lower bound being 0 and upper bound is unknown/infinite. If we expect some limits on data range - we can configure those for example 0:500
CollectD : Building Plugin
Now let’s restart the CollectD service and see how things go. Oh crap! We have some errors:
{From sudo service collectd status}
Mar 29 05:27:30 agent1.hs.com collectd[12935]: plugin_load: Could not find plugin "dbi" in /opt/collectd/lib...ectd
{From Collectd Log file}
[2016-03-29 05:28:55] plugin_load: Could not find plugin "dbi" in /opt/collectd/lib/collectd
The way collectD works is all plugins we configure in conf file use underlying “SO libraries” to make the calls. If you navigate to directory /opt/collectd/lib/collectd you will notice quite a few of those files. What we are missing is a dbi.so - which will help make the underlying calls. The best way is to compile the plugin from source for your target platform. I used this StackOverflow conversation to compile the dbi.so - follow instructions there. Except for don’t checkout the version mentioned of source code from git, simply checkout master.
After installing the SO to intended location we also need to install libdbi libraries - this library is used for mysql specific calls.
sudo yum install -y libdbi-dbd-mysql
Data & Graphs:
After this if you restart collectd service and go to InfluxDB UI, choose collectd as DB and query for data, you should see incoming data.
Finally if we can form the query in Grafana and see results. I have pasted the query from Grafana editor below for reference:
SELECT sum("value") FROM "dbi_" WHERE $timeFilter GROUP BY time(1m) fill(none)
The restaurant owner is now happy that he can track total orders placed per minute ;)
Conclusion
We used database plugin to query and report results in this post but this logic can be extended to a variety of plugins and the kind of things that can be monitored is literally endless. Would love to hear your CollectD stories and thoughts.
Looking for help with observability stack implementation and consulting? do check out how we’re helping startups & enterprises as an observability consulting services provider and Grafana experts.