Configure the RRD machine

Prerequisites

For the examples given here the following items must be installed on the RRD system:

  • RRDtool, including the RRDs Perl module that is distributed with it

  • Perl with the SNMP module (distributed with the Net-SNMP package)

  • A webserver with PHP, where PHP has been built with RRDtool support (the rrd_graph() function used later in this document)

  • cron, for scheduling the collector scripts

For the rest of this document it is assumed that you are running Linux on your RRD system. This is not the only possible option; the necessary items are also available for other types of systems. It is beyond the scope of this document to describe where to get the above-mentioned items precompiled for your system and how to install them. Refer to the documentation of your distribution and/or the documentation of the individual sources for more information.

Collecting and storing performance data

Introduction

In this chapter the terms collector and database will be used frequently. The collector is the script that queries the LEAF system via SNMP and stores the retrieved values in a database, in this case an RRD database.

An RRD database can be defined to contain all sorts of information (datasets) in any combination you like. It is in general good practice to keep information of different types in different databases, but you will have to find out for yourself which dataset definition gives you the most flexible solution for your situation.

In the following examples two datasets will be defined: one for network traffic statistics and one for CPU load.

Personally I like to structure the RRD related directories in such a way that there is a clear distinction between collectors and databases, and also between databases belonging to different hosts. In these examples the following directory structure is assumed:

/home/rrd/
       |
       +--- collectors/
       |
       +--- databases/
                 |
                 +--- leafhost/
                 |
                 +--- host2/
                 |
                   ... etc ...
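
A minimal way to set up this structure, assuming the collectors run as a dedicated user rrd (the same user referenced in the crontab entry below), could look like this; adapt it to your distribution and preferences:

# Create a dedicated rrd user and the directory structure
useradd -m -d /home/rrd rrd
mkdir -p /home/rrd/collectors /home/rrd/databases/leafhost
chown -R rrd:rrd /home/rrd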

After defining a database and creating the corresponding collector, the collector must be scheduled to run at regular intervals. This must be done for each collector/database. Cron is your friend here. An option that I favor myself is to have only one entry in /etc/crontab. This entry calls the overall collector script, which in turn calls each of the individual collector scripts. This way the system crontab file does not have to be edited for each new collector. In this case your /etc/crontab would have the following entry:

# /etc/crontab

...

# overall collector script
*/5 *   * * *   rrd    /home/rrd/collectors/collect-all

#

This means that the overall collector script is started every 5 minutes. The overall collector file /home/rrd/collectors/collect-all could look like:

#!/bin/sh
# Overall collector script

# Script for collecting interface statistics
/home/rrd/collectors/interface.pl

# Script for collecting cpu load
/home/rrd/collectors/cpuload.pl
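
If you create these scripts by hand, don't forget to make them executable, for example:

chmod 755 /home/rrd/collectors/collect-all
chmod 755 /home/rrd/collectors/*.pl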

Example 1: network traffic

Define the RRD database

If the number of interfaces on the LEAF system is fixed and will never change, you may choose to keep the traffic statistics of all interfaces in one database. If not, it is probably easier to define a database per interface. This makes it easier to extend your RRD system when more interfaces are added to your LEAF system. Here a database for only one interface is created.

To create a new database, go to the data directory for the targeted host and create the dataset with the options as described below:

cd /home/rrd/databases/leafhost
rrdtool create eth0.rrd \
        --step 300 \
        DS:bytes_in:COUNTER:600:U:U \
        DS:bytes_out:COUNTER:600:U:U \
        RRA:AVERAGE:0.5:1:864 \
        RRA:AVERAGE:0.5:6:672 \
        RRA:AVERAGE:0.5:24:744 \
        RRA:AVERAGE:0.5:288:730

This creates a new database named eth0.rrd which expects new data every 300 seconds (the step size). This matches the 5-minute schedule defined in the crontab file above.

The database contains two datasets, i.e. bytes_in and bytes_out, both of the type COUNTER.

Four round robin archives are defined containing averaged values:

  • 864 samples of 1 step (5 minutes). This is a period of 3 days. Since the step size is one, the actual value is stored and no average is calculated.

  • 672 averaged samples over 6 steps (30 minutes). This is a period of 2 weeks.

  • 744 averaged samples over 24 steps (2 hours). This is a period of about 2 months.

  • 730 averaged samples over 288 steps (1 day). This is a period of 2 years.
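
If you want to check the result, rrdtool can print the definition of the database back to you; the output shows the step size, the data sources and the round robin archives:

rrdtool info /home/rrd/databases/leafhost/eth0.rrd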

Create the collector

The data that can be retrieved from an SNMP agent is defined in a Management Information Base (MIB). The objects in the MIB containing the interface traffic counters that are needed for this example are:

  • .iso.org.dod.internet.mgmt.mib-2.interfaces.ifNumber = .1.3.6.1.2.1.2.1

  • .iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifDescr = .1.3.6.1.2.1.2.2.1.2

  • .iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifInOctets = .1.3.6.1.2.1.2.2.1.10

  • .iso.org.dod.internet.mgmt.mib-2.interfaces.ifTable.ifEntry.ifOutOctets = .1.3.6.1.2.1.2.2.1.16
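
If the Net-SNMP command line tools are installed on the RRD system, these objects can be checked by hand before writing the collector, using the community string zaphod that also appears in the scripts below:

# Number of interfaces and their names
snmpget -v 2c -c zaphod leafhost .1.3.6.1.2.1.2.1.0
snmpwalk -v 2c -c zaphod leafhost .1.3.6.1.2.1.2.2.1.2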

In the sample script below the LEAF system is queried for the number of interfaces. The correct interface is selected based on the interface name, and then the counters for bytes_in and bytes_out are read. Finally this information is stored in the database.

#!/usr/bin/perl

# interface.pl

use SNMP;
use RRDs;

$oid_ifNumber    = ".1.3.6.1.2.1.2.1";
$oid_ifDescr     = ".1.3.6.1.2.1.2.2.1.2";
$oid_ifInOctets  = ".1.3.6.1.2.1.2.2.1.10";
$oid_ifOutOctets = ".1.3.6.1.2.1.2.2.1.16";

$database = "/home/rrd/databases/leafhost/eth0.rrd";


#
# Open snmp session and get interface data
#
$session = new SNMP::Session(
                        DestHost  => "leafhost",
                        Community => "zaphod",
                        Version   => '2');
die "SNMP session creation error: $SNMP::Session::ErrorStr" unless (defined $session);

$numInts = $session->get($oid_ifNumber . ".0");

for $i (1..$numInts) {
    $name = $session->get($oid_ifDescr . "." . $i);
    if ( $name eq "eth0" ) {
        $in = $session->get($oid_ifInOctets . "." . $i);
        $out = $session->get($oid_ifOutOctets . "." . $i);
    }
}

die $session->{ErrorStr} if ($session->{ErrorStr});
die "Interface eth0 not found\n" unless (defined $in and defined $out);


#
# Update the database
#
RRDs::update ($database, "N:".$in.":".$out);
my $Err = RRDs::error;
die "Error while updating: $Err\n" if $Err;

#

Of course this is only an example; you can extend it to suit your own needs.
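
To check that the collector works, it can be run once by hand and the contents of the database inspected. Most values will show up as NaN until a few updates have been done:

su - rrd -c '/home/rrd/collectors/interface.pl'
rrdtool fetch /home/rrd/databases/leafhost/eth0.rrd AVERAGE --start -1h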

Example 2: cpu load

Define the RRD database

On Linux systems four types of CPU time (process time) exist: user, system, nice and idle. We will now define a database in which to store this information.

cd /home/rrd/databases/leafhost
rrdtool create cpuload.rrd \
        --step 300 \
        DS:user:COUNTER:600:0:100 \
        DS:system:COUNTER:600:0:100 \
        DS:nice:COUNTER:600:0:100 \
        DS:idle:COUNTER:600:0:100 \
        RRA:AVERAGE:0.5:1:864 \
        RRA:AVERAGE:0.5:6:672 \
        RRA:AVERAGE:0.5:24:744 \
        RRA:AVERAGE:0.5:288:730

The definition of this database has much in common with the previous database. Now four datasets have been defined instead of two. The definition of the round robin archives is the same.

Create the collector

The cpu load information is represented by the following objects in the MIB:

  • .iso.org.dod.internet.private.enterprises.ucdavis.systemStats.ssCpuRawUser = .1.3.6.1.4.1.2021.11.50

  • .iso.org.dod.internet.private.enterprises.ucdavis.systemStats.ssCpuRawNice = .1.3.6.1.4.1.2021.11.51

  • .iso.org.dod.internet.private.enterprises.ucdavis.systemStats.ssCpuRawSystem = .1.3.6.1.4.1.2021.11.52

  • .iso.org.dod.internet.private.enterprises.ucdavis.systemStats.ssCpuRawIdle = .1.3.6.1.4.1.2021.11.53
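
These objects are part of the UCD-SNMP (Net-SNMP) enterprise MIB, so they are only available if the SNMP agent on the LEAF system supports that MIB. A quick way to check, again using the community string zaphod:

snmpwalk -v 2c -c zaphod leafhost .1.3.6.1.4.1.2021.11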

And this information can be retrieved and stored with the following script:

#!/usr/bin/perl

# cpuload.pl

use SNMP;
use RRDs;

$oid_ssCpuRawUser    = ".1.3.6.1.4.1.2021.11.50";
$oid_ssCpuRawSystem  = ".1.3.6.1.4.1.2021.11.52";
$oid_ssCpuRawNice    = ".1.3.6.1.4.1.2021.11.51";
$oid_ssCpuRawIdle    = ".1.3.6.1.4.1.2021.11.53";

$database = "/home/rrd/databases/leafhost/cpuload.rrd";


#
# Open snmp session and get cpu load data
#
$session = new SNMP::Session(
                        DestHost  => "leafhost",
                        Community => "zaphod",
                        Version   => '2');
die "SNMP session creation error: $SNMP::Session::ErrorStr" unless (defined $session);

$cpuUser   = $session->get($oid_ssCpuRawUser . ".0");
$cpuSystem = $session->get($oid_ssCpuRawSystem . ".0");
$cpuNice   = $session->get($oid_ssCpuRawNice . ".0");
$cpuIdle   = $session->get($oid_ssCpuRawIdle . ".0");

die $session->{ErrorStr} if ($session->{ErrorStr});

#
# Update the database
#
RRDs::update ($database, "N:".$cpuUser.":".$cpuSystem.":".$cpuNice.":".$cpuIdle);
my $Err = RRDs::error;
die "Error while updating: $Err\n" if $Err;

#

Retrieving and presenting performance data

Introduction

After you have finished the scripts and the overall collector has been called a few times by cron, it's time to make some graphs.

The following assumptions are made with respect to the configuration of the webserver:

  • An alias /images/ is defined for /var/www/images/

  • The images directory has a subdirectory rrdimg in which the rrd graphs will be created
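
How the alias is configured depends on your webserver; what matters for the scripts below is that the directory /var/www/images/rrdimg exists and is writable by the user the webserver (and thus PHP) runs as. Assuming, for example, that this user is www-data:

# Create the image directory and let the webserver user write to it
# (replace www-data with the user your webserver actually runs as)
mkdir -p /var/www/images/rrdimg
chown www-data /var/www/images/rrdimg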

For ease of reuse a separate php file is used in which the generic functions for drawing graphs are defined. This file is included by the other scripts.

Example 1: network traffic

First a file graphs.php is defined that contains the functions to draw the graphs.

<?php

## graphs.php
##
## A set of php functions to create rrd graphs


function interface_graph ($start)
{
    $database = "/home/rrd/databases/leafhost/eth0.rrd";
    $imgfile = "eth0$start.gif";

    $opts = array( "--start", "$start",
        "--vertical-label", "Bytes/sec",
        "--width", "400",
        "DEF:in=$database:bytes_in:AVERAGE",
        "DEF:out=$database:bytes_out:AVERAGE",
        "LINE2:in#00ff00:In",
        "LINE2:out#ff0000:Out"
    );

    make_graph ($imgfile, $opts);
}


function make_graph ($file, $options)
{
    $ret = rrd_graph("/var/www/images/rrdimg/$file", $options, count($options));

    ## if $ret is an array, then rrd_graph was successful
    ##
    if ( is_array($ret) ) {
        echo "<img src=\"/images/rrdimg/$file\" border=0>";
    }
    else {
        $err = rrd_error();
        echo "<p><b>$err</b></p>";
    }
}

?>

Then the actual page that contains the network traffic graphs can be created.

<html>
    <head>
        <title>Interface statistics</title>
    </head>
    <body>
        <h1>Interface statistics</h1>
<?php
        require "graphs.php";

        print "<h2>Daily graph</h2>\n";
        interface_graph ("-1d");
        print "<h2>Weekly graph</h2>\n";
        interface_graph ("-1w");
        print "<h2>Monthly graph</h2>\n";
        interface_graph ("-1m");
?>
    </body>
</html>

Now fire up your browser and access the page that you just created. Sit back and enjoy!

Example 2: cpu load

First we add a function to draw CPU load graphs to the file graphs.php.

<?php

## graphs.php
##
## A set of php functions to create rrd graphs

...

function cpuload ($start)
{
    $database = "/home/rrd/databases/leafhost/cpuload.rrd";
    $imgfile = "cpu$start.gif";

    $opts = array( "--start", "$start",
        "--vertical-label", "Load (%)",
        "--width", "400",
        "DEF:user=$database:user:AVERAGE",
        "DEF:nice=$database:nice:AVERAGE",
        "DEF:system=$database:system:AVERAGE",
        "AREA:system#00ffff:System",
        "STACK:user#00ff00:User",
        "STACK:nice#0000ff:Nice",
    );

    make_graph ($imgfile, $opts);
}

?>

And then the actual CPU load page is created. This is almost too easy ;-)

<html>
    <head>
        <title>CPU Load statistics</title>
    </head>
    <body>
        <h1>CPU Load statistics</h1>
<?php
        require "graphs.php";

        print "<h2>Daily graph</h2>\n";
        cpuload ("-1d");
        print "<h2>Weekly graph</h2>\n";
        cpuload ("-1w");
        print "<h2>Monthly graph</h2>\n";
        cpuload ("-1m");
?>
    </body>
</html>