This page describes how to integrate the Ganglia web 2.0 interface with rrdcached when using my Ganglia RPMs and my AIX RPMs.
The original description can be found on the Ganglia wiki, and most of the information here has been taken from there.
Introduction
From the rrdcached manpage, rrdcached is "a daemon that receives updates to existing RRD files, accumulates them and, if enough have been received or a defined time has passed, writes the updates to the RRD file."
rrdcached is useful in environments where Ganglia is used to monitor a lot of servers (1000+) and/or a lot of metrics. In these large environments, if the RRD files are stored on traditional hard disks, the server will experience high I/O wait resulting from the constant updating of a large number of RRD files. The commonly accepted workaround is to store the RRD files in tmpfs, which eliminates the issue. The downside is that the RRD files then need to be backed up manually and regularly to prevent data loss if the server crashes or shuts down.
rrdcached can prevent the I/O storm caused by updates to a large number of RRD files by staggering the writes to disk, and in practice it has been quite successful in lowering the I/O wait of servers running gmetad and the web frontend.
Requirements
To integrate Ganglia with rrdcached, you will need at least the following RPM versions:
ganglia-gmetad
- versions >= 3.4.0-2
- versions >= 3.5.0-2
- versions >= 3.6.0-2
- any higher version
rrdtool-cached
- versions >= 1.4.7-3
- versions >= 1.4.8-2
- any higher version
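The version check above can be scripted. The following is a minimal sketch, not part of the RPMs: the helper name version_ge is hypothetical, and it assumes version strings of the form shown in the list (numeric fields separated by '.' or '-'). Which minimum applies depends on the release branch you have installed.

```shell
# Hypothetical helper: succeeds (exit 0) when version $1 >= version $2.
# Fields are split on '.' and '-' and compared numerically, left to right.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.-]")
        nb = split(b, y, "[.-]")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0                               # all fields equal
    }'
}

# Possible usage (assumes the rpm command is available):
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' ganglia-gmetad)
#   version_ge "$installed" 3.4.0-2 || echo "ganglia-gmetad is too old" >&2
```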
Installation of rrdcached
For Ganglia, two processes need to access the rrdcached daemon, namely gmetad and apache. The gmetad process is owned by nobody and the apache process (named 'httpd') is owned by apache. When you start the rrdcached daemon, you will need to make sure the UNIX Domain Socket is both readable and writable by these processes.
After you have installed rrdtool and ganglia-gmetad, you will need to perform the following customizations. The following facts are assumed here:
- Your apache web server runs with owner and group set to 'nobody'.
- Your RRD files are stored in /var/lib/ganglia/rrds (default location) with owner and group set to 'nobody'.
- Your rrdcached runs with owner and group set to 'rrdcache' (default installation; please note that there is an 8-character limit for user and group names on AIX, at least in the older versions).
Please change the config file ('/etc/sysconfig/rrdcached') for rrdcached so that it contains the following content (please note that the end-of-line continuation character '\' is shown here only for readability and should not be in your config file):
# Please read the manpage for more info about possible runtime options.
RRDCACHED_ADDRESS="unix:/var/lib/rrdcached/rrdcached.sock"
RRDCACHED_OPTIONS="-s nobody -m 664 -l ${RRDCACHED_ADDRESS} -s nobody -m 777 -P FLUSH,STATS,HELP \
-l unix:/var/lib/rrdcached/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B -w 300 -z 300"
Then you will need to start the rrdcached daemon with the following command:
/etc/rc.d/init.d/rrdcached start
This will perform the following steps:
- rrdcached will create its sockets with the group set to nobody (-s nobody). The default socket gets permissions of 664 (-m 664), so it is writable by the nobody group.
- The pid file will automatically be placed by the startup/stop script /etc/rc.d/init.d/rrdcached in the directory /var/run/rrdcached under the name rrdcached.pid.
- The default UNIX Domain Socket rrdcached.sock will be created in the directory /var/lib/rrdcached.
- An additional "limited" socket will also be created in the directory /var/lib/rrdcached under the name rrdcached.limited.sock. This socket is the one you should point the ganglia-webfrontend CGIs to. It is only usable for the FLUSH, STATS, and HELP commands, which prevents errant web apps from writing arbitrary RRDs.
- The -b option specifies the base directory to change to, and -B restricts file access to the paths within the directory specified by -b.
- The 'timeout' value (please see the rrdcached manpage) is set to 5 minutes (= 300 seconds) with -w.
- The 'delay' value (please see the rrdcached manpage) is set to 5 minutes (= 300 seconds) with -z.
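To make the option string easier to read, the sketch below rebuilds it from named pieces and prints the resulting single line. It is for illustration only and is not meant to be pasted into /etc/sysconfig/rrdcached. The helper variable names (LIMITED_ADDRESS, FULL_SOCKET, LIMITED_SOCKET, STORAGE, TIMING) are assumptions and do not appear in the RPMs. According to the rrdcached manpage, -s, -m and -P affect the -l options that follow them, which is why each socket is preceded by its own set of these options.

```shell
# Socket addresses as used in the configuration above.
RRDCACHED_ADDRESS="unix:/var/lib/rrdcached/rrdcached.sock"
LIMITED_ADDRESS="unix:/var/lib/rrdcached/rrdcached.limited.sock"  # hypothetical helper name

# Full-privilege socket for gmetad: group nobody, mode 664.
FULL_SOCKET="-s nobody -m 664 -l ${RRDCACHED_ADDRESS}"

# Limited socket for the web frontend: mode 777, only FLUSH, STATS and HELP.
LIMITED_SOCKET="-s nobody -m 777 -P FLUSH,STATS,HELP -l ${LIMITED_ADDRESS}"

# Base directory (-b), restricted to that directory (-B).
STORAGE="-b /var/lib/ganglia/rrds -B"

# Write timeout (-w) and delay (-z), both 300 seconds.
TIMING="-w 300 -z 300"

RRDCACHED_OPTIONS="${FULL_SOCKET} ${LIMITED_SOCKET} ${STORAGE} ${TIMING}"
printf '%s\n' "${RRDCACHED_OPTIONS}"
```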
Configuring Ganglia
Now we need to configure Ganglia to talk to rrdcached. This requires two changes, one related to the gmetad init script and one in the web frontend's conf.php.
The first change refers to the gmetad init script (/etc/rc.d/init.d/gmetad). The only way to tell gmetad to talk to rrdcached is by setting the environment variable RRDCACHED_ADDRESS. This can either be added to the gmetad init script or set globally.
export RRDCACHED_ADDRESS="unix:/var/lib/rrdcached/rrdcached.sock"
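If you add the variable to the init script, you may want to set it only when the socket actually exists. The following is a minimal sketch under that assumption. Both function names are hypothetical and are not part of the gmetad init script shipped with the RPMs. Note the design choice: when the socket is missing, the variable stays unset and gmetad would write to the RRD files directly instead of going through rrdcached.

```shell
# Hypothetical helper: strip the "unix:" prefix from an rrdcached address
# to obtain the file system path of the socket.
rrdcached_socket_path() {
    printf '%s\n' "${1#unix:}"
}

# Hypothetical guard: export RRDCACHED_ADDRESS only if the socket exists.
# Returns 0 on success, 1 if no socket is found at the given address.
export_rrdcached_address() {
    addr="$1"
    if [ -S "$(rrdcached_socket_path "$addr")" ]; then
        RRDCACHED_ADDRESS="$addr"
        export RRDCACHED_ADDRESS
        return 0
    fi
    echo "rrdcached socket not found for $addr" >&2
    return 1
}

# Possible usage inside the init script, before gmetad is started:
#   export_rrdcached_address "unix:/var/lib/rrdcached/rrdcached.sock"
```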
For the second change, which configures the web frontend, you need to set the $conf['rrdcached_socket'] variable in conf.php of your Ganglia web 2.0 interface and you're done.
# If rrdcached is being used, this argument must specify the
# socket to use.
#
# ganglia-web only requires, and should use, the low-privilege socket
# created with the -L option to rrdcached. gmetad requires, and must use,
# the fully privileged socket created with the -l option to rrdcached.
$conf['rrdcached_socket'] = "unix:/var/lib/rrdcached/rrdcached.limited.sock";
Now when you start up gmetad, all RRDtool commands will go through the caching daemon. The same goes for the graphing functions in the web frontend.
If everything goes well, you should see output similar to the following, and your RRD files (in /var/lib/ganglia/rrds) will only be updated every 5 minutes:
root@gmetad:/> ls -l /var/run/rrdcached/
total 8
-rw-r--r-- 1 root system 9 Aug 20 20:59 rrdcached.pid
root@gmetad:/>
root@gmetad:/> ls -la /var/lib/rrdcached/
total 0
drwxr-xr-x 2 rrdcache rrdcache 256 Aug 20 20:59 .
drwxr-xr-x 8 bin bin 256 Aug 20 10:32 ..
srwxrwxrwx 1 root nobody 0 Aug 20 20:59 rrdcached.limited.sock
srw-rw-r-- 1 root nobody 0 Aug 20 20:59 rrdcached.sock
Automatic startup of daemons
If you have enabled gmetad and rrdcached to start up by default, make sure that rrdcached starts before gmetad and that gmetad stops before rrdcached.
This could, for instance, be accomplished by renaming the rrdcached and gmetad symbolic links in /etc/rc.d/rc2.d and /etc/rc.d/rc3.d:
root@p770-wpar1:/> ls -l /etc/rc.d/rc[23].d/*rrdcached*
lrwxrwxrwx 1 root system 19 Aug 20 10:32 /etc/rc.d/rc2.d/Krrdcached -> ../init.d/rrdcached
lrwxrwxrwx 1 root system 19 Aug 20 10:32 /etc/rc.d/rc2.d/S0rrdcached -> ../init.d/rrdcached
lrwxrwxrwx 1 root system 19 Aug 20 10:32 /etc/rc.d/rc3.d/Krrdcached -> ../init.d/rrdcached
lrwxrwxrwx 1 root system 19 Aug 20 10:32 /etc/rc.d/rc3.d/S0rrdcached -> ../init.d/rrdcached
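After renaming the links, the resulting order can be checked with a small script. This is a sketch only: the function name first_service is hypothetical, and it assumes that the rc scripts in a directory are run in sorted name order and that the link names consist of S or K, optional digits, and the service name, as in the listing above.

```shell
# Hypothetical helper: print which of the two services (rrdcached or gmetad)
# comes first among the links with the given prefix (S = start, K = kill).
first_service() {
    dir="$1"
    prefix="$2"
    LC_ALL=C ls "$dir" \
        | grep -E "^${prefix}[0-9]*(rrdcached|gmetad)\$" \
        | head -n 1 \
        | sed 's/^.[0-9]*//'    # drop the prefix letter and sequence digits
}

# Possible usage: rrdcached should start first and gmetad should stop first.
#   [ "$(first_service /etc/rc.d/rc2.d S)" = "rrdcached" ] || echo "start order is wrong" >&2
#   [ "$(first_service /etc/rc.d/rc2.d K)" = "gmetad" ]    || echo "stop order is wrong" >&2
```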