Description
The Power5/6/7 extensions (a maximum of 23 metrics) are contained in a separate DSO module, written in C, called mod_ibmpower. If installed, this DSO module is loaded when gmond starts.
This DSO mostly contains metrics specific to IBM Power5/6/7 systems, which can be used to monitor the properties of those systems.
Platform availability:
- AIX: AIX5L V5.1 ML04 or higher required
- Linux: SLES 9 or higher, or RHEL 4 Update 3 or higher, required
Config file: /etc/ganglia/conf.d/ibmpower.conf
Available metrics
The following 23 additional metrics are available:
capped
cpu_entitlement
cpu_in_lpar
cpu_in_machine
cpu_in_pool
cpu_pool_id
cpu_pool_idle
cpu_used
disk_read
disk_write
disk_iops
fwversion
kernel64bit
lpar
lpar_name
lpar_num
model_name
oslevel
serial_num
smt
splpar
weight
cpu_in_syspool
- for Power6/7 only; AIX5L V5.3 TL07 or higher required
- on a Power5 system: same value as cpu_in_pool
Detailed description
- capped
  - Type: String value
  - returns "yes" if the system is a POWER5/6 Shared Processor LPAR running in capped mode, or "no" otherwise
- cpu_entitlement
  - Type: Float value
  - returns the Capacity Entitlement of the system in units of physical CPUs
- cpu_in_lpar
  - Type: Integer value
  - returns the number of CPUs the OS sees in the system. In a Power5/6/7 Shared Processor LPAR this is the number of virtual CPUs. When SMT is enabled this number is doubled.
- cpu_in_machine
  - Type: Integer value
  - returns the number of physical CPUs in the whole system
- cpu_in_pool
  - Type: Integer value
  - returns the number of physical CPUs in the Shared Processor Pool
- cpu_pool_id
  - Type: Integer value
  - returns the ID of the shared processor pool this LPAR belongs to
- cpu_pool_idle
  - Type: Float value
  - returns how much of the Shared Processor Pool is idle, in fractional numbers of physical CPUs
- cpu_used
  - Type: Float value
  - returns how much compute resource this shared processor LPAR has used since the last time this metric was measured, in fractional numbers of physical CPUs
- disk_read
  - Type: Float value
  - returns the total read I/O of the system in kB per second
- disk_write
  - Type: Float value
  - returns the total write I/O of the system in kB per second
- disk_iops
  - Type: Float value
  - returns the total number of I/O operations (read + write) per second
- fwversion
  - Type: String value
  - returns the current firmware version of the IBM Power System
- kernel64bit
  - Type: String value
  - returns "yes" if the running kernel is a 64-bit kernel or "no" otherwise
- lpar
  - Type: String value
  - returns "yes" if the system is an LPAR or "no" otherwise
- lpar_name
  - Type: String value
  - returns the name of the LPAR as defined on the Hardware Management Console (HMC) or some reasonable message otherwise
- lpar_num
  - Type: Integer value
  - returns the partition ID of the LPAR as defined on the Hardware Management Console (HMC) or some reasonable message otherwise
- model_name
  - Type: String value
  - returns the machine model name
- oslevel
  - Type: String value
  - returns the version string as provided by the AIX command 'oslevel'
- serial_num
  - Type: String value
  - returns the serial number of the system as provided by the AIX command 'uname'
- smt
  - Type: String value
  - returns "yes" if SMT is enabled or "no" otherwise
- splpar
  - Type: String value
  - returns "yes" if the system is running in a shared processor LPAR or "no" otherwise
- weight
  - Type: Integer value
  - returns the weight of the LPAR running in uncapped mode
- cpu_in_syspool (on a Power5 system: same value as cpu_in_pool)
  - Type: Integer value
  - returns the number of cores contained in the system shared processor pool
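Because cpu_used reports consumption since the previous measurement, a metric of this kind has to keep state between two samples. The following is a minimal sketch of that idea, not the module's actual code. It assumes the raw input is a monotonically increasing tick counter with a known frequency; the type usage_state, the function cpus_used_since_last() and all parameter names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical state kept between two samples of a monotonic tick counter. */
typedef struct {
    uint64_t last_ticks;  /* counter value at the previous sample */
    double   last_time;   /* timestamp of the previous sample, in seconds */
    int      primed;      /* 0 until the first sample has been stored */
} usage_state;

/*
 * Return the average number of physical CPUs consumed since the previous
 * call. ticks_per_sec is the counter frequency (assumed to be known).
 * Returns 0.0 for the first sample, for a non-positive interval, and when
 * the counter went backwards (for example after a counter reset).
 */
double cpus_used_since_last(usage_state *st, uint64_t ticks, double now,
                            double ticks_per_sec)
{
    double used = 0.0;

    if (st->primed && now > st->last_time && ticks >= st->last_ticks &&
        ticks_per_sec > 0.0) {
        /* Busy time accumulated in the interval, in CPU-seconds. */
        double busy_seconds = (double)(ticks - st->last_ticks) / ticks_per_sec;
        used = busy_seconds / (now - st->last_time);
    }

    /* Remember this sample so the next call can compute a delta. */
    st->last_ticks = ticks;
    st->last_time = now;
    st->primed = 1;
    return used;
}
```

With a counter that advances 500 ticks at 100 ticks per second over a 10 second interval, the function would report 0.5 physical CPUs. Returning 0.0 on a backwards-moving counter is one possible policy; skipping the sample entirely would be another.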
Config file example
modules {
  module {
    name = "ibmpower_module"
    path = "modibmpower.so"
  }
}
collection_group {
  collect_once = yes
  time_threshold = 1200
  metric {
    name = "fwversion"
    title = "Firmware Version"
  }
  metric {
    name = "kernel64bit"
    title = "Kernel 64 bit?"
  }
  metric {
    name = "lpar"
    title = "LPAR Mode?"
  }
  metric {
    name = "lpar_num"
    title = "LPAR Number"
  }
  metric {
    name = "model_name"
    title = "Machine Name"
  }
  metric {
    name = "serial_num"
    title = "System Serial Number"
  }
  metric {
    name = "splpar"
    title = "Shared Processor LPAR?"
  }
}
collection_group {
  collect_every = 180
  time_threshold = 1200
  metric {
    name = "cpu_in_machine"
    title = "Cores in Machine"
  }
  metric {
    name = "lpar_name"
    title = "LPAR Name"
  }
  metric {
    name = "oslevel"
    title = "Output of 'oslevel -s'"
  }
}
collection_group {
  collect_every = 15
  time_threshold = 180
  metric {
    name = "capped"
    title = "Capped Mode?"
  }
  metric {
    name = "cpu_pool_id"
    title = "Shared processor pool ID of this LPAR"
  }
  metric {
    name = "cpu_entitlement"
    title = "CPU Entitlement"
    value_threshold = 0.01
  }
  metric {
    name = "cpu_in_lpar"
    title = "Number of Virtual CPUs in LPAR"
    value_threshold = 1
  }
  metric {
    name = "cpu_in_pool"
    title = "Number of Cores in Pool"
    value_threshold = 1
  }
  metric {
    name = "cpu_in_syspool"
    title = "Number of Cores in System Pool"
    value_threshold = 1
  }
  metric {
    name = "disk_iops"
    title = "Total number of I/O operations per second"
    value_threshold = 1.0
  }
  metric {
    name = "disk_read"
    title = "Total Disk Read I/O per second"
    value_threshold = 1.0
  }
  metric {
    name = "disk_write"
    title = "Total Disk Write I/O per second"
    value_threshold = 1.0
  }
  metric {
    name = "smt"
    title = "SMT enabled?"
  }
  metric {
    name = "weight"
    title = "LPAR Weight"
    value_threshold = 1
  }
}
collection_group {
  collect_every = 15
  time_threshold = 60
  metric {
    name = "cpu_pool_idle"
    title = "CPU Pool Idle"
    value_threshold = 0.0001
  }
  metric {
    name = "cpu_used"
    title = "Physical Cores Used"
    value_threshold = 0.0001
  }
}
Change History AIX
- Version 1.4: Feb 09, 2012
  - added new metric cpu_pool_id (cpu_pool_id_func())
- Version 1.3: Apr 27, 2010
  - added sanity check for cpu_pool_idle_func()
  - added new metric fwversion (fwversion_func())
- Version 1.2: Feb 10, 2010
  - added IO ops/sec metric (disk_iops_func())
  - changed metric type from GANGLIA_VALUE_FLOAT to GANGLIA_VALUE_DOUBLE and changed unit to bytes/sec for disk_read_func() and disk_write_func()
  - added model_name metric (model_name_func())
- Version 1.1: Jan 21, 2010
  - improved cpu_used() function
  - fixed defunct processes caused by open pipes (popen() without pclose())
  - added checks for possible libperfstat counter resets in cpu_pool_idle_func(), cpu_used_func(), disk_read_func() and disk_write_func()
- Version 1.0: Dec 11, 2008
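The defunct-process fix listed under Version 1.1 concerns pairing every popen() with a pclose(). The sketch below illustrates that pattern on a POSIX system; it is not taken from the module source, and the helper name read_command_line() is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose popen()/pclose() in strict modes */
#endif
#include <stdio.h>
#include <string.h>

/*
 * Run cmd, copy the first line of its output (without the trailing newline)
 * into buf, and return 0 on success or -1 on failure.
 */
int read_command_line(const char *cmd, char *buf, size_t buflen)
{
    FILE *fp;
    int rc = -1;

    if (buf == NULL || buflen == 0)
        return -1;
    buf[0] = '\0';

    fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    if (fgets(buf, (int)buflen, fp) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';  /* strip the newline, if any */
        rc = 0;
    }

    /* Always reap the child; skipping pclose() leaves a defunct process. */
    if (pclose(fp) == -1)
        rc = -1;

    return rc;
}
```

A caller could use it as read_command_line("oslevel -s", buf, sizeof buf) on AIX. The important point is that pclose() is reached on every path after a successful popen(), including the path where no output was read.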
Change History Linux
- Version 0.4: Feb 09, 2012
  - added new metric cpu_pool_id (cpu_pool_id_func())
- Version 0.3: Apr 27, 2010
  - added sanity check for cpu_pool_idle_func()
  - added new metric fwversion (fwversion_func())
  - fixed cpu_used_func() for systems where /proc/ppc64/lparcfg exists and contains a purr stanza that returns garbage because the CPU has no PURR register, e.g., PowerPC970 (JS20, JS21)
- Version 0.2: Feb 10, 2010
  - improved cpu_used() function
  - added IO ops/sec metric (disk_iops_func())
  - changed metric type from GANGLIA_VALUE_FLOAT to GANGLIA_VALUE_DOUBLE and changed unit to bytes/sec for disk_read_func() and disk_write_func()
  - added model_name metric (model_name_func())
- Version 0.1: Dec 11, 2008
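The Version 0.3 fix deals with a purr stanza in /proc/ppc64/lparcfg that cannot be trusted. As a rough illustration only, the sketch below looks up a numeric stanza in text that is assumed to consist of key=value lines and rejects values that are not plain unsigned decimal numbers. The function name lparcfg_get_u64() is hypothetical, and what "garbage" actually looks like on a PowerPC970 is not stated here, so the validity check is an assumption rather than the module's real test.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Search text (assumed to hold one key=value pair per line) for key and
 * store its value in *out. Returns 0 on success, or -1 if the key is
 * missing or its value is not a plain unsigned decimal number.
 */
int lparcfg_get_u64(const char *text, const char *key, unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0 && line[klen] == '=') {
            const char *val = line + klen + 1;
            char *end;
            unsigned long long v;

            /* Reject empty, negative, and non-numeric values up front. */
            if (*val < '0' || *val > '9')
                return -1;

            errno = 0;
            v = strtoull(val, &end, 10);
            if (errno != 0 || (*end != '\n' && *end != '\0'))
                return -1;  /* overflow or trailing junk */

            *out = v;
            return 0;
        }

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```

A caller would read the file into a buffer first and then call, for example, lparcfg_get_u64(buffer, "purr", &purr), treating a return value of -1 as "no usable PURR data" and falling back to another source.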