Nagios: Check HP D2600 with SNMP

I need to monitor an HP D2600 storage array attached to an HP DL380 G8.

I wrote a little script that does the job. It works for me, but it may need adjusting for your environment 😎

[root@testme nagios]# cat check_d2600
#!/bin/bash

#
# check the external storage array on HP DL 380
#
# version 1.0 17.07.13  J.H.
#
# MIBs used:
#
# CPQIDA-MIB  (….232.3.2….)
# CPQHLTH-MIB (….232.6.2….)
# CPQSTSY-MIB (….232.8.2….)
#
# ----------------------------------------------------------------
#
# Add this block to commands.cfg
#
# public -> replace with your SNMP community string
#
# # check D2600 Storage
# define command{
#        command_name    check_d2600
#        command_line    $USER1$/check_d2600 $HOSTADDRESS$ public
#        }
#
# ----------------------------------------------------------------
#
# Service definition
#
# define service{
#        use                             generic-service
#        host_name                       testme.db-serv
#        service_description             Ext.Storage Condition
#        check_command                   check_d2600
#        }
# ----------------------------------------------------------------
#

if [ $# -ne 2 ]
then
    echo "Arguments missing!"
    echo ""
    echo "Usage: check_d2600 <HOSTIP> <SNMP-COMMUNITY>"
    echo ""
    exit 2
fi

ServerIP=$1
Community=$2

# ----------------------------------------------------------------
#
# experimental: logical array and logical drive number on the external array
#
#
LogArray=3
LogDrive=1
#
# ----------------------------------------------------------------

errors=0
StorageHealth=2

# read the enclosure model string (keeps its leading space and quotes for the output text)
BoxModel=$(snmpget -v 2c -c "$Community" "$ServerIP" .1.3.6.1.4.1.232.8.2.1.1.4.3.2 | awk -F= '{print $NF}' | awk -F: '{print $NF}')

# check box condition -> does not check the drives
#
# conditions 1=other, 2=ok, 3=degraded
#
BoxCond=$(snmpget -v 2c -c "$Community" "$ServerIP" .1.3.6.1.4.1.232.8.2.1.1.8.3.2 | awk -F : '{ print $NF }' | tr -d ' ')
#
if [[ $BoxCond -ne 2 ]]; then
    StorageHealth=$BoxCond
fi

# check logical drive condition -> checks only one drive! If you have more drives, you need one entry per drive
#
# conditions 1=other, 2=ok, 3=degraded
#
Checkhd=$(snmpget -v 1 -c "$Community" "$ServerIP" .1.3.6.1.4.1.232.3.2.3.1.1.4.$LogArray.$LogDrive | awk -F: '{print $NF}' | tr -d ' ')
#
if [[ $Checkhd -ne 2 ]]; then
    StorageHealth=$Checkhd
fi

#
# If the overall status is not 2 (ok), check for details
#

if [[ $StorageHealth -ne 2 ]] ; then

    # count errors in log
    healthlogs=$(snmpwalk -v 2c -c "$Community" "$ServerIP" .1.3.6.1.4.1.232.6.2.11.3.1.8 | grep -c "External Storage Enclosure")
    let "healthlogs -= 1"

    # read health log entries from: External Storage Enclosure

    # htl=`snmpwalk -v 2c -c $Community $ServerIP .1.3.6.1.4.1.232.6.2.11.3.1.8 | grep "External Storage Enclosure" |tr "()" "\n"`
    # htl=`snmpwalk -v 2c -c $Community $ServerIP .1.3.6.1.4.1.232.6.2.11.3.1.8 | grep "External Storage Enclosure" |awk -F= '{print $NF }' | awk -F: '{print $NF }'`
    # htl=`snmpwalk -v 2c -c $Community $ServerIP .1.3.6.1.4.1.232.6.2.11.3.1.8 | grep "External Storage Enclosure" |awk -F= '{print $NF }' | sed 's/\(4)\)/\n/g'`
    htl=$(snmpwalk -v 2c -c "$Community" "$ServerIP" .1.3.6.1.4.1.232.6.2.11.3.1.8 | grep "External Storage Enclosure" | awk -F= '{print $NF }')

    OIFS=$IFS
    IFS=")"
    arr2=$htl
    arr3=()
    count=0

    # split the log text on ")" and keep the message part of each entry
    for x in $arr2
    do
        var0=$(echo $x | awk -F: '{printf "%s", $NF}')
        # strip the double quotes around the message
        var1=${var0//\"/}
        arr3+=($var1)
    done

    IFS=$OIFS
    logcount=${#arr3[@]}
    # echo "count Elements= ${#arr3[@]}"
    errormsg=""
    lastmsg=""

    # Get more details
    #
    #
    # Power supply?
    Checkps=$(snmpget -v 1 -c "$Community" "$ServerIP" SNMPv2-SMI::enterprises.232.8.2.1.1.11.3.2 | awk -F: '{print $NF}' | tr -d ' ')
    if [[ $Checkps -ne 2 ]] ; then
        errormsg="PowerSupply Error"
        # get last log entry for enclosure (EXPERIMENTAL)
        lastmsg=${arr3[$logcount-1]}
    fi

    # Fans?
    Checkfan=$(snmpget -v 1 -c "$Community" "$ServerIP" SNMPv2-SMI::enterprises.232.8.2.1.1.7.3.2 | awk -F: '{print $NF}' | tr -d ' ')
    if [[ $Checkfan -ne 2 ]] ; then
        errormsg="Fan Module Error Storage$BoxModel"
        # fan failures are not written to the error log 🙁
    fi

    # Temperature?
    Checktemp=$(snmpget -v 1 -c "$Community" "$ServerIP" SNMPv2-SMI::enterprises.232.8.2.1.1.9.3.2 | awk -F: '{print $NF}' | tr -d ' ')
    if [[ $Checktemp -ne 2 ]] ; then
        errormsg="OverTemp Error Storage$BoxModel"
    fi

    # Disks / RAID?
    Checkhd=$(snmpget -v 1 -c "$Community" "$ServerIP" SNMPv2-SMI::enterprises.232.3.2.3.1.1.4.$LogArray.$LogDrive | awk -F: '{print $NF}' | tr -d ' ')
    if [[ $Checkhd -ne 2 ]] ; then
        errormsg="LogDrive error in Storage$BoxModel;"
        # get last log entry for enclosure (EXPERIMENTAL)
        lastmsg=${arr3[$logcount-1]}

    fi

    #    echo "Failure Detected, check Storage$BoxModel on Server at: $ServerIP"
    echo "ERROR: $errormsg $lastmsg"
    exit 2
else
    echo "OK: Storage$BoxModel is ok"

    exit 0
fi
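
The script checks a single logical drive and notes that more drives need more entries. A minimal sketch of how several drives could be folded into one result is shown here. The helper names `parse_snmp_int` and `first_bad_condition` are hypothetical and not part of check_d2600, and the sample lines are canned so the block runs without an SNMP agent.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical, not part of check_d2600.

# Print the integer value from one line of snmpget output, e.g.
#   SNMPv2-SMI::enterprises.232.3.2.3.1.1.4.3.1 = INTEGER: 2
# Also accepts the form "INTEGER: ok(2)" that appears when MIBs are loaded.
# Prints nothing if the line carries no INTEGER value.
parse_snmp_int() {
    local line=$1
    local re='INTEGER: [^0-9]*([0-9]+)'
    if [[ $line =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    fi
}

# Print 2 (ok) if every argument is 2, otherwise the first value that is not.
# An empty value (failed query) is reported as 1 (other).
first_bad_condition() {
    local c
    for c in "$@"; do
        if [[ $c != 2 ]]; then
            printf '%s\n' "${c:-1}"
            return
        fi
    done
    printf '2\n'
}

# Canned sample lines standing in for one snmpget call per logical drive.
lines=(
    'SNMPv2-SMI::enterprises.232.3.2.3.1.1.4.3.1 = INTEGER: 2'
    'SNMPv2-SMI::enterprises.232.3.2.3.1.1.4.3.2 = INTEGER: 3'
)
conds=()
for l in "${lines[@]}"; do
    conds+=("$(parse_snmp_int "$l")")
done
echo "overall drive condition: $(first_bad_condition "${conds[@]}")"
```

In a real check, each canned line would be replaced by an snmpget call against `.1.3.6.1.4.1.232.3.2.3.1.1.4.$LogArray.<drive>`, using the drive numbers that exist on your controller.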

Nagios State Info ext. Storage

Service state information: everything OK
OK: Storage "D2600 SAS AJ940A" is ok
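
Nagios takes the service state from the plugin's exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The script exits with 2 for anything that is not ok. If a finer mapping is wanted, something like the following sketch could be used. The function name `cond_to_nagios` is hypothetical, and the value 4 = failed is an assumption: the script's comments list only conditions 1 to 3.

```shell
#!/bin/bash
# Sketch only: cond_to_nagios is a hypothetical helper, not part of check_d2600.

# Map a condition value to "<STATE> <exit code>".
#   2 = ok       -> OK (0)
#   3 = degraded -> WARNING (1)
#   4 = failed   -> CRITICAL (2)   (assumed; not listed in the script)
#   anything else, including 1 = other or an empty value -> UNKNOWN (3)
cond_to_nagios() {
    case $1 in
        2) echo "OK 0" ;;
        3) echo "WARNING 1" ;;
        4) echo "CRITICAL 2" ;;
        *) echo "UNKNOWN 3" ;;
    esac
}

# Example: a degraded enclosure would be reported as WARNING.
read -r state code <<< "$(cond_to_nagios 3)"
echo "$state: storage condition is 3 (plugin would exit $code)"
```

Whether a degraded array should be a WARNING or a CRITICAL depends on how you want to be alerted, so treat the mapping as a starting point.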

Nagios check ext. Storage failure

Error: One power supply failed:

ERROR: PowerSupply Error  External Storage Enclosure Power Supply Failure (Power Supply 1, Box 1, Port 1E, Slot 4
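
The closing parenthesis is missing in this message because the script splits the health log on ")". One possible alternative, sketched below, reads the log line by line and keeps the text between the double quotes of the last matching entry. The function name `last_log_entry` is hypothetical, and the sample input is made up to resemble the log text above, including the OID index.

```shell
#!/bin/bash
# Sketch only: last_log_entry is a hypothetical helper, not part of check_d2600.

# Read snmpwalk output on stdin and print the quoted text of the last line
# that mentions the external storage enclosure.
last_log_entry() {
    local line last=""
    while IFS= read -r line; do
        [[ $line == *"External Storage Enclosure"* ]] && last=$line
    done
    last=${last#*\"}    # drop everything up to the first double quote
    last=${last%\"*}    # drop the closing double quote and anything after it
    printf '%s\n' "$last"
}

# Made-up sample input; real entries come from
# snmpwalk ... .1.3.6.1.4.1.232.6.2.11.3.1.8
sample='SNMPv2-SMI::enterprises.232.6.2.11.3.1.8.0.1 = STRING: "External Storage Enclosure Power Supply Failure (Power Supply 1, Box 1, Port 1E, Slot 4)"'
printf '%s\n' "$sample" | last_log_entry
```

This avoids changing IFS and keeps the message intact, at the cost of assuming that each log entry sits on a single line.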
