I've had this idea rolling around for a while and finally have some time to work on it. This community is the perfect place to post it for expansion, correction, improvement, etc.
The idea is a pre/post change script that can be run by whatever method you prefer (manually from the CLI, via CDT, etc.). It would run various checks on the system before the change and compare the results afterwards. It will need some logic to help the engineer spot actual or potential issues, which is the part I haven't started working on.
Phase 1 is identifying the useful information we need to look at and collecting it; see checkup.sh below. It calls the ping-gateways.sh script, which I will post below as well. I still need to extend the list of checks to include CoreXL information, be VSX aware, etc.
The main thing I'm hoping to get for this phase is ideas for what to look for and the best way to collect it in a format that will make phase 2 easier.
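One option for a phase 2 friendly format is to write each command's output to its own file instead of one big log. The sketch below is only an illustration of that idea: OUTDIR, safe_name and collect_to_file are hypothetical names that do not exist in checkup.sh, and the default path is an assumption.

```shell
#!/bin/bash
# Sketch only: OUTDIR, safe_name and collect_to_file are hypothetical names.
# The default directory is an assumption; point OUTDIR wherever you like.
OUTDIR="${OUTDIR:-/var/log/tmp/checkup-pre}"

# Turn a command line into a safe file name,
# e.g. "cphaprob -a if" becomes "cphaprob_-a_if"
safe_name () {
    printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# Run one command and store stdout and stderr in its own file under $OUTDIR.
# $1 is deliberately unquoted so the shell splits it into command and arguments.
collect_to_file () {
    mkdir -p "$OUTDIR" || return 1
    $1 >"$OUTDIR/$(safe_name "$1").txt" 2>&1
}
```

Run with one OUTDIR before the change and another after it, and each check ends up as a pair of small files with matching names, which should be easier to compare than two long logs.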
Phase 2 is manipulating the data and comparing pre/post and presenting it in a decent manner to help speed up and improve the validation efforts. (Not started)
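As a starting point for phase 2, a plain diff of the pre and post logs might already be useful. This is a minimal sketch, not a finished design: compare_logs is a hypothetical function name, and it assumes the log layout produced by checkup.sh, where the banner lines that start with 13 hash marks and a space carry the date and time and would otherwise always show up as differences.

```shell
#!/bin/bash
# Sketch only: compare_logs is a hypothetical helper, not part of checkup.sh.
# Usage: compare_logs PRE_LOG POST_LOG
compare_logs () {
    pre_clean=$(mktemp) || return 2
    post_clean=$(mktemp) || return 2

    # Drop the banner lines (title, date, time) because they always differ
    grep -v '^############# ' "$1" >"$pre_clean"
    grep -v '^############# ' "$2" >"$post_clean"

    if diff "$pre_clean" "$post_clean"; then
        echo "No differences found"
        rc=0
    else
        echo "Differences found: lines marked < are pre-change, > are post-change"
        rc=1
    fi

    rm -f "$pre_clean" "$post_clean"
    return $rc
}
```

Counters such as the fwaccel stats and ethtool output will nearly always change between runs, so a real comparison would probably need per-check logic on top of this.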
#!/bin/bash
#
# checkup.sh
#
# Initially created by Ivan Moore
#
# Pre/Post Change Validation script
#
# 1.0.0 Initial version which just collects data
#
VERSION=1.0.0
MYNAME=`hostname`
date=`date '+%d%b%Y'`
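# Use a timestamp without colons or spaces so it is safe in the log file name
# (%X can expand to something like "02:15:30 PM" depending on the locale)
time=`date '+%H%M%S'`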
. /etc/profile.d/CP.sh
TLOG="/var/log/tmp/$MYNAME/$MYNAME-$time.txt"
USAGE='Usage:\tcheckup.sh\n '
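# Run one command and append its output, under a banner, to the log file
collect () {
echo "Running $1"
echo "#################" >>"$TLOG"
echo "Running $1" >>"$TLOG"
echo "#################" >>"$TLOG"
echo "#" >>"$TLOG"
echo " " >>"$TLOG"
# $1 is deliberately unquoted so the shell splits it into command and arguments;
# stderr is captured too so failed commands are visible in the log
$1 >>"$TLOG" 2>&1
echo " " >>"$TLOG"
}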
#######
#
echo "Setting up some temp space and our log file"
#
mkdir -p "/var/log/tmp/$MYNAME" >/dev/null 2>&1
# Stop here if the working directory is not usable
cd "/var/log/tmp/$MYNAME" || exit 1
echo "############# Validation Log ############" >$TLOG
echo "############# $date ############" >>$TLOG
echo "############# $time ############" >>$TLOG
echo "#########################################" >>$TLOG
echo "" >>$TLOG
######
#
#
#
collect "cphaprob -a if"
collect "cphaprob -l list"
collect "cphaprob stat"
collect "fwaccel stats"
collect "fwaccel stats -s"
collect "fwaccel stat"
collect "ping-gateways.sh"
collect "netstat -i"
dmesg | tail >>$TLOG
echo "Running ethtool and checking bonds"
for IF in `ifconfig -a | grep HWaddr | awk '{print $1}' | grep -v bond`
do
echo -n "$IF: ">>$TLOG;
ethtool -S "$IF" >>$TLOG
done
for i in `ifconfig | grep bond | awk '{ print $1 }' | grep -v "\."`; do
cphaconf show_bond "$i" >>$TLOG
done
collect "cpview -p"
#######
echo " "
echo " "
echo "Some bits to look at."
echo " "
echo "### Interface Errors - If Any"
echo " "
netstat -i | grep -v "\." | awk '{ print $1 "\t\t" $5 "\t" $6 "\t" $7 "\t" $9 "\t" $10 "\t" $11 }'
echo " "
echo "Cluster State: " `cphaprob stat | grep "(local)" | awk '{ print $5 }'`
echo "Current # of Connections: " `fw tab -t connections -s | grep "connections" | awk '{ print $4 }'`
echo " "
echo "Full log file can be found here:"
echo " $TLOG"
echo " "
#!/bin/bash
#
# ping-gateways.sh
#
# [Ivan Moore]
#
# Determine all next hop gateways from routing table and ping
# to make sure we can reach them. If no ping response is received
# check ARP table to see if we have L2 just in case the device is
# not allowed to respond.
#
# 11/19/2015
VERSION=1.0.0
# Program name: ping script to ping next-hop addresses
date
ip route | awk '/via/ { print $3 }' | sort -u | while read output
do
if ping -c 1 -w 1 "$output" > /dev/null 2>&1; then
echo "node $output is up"
else
echo " "
echo "node $output is down"
echo "Checking for an ARP entry in case PING is disabled:"
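# Run arp in the background and kill it after 20 seconds in case it hangs;
# -n skips reverse DNS lookups, a common cause of a slow arp
arp -an "$output" &
TASK_PID=$!
sleep 20
kill $TASK_PID >/dev/null 2>&1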
echo " "
fi
done
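To make the next-hop extraction in ping-gateways.sh easier to test off-box, it could be pulled into a small function that reads "ip route" style text on stdin. This is just a sketch: next_hops is a hypothetical name, and it looks for the word after "via" rather than assuming it is always the third field, which is a small variation on the script above.

```shell
#!/bin/bash
# Sketch only: next_hops is a hypothetical helper, not part of ping-gateways.sh.
# Reads "ip route" style output on stdin and prints each unique next hop once.
next_hops () {
    awk '{ for (i = 1; i < NF; i++) if ($i == "via") print $(i + 1) }' | sort -u
}

# Usage on a live system:
#   ip route | next_hops
```

Directly connected routes have no "via" keyword, so they are skipped, and a gateway used by several routes is only printed (and pinged) once.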