Update: After reading this article, consider reading the update to version 1.2 as well.
I’ve just updated the check_apache2.sh script, which I released several days ago. It now supports optional warning/critical thresholds, so requests per second can be monitored more accurately.
Changelog
- Options -wr and -cr to set warning/critical thresholds for requests per second
- Slightly altered the perfdata output so that uptime now only shows up when the --extended-info/-e option is used
- Perfdata output without --extended-info/-e now only shows requests per second, CPU load, idle and busy workers
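As a rough illustration of how the -wr and -cr thresholds translate into Nagios states, here is a minimal sketch. The helper name classify_req is hypothetical and not part of check_apache2.sh; it only mirrors the comparison the script performs, assuming integer values for requests per second.

```shell
#!/bin/sh
# Sketch only: classify_req is a hypothetical helper, not part of check_apache2.sh.
# Usage: classify_req <req_per_sec> <warning> <critical>
# Prints the Nagios state and returns the matching exit code (0, 1, 2 or 3).
classify_req() {
    req=$1
    warn=$2
    crit=$3

    # The warning threshold must not be higher than the critical one.
    if [ "$warn" -gt "$crit" ]; then
        echo "UNKNOWN"
        return 3
    fi

    if [ "$req" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$req" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    fi

    echo "OK"
    return 0
}

classify_req 12 50 100     # prints OK
classify_req 75 50 100     # prints WARNING
classify_req 140 50 100    # prints CRITICAL
```

With the real plugin, the equivalent call would be something like check_apache2.sh -wr 50 -cr 100; the values 50 and 100 are made up for the example.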
I haven’t removed the total request/uptime perfdata yet, since I’m still thinking about using them in some way. I just don’t know how at the moment. If I can’t come up with a more appropriate use for them, I’ll remove them in the next update. That update will also bring options for warning/critical thresholds for idle and busy workers (which is why you have to use the cryptic -wr and -cr options rather than just -w and -c) and some code optimizations (this includes using more ; though).
One more note: I’m always interested in your opinion. If something didn’t work for you, or you’ve stumbled upon something that could have been done better, I’d be glad if you let me know so that I can improve the script.
The script
You can download the script from NagiosExchange or copy and paste the listing below.
#!/bin/sh

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

PROGNAME=`basename $0`
VERSION="Version 1.1,"
AUTHOR="2009, Mike Adolphs (http://www.matejunkie.com/)"

print_version() {
    echo "$VERSION $AUTHOR"
}

print_help() {
    print_version $PROGNAME $VERSION
    echo ""
    echo "Description"
    echo "$PROGNAME is a Nagios plugin to check the Apache's server status"
    echo "via the famous mod_status. Make sure, that you have permissions"
    echo "to access http://yourserver:port/server-status and that the Nagios"
    echo "User has the right to call 'ps ax' and 'ps -Ao pcpu,args', otherwise"
    echo "the script refuses to work. This will likely occur only on hardened"
    echo "systems."
    echo ""
    echo "How it works:"
    echo "The script first checks whether Apache is running and then gets the"
    echo "server's status page. Req/sec are generated by a diff from the amount"
    echo "of total requests being served by the Apache due to the fact that"
    echo "mod_status only provides an average value for the server's runtime."
    echo "The same applies to the CPU utilization value. Therefore we use ps"
    echo "to get more realistic data."
    echo ""
    echo "The script itself is written sh-compliant and free software under"
    echo "the terms of the GPLv2. There's a default value for each variable. It"
    echo "probably works just out of the box?"
    echo ""
    echo "$PROGNAME -H/--hostname localhost -p/--port 80 -t/--timeout 3 -o \$HOME"
    echo "-b /usr/sbin -e"
    echo ""
    echo "Options:"
    echo " -b/--binary-dir)"
    echo "   You might need to choose the directory where your apache2 binary is"
    echo "   located. Otherwise the check will fail. Default is: /usr/sbin"
    echo " -H/--hostname)"
    echo "   You may define a hostname. Default is: localhost"
    echo " -p/--port)"
    echo "   You may define a port. Default is: 80"
    echo " -t/--timeout)"
    echo "   You might want to define a timeout (in sec) for the wget call. Default"
    echo "   is: 3"
    echo " -o/--output-directory)"
    echo "   You may define a directory where a local copy of server-status"
    echo "   is being stored to spare the Apache. Default is: $HOME"
    echo " -e/--extended-info)"
    echo "   Provides additional performance data: Total amount of requests, total"
    echo "   amount of transferred bytes, bytes per second, bytes per request and"
    echo "   the server's uptime. Default is off."
    echo " -wr/--warning-req)"
    echo "   Sets a warning level for requests per second and must be used together"
    echo "   with the -cr/--critical-req option. Default is off."
    echo " -cr/--critical-req)"
    echo "   Sets a critical level for requests per second and must be used together"
    echo "   with the -wr/--warning-req option. Default is off."
    exit $ST_UK
}

# Nagios return codes
ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3

# default values
hostname="localhost"
port=80
timeout=3
output_dir=$HOME
binary_dir="/usr/sbin"
ext_info=0
running=0
wcdiff_req=0
wclvls_req=0

while test -n "$1"; do
    case "$1" in
        --help|-h)
            print_help
            exit $ST_UK
            ;;
        --version|-v)
            print_version $PROGNAME $VERSION
            exit $ST_UK
            ;;
        --hostname|-H)
            hostname=$2
            shift
            ;;
        --port|-p)
            port=$2
            shift
            ;;
        --timeout|-t)
            timeout=$2
            shift
            ;;
        # the hyphenated spelling matches the help text; the underscore
        # spelling is kept so existing command definitions keep working
        --output-directory|--output_directory|-o)
            output_dir=$2
            shift
            ;;
        --binary-dir|--binary_dir|-b)
            binary_dir=$2
            shift
            ;;
        --extended-info|-e)
            ext_info=1
            ;;
        --warning-req|-wr)
            warn_req=$2
            shift
            ;;
        --critical-req|-cr)
            crit_req=$2
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            print_help
            exit $ST_UK
            ;;
    esac
    shift
done

# get functions

# Validates the combination of -wr and -cr:
# wcdiff_req=1 warning is higher than critical
# wcdiff_req=2 warning given without critical
# wcdiff_req=3 critical given without warning
get_wcdiff_req() {
    if [ ! -z "$warn_req" -a ! -z "$crit_req" ]
    then
        wclvls_req=1

        if [ ${warn_req} -gt ${crit_req} ]
        then
            wcdiff_req=1
        fi
    elif [ ! -z "$warn_req" -a -z "$crit_req" ]
    then
        wcdiff_req=2
    elif [ -z "$warn_req" -a ! -z "$crit_req" ]
    then
        wcdiff_req=3
    fi
}

get_processes() {
    # the pattern is quoted so the shell does not expand [a] as a glob;
    # the bracket keeps grep from counting its own process
    ps ax | grep -c "${binary_dir}/[a]pache2"
}

get_status() {
    # two snapshots, one second apart, to calculate requests per second
    wget -q -t 3 -T ${timeout} "http://${hostname}:${port}/server-status?auto" -O "${output_dir}/server-status"
    sleep 1
    wget -q -t 3 -T ${timeout} "http://${hostname}:${port}/server-status?auto" -O "${output_dir}/server-status.1"
}

get_total_req() {
    total_req=`cat ${output_dir}/server-status | grep 'Total Accesses:' | awk '{print $3}'`
}

get_total_kb() {
    total_kb=`cat ${output_dir}/server-status | grep 'Total kBytes:' | awk '{print $3}'`
}

get_uptime() {
    uptime=`cat ${output_dir}/server-status | grep 'Uptime:' | awk '{print $2}'`
}

get_cpu_load() {
    # sums up the %CPU column of all Apache processes; uses $binary_dir
    # instead of a hard-coded /usr/sbin so that -b is honoured here as well
    cpu_load="$(cpu_load=0; ps -Ao pcpu,args | grep "${binary_dir}/[a]pache2" | awk '{print $1}' | while read line
    do
        cpu_load=`echo "scale=3; $cpu_load + $line" | bc -l`
        echo $cpu_load
    done)"
    cpu_load=`echo $cpu_load | awk '{print $NF}' | sed 's/^\./0./'`
}

get_req_psec() {
    tmp1_req_psec=`cat ${output_dir}/server-status | grep 'Total Accesses:' | awk '{print $3}'`
    tmp2_req_psec=`cat ${output_dir}/server-status.1 | grep 'Total Accesses:' | awk '{print $3}'`
    req_psec=`echo "scale=2; ${tmp2_req_psec} - ${tmp1_req_psec}" | bc -l`
}

get_bytes_psec() {
    bytes_psec=`cat ${output_dir}/server-status | grep 'BytesPerSec:' | awk '{print $2}' | sed 's/^\./0./'`
}

get_bytes_preq() {
    bytes_preq=`cat ${output_dir}/server-status | grep 'BytesPerReq:' | awk '{print $2}' | sed 's/^\./0./'`
}

get_wkrs_busy() {
    wkrs_busy=`cat ${output_dir}/server-status | grep 'BusyWorkers:' | awk '{print $2}'`
}

get_wkrs_idle() {
    wkrs_idle=`cat ${output_dir}/server-status | grep 'IdleWorkers:' | awk '{print $2}'`
}

# check functions

check_processes() {
    if [ $1 -lt 1 ]
    then
        echo "UNKNOWN - Your Apache server seems not to run. Is your Nagios privileged to run 'ps ax' and is the Apache2 binary really located in $binary_dir?"
        exit $ST_UK
    fi
}

check_output() {
    stat_output=`stat -c %s ${output_dir}/server-status`
    if [ "$stat_output" = 0 ]
    then
        echo "UNKNOWN - Local copy of server-status is empty. Are we allowed to access http://${hostname}:${port}/server-status?"
        exit $ST_UK
    fi
}

# Let's do this

get_wcdiff_req
if [ "$wcdiff_req" = 1 ]
then
    echo "Please adjust your warning/critical thresholds. The warning must be lower than the critical level!"
    exit $ST_UK
elif [ "$wcdiff_req" = 2 ]
then
    echo "Please also set a critical value when you want to use warning/critical thresholds!"
    exit $ST_UK
elif [ "$wcdiff_req" = 3 ]
then
    echo "Please also set a warning value when you want to use warning/critical thresholds!"
    exit $ST_UK
else
    running=`get_processes`
    check_processes $running
    get_status
    check_output

    case $ext_info in
        1)
            get_req_psec; get_cpu_load; get_uptime; get_wkrs_busy; get_wkrs_idle; get_total_req; get_total_kb; get_bytes_psec; get_bytes_preq;
            perfdata="'req_psec'=$req_psec 'cpu_load'=$cpu_load 'uptime'=$uptime 'workers_busy'=$wkrs_busy 'workers_idle'=$wkrs_idle 'total_req'=$total_req 'total_kb'=$total_kb 'bytes_psec'=$bytes_psec 'bytes_preq'=$bytes_preq"
            ;;
        *)
            # uptime is still fetched because the status message mentions it;
            # it is only left out of the perfdata here
            get_req_psec; get_cpu_load; get_uptime; get_wkrs_busy; get_wkrs_idle;
            perfdata="'req_psec'=$req_psec 'cpu_load'=$cpu_load 'workers_busy'=$wkrs_busy 'workers_idle'=$wkrs_idle"
            ;;
    esac

    if [ ${wclvls_req} = 1 ]
    then
        if [ ${req_psec} -ge ${warn_req} -a ${req_psec} -lt ${crit_req} ]
        then
            echo "WARNING - Apache serves $req_psec Requests per second with an average CPU utilization of $cpu_load% since $uptime seconds. Amount of workers currently busy: $wkrs_busy, currently idle: $wkrs_idle! | $perfdata"
            exit $ST_WR
        elif [ ${req_psec} -ge ${crit_req} ]
        then
            echo "CRITICAL - Apache serves $req_psec Requests per second with an average CPU utilization of $cpu_load% since $uptime seconds. Amount of workers currently busy: $wkrs_busy, currently idle: $wkrs_idle! | $perfdata"
            exit $ST_CR
        else
            echo "OK - Apache serves $req_psec Requests per second with an average CPU utilization of $cpu_load% since $uptime seconds. Amount of workers currently busy: $wkrs_busy, currently idle: $wkrs_idle! | $perfdata"
            exit $ST_OK
        fi
    else
        echo "OK - Apache serves $req_psec Requests per second with an average CPU utilization of $cpu_load% since $uptime seconds. Amount of workers currently busy: $wkrs_busy, currently idle: $wkrs_idle! | $perfdata"
        exit $ST_OK
    fi
fi
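The help text describes requests per second as the difference between two readings of the Total Accesses counter that mod_status reports. The following is a minimal, standalone sketch of that idea. The helper names total_accesses and req_per_sec and the sample snapshot files are made up for illustration, and it uses plain shell arithmetic rather than the bc call the plugin relies on.

```shell
#!/bin/sh
# Sketch only: total_accesses and req_per_sec are hypothetical helpers.

# Pulls the "Total Accesses" counter out of a saved server-status?auto snapshot.
total_accesses() {
    awk '/^Total Accesses:/ {print $3}' "$1"
}

# Usage: req_per_sec <first_snapshot> <second_snapshot> [interval_in_seconds]
# Divides the counter difference by the time between both snapshots.
req_per_sec() {
    first=$(total_accesses "$1")
    second=$(total_accesses "$2")
    interval=${3:-1}
    echo $(( (second - first) / interval ))
}

# Demo with two fabricated snapshots taken one second apart.
tmpdir=$(mktemp -d)
printf 'Total Accesses: 1000\nUptime: 50\n' > "$tmpdir/server-status"
printf 'Total Accesses: 1042\nUptime: 51\n' > "$tmpdir/server-status.1"
req_per_sec "$tmpdir/server-status" "$tmpdir/server-status.1" 1    # prints 42
rm -rf "$tmpdir"
```

Because the result is an integer, it can be compared directly with -ge and -lt in a test expression, which is what the threshold check needs.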
The License
As always, this little script is meant to be sh-compliant and is released under the terms of the GPL Version 2 only. Feel free to subscribe via RSS to get updates on this one. Options for warning/critical levels for busy/idle workers will be included in the near future.