I needed a script that checks:
- whether a cronjob ran at its scheduled time (that is, whether it runs regularly)
- whether the cronjob ran successfully
It should also be monitorable via Nagios. That means it should produce meaningful output when something goes wrong with the cronjob, e.g. “CRITICAL: cronjob did not run!”
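For context, Nagios judges a plugin by its exit code (0 is OK, 1 is WARNING, 2 is CRITICAL, 3 is UNKNOWN) and displays the first line of its output. A minimal sketch of that convention follows. The helper names `format_status` and `report` are made up for illustration and are not part of any Nagios library.

```python
import sys

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_status(code, message):
    """Build the single status line that Nagios displays (hypothetical helper)."""
    return "{0}: {1}".format(LABELS[code], message)


def report(code, message):
    """Print the status line and exit with the matching return code."""
    print(format_status(code, message))
    sys.exit(code)
```

Calling `report(CRITICAL, "cronjob did not run!")` would print the example message above and exit with status 2.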
Of course, someone had already had the same requirements and built a Perl script for it, which in turn can be used as an NRPE plugin.
For this, the cronjob to be monitored has to be edited so that it logs all of its output and its exit code to a logfile. The script then performs the checks mentioned above. You can find a more detailed explanation here.
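One way to make a cronjob log its output and exit code is a small wrapper that cron calls instead of the job itself. The sketch below is an assumption on my part, not the format the Perl script expects: the function name `run_and_log` is hypothetical, and the trailing `exit N` line is chosen only because it is easy to search for.

```python
import subprocess


def run_and_log(command, logfile):
    """Run command and write its output plus an "exit N" line to logfile."""
    # Capture stdout and stderr together so the log shows everything the job printed.
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    # Overwrite ("wb") instead of appending, so a stale "exit 0" from an
    # earlier run cannot hide a failure of the current run.
    with open(logfile, "wb") as log:
        log.write(output)
        log.write("exit {0}\n".format(proc.returncode).encode("ascii"))
    return proc.returncode
```

Because the file is rewritten on every run, its modification time also records when the job last ran.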
For my purposes, the Perl script has two drawbacks:
- both parameters (last execution time and exit code) are mandatory
- it is not written in Python!
To fix these things, I wrote a small script myself. It is hosted on GitHub and reproduced here (since it’s rather small):
#!/usr/bin/env python
"""
This script can check whether a file was modified within the last X minutes. If the last
modification is older than that, an error is raised. It can also check whether the file contains
the string "exit 0". If it's not there, an error is raised.
Example:
check_exit_code.py --exitcode -t 20 /path/to/logfile
"""
import os
import argparse
import sys
import time
import re

# Parse arguments
parser = argparse.ArgumentParser(
    description=__doc__,
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("logfile", help="Specify the logfile to monitor.")
parser.add_argument("-t", "--time", help="Time in minutes which a log file can be unmodified before raising CRITICAL"
                                         " alert.", type=int)
parser.add_argument("--exitcode", help="If specified, check if \"exit 0\" exists in logfile.", action="store_true")
args = parser.parse_args()
if not (args.time or args.exitcode):
    print("At least one argument (--time, --exitcode) is required.")
    sys.exit(2)

# Variables
log = args.logfile

# Check if file exists and is readable by user
try:
    f = open(log)
except IOError:
    print("Cannot open the file. Check if it exists and you have the right permissions to open it.")
    sys.exit(1)

# Check exit code
if args.exitcode:
    logfile = f.read()
    if not re.search("exit 0", logfile, flags=re.I):
        print("CRITICAL: Exit code not found in file %s" % log)
        sys.exit(2)
f.close()

# Check last file modification time
if args.time:
    # File modification time in seconds since epoch
    file_mod_time = os.stat(log).st_mtime
    # Current time in seconds since epoch
    t = time.time()
    # Time in minutes since last modification of file
    last_time = (t - file_mod_time) / 60
    if last_time > args.time:
        print("CRITICAL: {} last modified {:.2f} minutes ago. Threshold set to {:.2f} minutes".format(
            log, last_time, args.time))
        sys.exit(2)

# If nothing went wrong, print OK and exit.
print("OK. Exit code found or last modification time not exceeded.")
sys.exit(0)
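One caveat: the search above accepts "exit 0" anywhere in the file, so a log that is appended to, or job output that happens to contain that text, would still pass. A stricter variant could look only at the last non-empty line. This is a sketch under the assumption that the job writes its exit code as the final line; `last_run_succeeded` is a hypothetical name and not part of the script above.

```python
def last_run_succeeded(log_text):
    """Return True only if the last non-empty line is exactly "exit 0"."""
    # Drop blank lines and surrounding whitespace before looking at the end.
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    if not lines:
        return False
    # Case-insensitive, like the re.I flag used in the script.
    return lines[-1].lower() == "exit 0"
```

With this check, a log ending in "exit 1" fails even if an earlier line reads "exit 0".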