Monday, December 28, 2009

Fun with Nagios, part 1: timeliness of web content

I mentioned in an earlier post that I'm a big fan of the Nagios open source monitoring system. With very little work (on Linux, a couple of package installs and a bit of service configuration should do the trick) you can have a system up and running that will check all your servers for availability, disk space, memory and CPU usage and alert you when any resources exceed their limits.

It's pretty easy to add additional Nagios checks that will monitor all kinds of other useful things. I'll be making a few posts this week on that subject; today's deals with checking that your web site content is up to date.

One of my company's products is a web service that provides a speech-to-text transcription of news videos. The service pulls in videos from RSS feeds provided by the news source, runs the speech-to-text analysis and posts the results in another RSS feed (one per news source). As well as making sure that the output RSS feed is available 24x7, I wanted to check that new articles are being added each day. There are about 75 different news sources and thus output RSS feeds to check, so automation was pretty much mandatory.

This turned out to be a breeze using Nagios' trusty check_http plugin, because you can pass the output of a shell command as one of its parameters. It's standard for each article in an RSS feed to have an element containing the publishing date (pubDate in RSS 2.0); in my case its value looks like:


Mon, 28 Dec 2009 23:40:04 GMT


To check that this feed has at least one article published today, here's the check_command entry from the RSS feed server's Nagios configuration file:

check_command check_http! -u /feeds/1010 -s "`date +\"%d %b %Y\"`"

This uses the standard -s parameter, which looks for a string in the HTTP response from the URL http://feedserver.mydomain.com/feeds/1010. The backquoted "date" command supplies that string: today's date in the format used by the output RSS feed, e.g. "28 Dec 2009" in my case.
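To try the same test by hand before wiring it into Nagios, something like the following might do. It's only a sketch: the function name feed_has_today is made up for illustration, and it assumes the feed body arrives on standard input (piped from curl, say) rather than being fetched by check_http.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): succeeds if the text on stdin
# contains today's date in the "28 Dec 2009" style used by the feed.
feed_has_today() {
    # LC_ALL=C keeps the month abbreviation in English whatever the locale.
    today=$(LC_ALL=C date +"%d %b %Y")
    # -q: no output, just an exit status; -F: match the date as a fixed string.
    grep -qF "$today"
}
```

For example: curl -s http://feedserver.mydomain.com/feeds/1010 | feed_has_today && echo OK || echo STALE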

It was easy to add checks for the rest of the output feeds (which all have the same date format) by creating additional checks and just varying the feed URL.

The only other thing to watch here is that Nagios runs all its checks 24x7 by default, but the first new article of the day might not be published until some time well after midnight. To avoid getting "out of date content" alerts as soon as the date changes, you can set up a custom time period in the Nagios server's timeperiods.cfg file, e.g.

define timeperiod{
        timeperiod_name news-content
        alias           Times when content from today should be present
        sunday          8:00-24:00
        monday          3:00-24:00
        tuesday         3:00-24:00
        wednesday       3:00-24:00
        thursday        3:00-24:00
        friday          3:00-24:00
        saturday        8:00-24:00
}

This says that at least one article from today should be present at all times after 3am on weekdays and after 8am on weekends - we learned by experience that weekends are slower news days. Use the custom time period in the configuration entry for your date check:

define service{
        use                     generic-service
        host_name               feedserver
        check_period            news-content
        service_description     Content updated: feed 1010
        check_command           check_http! -u /feeds/1010 -s "`date +\"%d %b %Y\"`"
}

and you're good to go.
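With about 75 feeds, the near-identical service definitions for the remaining ones could be generated rather than typed. Here's a rough sketch of how that might look; gen_feed_checks is a made-up function name and any feed ID other than 1010 is a placeholder, while the host name, template and time period are the ones from the example above.

```shell
#!/bin/sh
# Hypothetical generator: prints one Nagios service definition per feed ID
# given as an argument. Only the feed ID varies between definitions.
gen_feed_checks() {
    for id in "$@"; do
        # In an unquoted here-document, \` yields a literal backquote and
        # \\ yields a literal backslash, so the date command is written out
        # verbatim for Nagios to run at check time, not run here.
        cat <<EOF
define service{
        use                     generic-service
        host_name               feedserver
        check_period            news-content
        service_description     Content updated: feed $id
        check_command           check_http! -u /feeds/$id -s "\`date +\\"%d %b %Y\\"\`"
}

EOF
    done
}
```

Calling gen_feed_checks 1010 1011 and redirecting the output to a file that your Nagios configuration includes would give you two checks that differ only in the feed ID (1011 being an invented example).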

More on useful Nagios checks later in the week; I'd love to hear your own favourite ones in the comments.

Tuesday, December 8, 2009

Exchanging Answers

I'm a regular user of the programming Q&A site stackoverflow.com and its companion for sysadmins, serverfault.com. Both sites have saved me a lot of time digging around the wilds of the web for answers, so I've tried to give back by answering questions whenever I can.

There are a few reasons why I like these sites: the flat organization of the questions and answers makes it really easy to narrow down the things you're looking for; the email and RSS notifications of new questions and answers are highly configurable; and - an unexpectedly good incentive for me - you can build up reputation points for asking good questions and giving good answers (as voted on by the other users of the sites).

There's now a similar Q&A site for software testers at http://testing.stackexchange.com - if Server Fault is Stack Overflow with a beard, ponytail and sandals, Testing.Stackexchange might be SO with half-moon glasses and a clipboard. I think this could be a great forum to concentrate testing information; there's a lot of test-related stuff on Stack Overflow already, but it's often buried under more development-oriented content. I'll be hanging out there regularly from now on.