Optimal Operations

Tuesday, May 10, 2011

Testing, 1-2-3

There's a new kid on the block for QA / testing question and answer sites - the SQA Stack Exchange, brought to you by the good folks behind Stack Overflow and Server Fault among others.

I hope this one gets more traction than the older testing.stackexchange site, which never got the volume of users and questions to take off. Testing questions tend to have less well-defined answers than programming ones, but there's already a good chunk of useful knowledge and opinions on the SQA Stack Exchange. Check it out and ask / answer some questions, you'll feel better !

Friday, October 8, 2010

Morning, Campers

I'm heading to the Atlassian AtlasCamp next week with three other guys from work; it should be two and a half days of fun and intensity (plus seafood and beer ;-)), learning how to extend JIRA / Fisheye / Crucible and meeting lots of developers from Atlassian and external plugin companies. I'll report back on how it goes in another post.

We're trying to extend JIRA in quite a few ways at work, so I'm hoping to find answers to a few things (and ideally apply the "lazy engineer" principle of stealing code that someone has already written rather than doing it myself !). The main ones at this point are:

Using JIRA for test case management (TCM)
Getting build analytics out of JIRA, Fisheye and Perforce that will show all the changes that went into a release, without having to drill down into individual JIRA issues

Thursday, July 29, 2010

Migrating a Hudson instance

Quick post to hopefully help others with an error I got when moving our Hudson instance from Windows to its new home on a Linux server.

The basic migration is super easy - just zip up the Hudson home directory (default on Windows XP is C:\Documents and Settings\[username running Hudson]\.hudson) and restore it onto the new server. However, when you start your new Hudson instance you may see one of both of the following errors in the console output:

SEVERE: Timer task hudson.model.LoadStatistics$LoadStatisticsUpdater@74e8f8c5 failed
java.lang.AssertionError: class hudson.node_monitors.DiskSpaceMonitor is missing its descriptor
at hudson.model.Hudson.getDescriptorOrDie(Hudson.java:937)
at hudson.node_monitors.NodeMonitor.getDescriptor(NodeMonitor.java:83)
at hudson.node_monitors.NodeMonitor.getDescriptor(NodeMonitor.java:67)
at hudson.util.DescribableList.get(DescribableList.java:104)
at hudson.model.ComputerSet.(ComputerSet.java:356)
at hudson.model.LoadStatistics$LoadStatisticsUpdater.doRun(LoadStatistics.java:214)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)

Jul 29, 2010 1:47:39 PM hudson.triggers.SafeTimerTask run
SEVERE: Timer task hudson.model.LoadStatistics$LoadStatisticsUpdater@74e8f8c5 failed
java.lang.NoClassDefFoundError
at hudson.model.LoadStatistics$LoadStatisticsUpdater.doRun(LoadStatistics.java:214)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:54)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)

The fix is easy - just stop Hudson, delete the file nodeMonitors.xml in the main Hudson directory, and restart Hudson. It looks as if that file contains data that's specific to the old system; Hudson will recreate the file after restart.

Wednesday, July 7, 2010

Hudson as a CVS Watcher on Windows

I'm currently consulting at a large corporate outfit in Silicon Valley - sometimes it's a bit like going back in time to the mid-90s. They're using CVS and most developers don't have much visibility into the repository, so one of the first things I set up was an email to interested people every time somebody checks into CVS.

Hudson was the perfect choice for this since I want to move on to continuous build, test and deployment eventually, but it doesn't hurt to start with a baby step. The surprise on the faces of people who haven't come across Hudson or CI tools in general is something to behold !

I had to run Hudson on a Windows machine, which came with its own little set of issues, so here's a step by step for future reference.

Install a Windows CVS command line client. This proved a little hard to track down as the main cvsnt.org site doesn't seem to have any free downloads any more. I eventually found the CVSNT client bundled with the CvsGui project at http://www.wincvs.org/ - it comes with a separate installer, cvsnt_steup.exe, that lets you install just the client utilities.
Install and run Hudson - just get the WAR file from the download page, save it locally and run "java -jar hudson.war"
Set up Hudson as a Windows service - in typical easy-to-use Hudson fashion, you do this from within Hudson itself by going to the "Manage Hudson" page and selecting "Install as Windows service".
Make sure that the Hudson service will run as a user that's already set up to access the CVS repository - for example, if your repository access is via extssh, you'll need to make sure that the user already has the SSH host connection info saved locally, so that you don't get prompts about saving the connection info when Hudson tries to do a CVS update. On Windows XP (yep, my corporate paymaster is still on XP), right-click My Computer in the Start menu, click Manage, then Services under Services and Applications. Right click the Hudson service, select Properties, then enter the user name and Windows password in the Log On tab.
Restart your machine - I found that this was the only reliable way to get Hudson to come back up after making changes in steps 3 and 4.

You should now be good to go and set up your Hudson jobs to poll the CVS repository and report any changes via email; Hudson should be running at http://localhost:8080 by default.

For extra email configurability I used the email-ext plugin; this lets you send email for all successful builds (not just failures and fixes like the default Hudson behaviour) and include all kinds of info in the email body, such as a list of file changes in configurable format.

More to follow as I set up the build and test stuff; we have a client written in Adobe flex talking to a SOAP API with Tibco on the back end, so there should be some, ahem, interesting challenges there ....

Friday, June 18, 2010

Location Irrelevance

Warning: this is a bit more political than my usual tech-heavy posts.

I was leafing through James Bach's blog the other day; James is one of today's leading writers about software testing and always well worth a read. He referred to an excellent post by Pradeep Soundararajan, one of my other favourite testing thinkers and writers, which really nailed some important test patterns that I've come across the need for frequently, but never figured out how to summarize. I highly recommend that you read Pradeep's post before you carry on.

At the end of his post, James referred to Pradeep as "one of the leading Indian testers". This made me feel a bit uncomfortable, enough so to comment on the post, and James commented back:

"I think culture is relevant, and nationality often associates to culture. There is a distinctive Indian testing sub-culture. I also think there is an American testing culture, too. I wouldn't mind being called an American tester."

I can't agree with that at all. Software testing expertise shouldn't be about culture, location or nationality at all, unless you're in the really specialized test areas of localization or internationalization. Given the frequently negative connotation that badly-handled outsourcing projects have given to software professionals outside of North America and Western Europe, I think it does no good at all to classify anyone by nationality or location - or sex, musical taste, number of prehensile toes, or anything else other than ability - when discussing their professional achievements.

One of the great things about the internet is that it's levelled the global playing field for writing, testing and using computer software to a massive extent. Let's keep it that way, recognize the achievements of software professionals all over the world for what they are, and call a leading tester a leading tester, without confining them to some largely meaningless subcategory.

Thursday, June 10, 2010

The Tech Ops Nazi

I attended the excellent Atlassian Starter Day on Wednesday; it was a great session with many highlights (including a surprise appearance by Tom Cruise !) and worthy of multiple posts.

With DevOps Days USA fast approaching (I'm on one of the panels), it was interesting to hear multiple speakers at Starter Day talk about devops concepts. One of the highlights for me was Jochen Frey, Scout Labs' CTO, talking about how to run an effective startup engineering team (and how to mess it up). He seemed pretty sleep-deprived, citing Scout Labs' recent acquisition by Lithium Technologies as the reason, but got one of the biggest rounds of applause of the day for soldiering through to the end.

Jochen especially got my attention when he described the importance of having a "tech ops nazi" on your team. This is a kind of QA / program manager / IT ops hybrid person who essentially acts (alone or with their team) as a buffer between the developers and the deployed code, checking multiple criteria before new code is deployed:

the code builds successfully
all the expected changes are included in each build
all the tests pass on the deployment platform
the deployment configuration is standardized

A lot of this can and should be automated - see Eric Ries' excellent "Continuous Deployment in 5 Easy Steps" for some ideas here - but there's no substitute for a human with one foot in development and the other in deployment to deal with the edge cases that always come up.

I've been doing this kind of role myself for the last few years and really enjoy it, so it was nice to get some validation. That said, I'd rather think of myself as a devops enabler than a tech ops nazi !

Friday, April 2, 2010

EC2 Marks the Spot

I've been using Amazon EC2 at work and for personal projects for about three years now. It's got to the stage where I can't remember how we used to manage without being able to spin up a test or development machine on demand.

One downside of EC2 compared to other hosting options is the price. The cheapest on-demand Linux instance costs 8.5 cents per hour, which works out to just over $60 a month - a bit expensive if you just want to run something like a web server and don't need full sytem access. However, Amazon just came up with a way to cut this cost significantly by introducing Spot Instances.

You can read all the details by following the link, but the basic idea is that you bid on spare EC2 capacity by specifying the maximum price per hour that you're prepared to pay. If the current spot price (which varies continually based on supply and demand) is less than your bid, Amazon will start up your instance and keep it running until the spot price becomes higher than your bid. At that point your instance will be terminated.

The surprising thing I found after monitoring the small Linux instance spot price for a few days is that it's a LOT less than the on-demand price - it's stayed between 2.9c and 3.1c per hour. That means that if you bid, say, 5 cents per hour for your spot instance, you'll be pretty certain of getting a cheap, long-running instance unless there is a sudden spike in demand and the spot price goes over 5c / hour.

The only down side I've found so far, apart from the necessity to be prepared for the instance shutting down without warning (which you ought to do for all EC2 instances anyway) is that you seem to have to wait a while longer for your instance to be started up; the ones I tried took one or two hours, rather than the usual 15 minutes or so.

As a way to warn me if the EC2 spot price is getting close to my bid price, I wrote a Nagios plugin, check_ec2_spot_price, that will send me a warning if the spot price goes above a specified value. You can download it from the Nagios Exchange.

I'm also working on a Munin plugin that will graph the EC2 spot price over time; I'll post that here too when it's ready.

Optimal Operations

Tuesday, May 10, 2011

Testing, 1-2-3

Friday, October 8, 2010

Morning, Campers

Thursday, July 29, 2010

Migrating a Hudson instance

Wednesday, July 7, 2010

Hudson as a CVS Watcher on Windows

Friday, June 18, 2010

Location Irrelevance

Thursday, June 10, 2010

The Tech Ops Nazi

Friday, April 2, 2010

EC2 Marks the Spot

About Me

Talks I Did in 2013

Subscribe To Optimal Operations

Blog Archive

My Blog List

Stack Overflow flair

Server Fault flair