The Code Dump

A place a coder rants at...

Testing Techniques: Managing External Resources

A friend approached me with one of the well-known problems in the testing world: how do you keep external resources under a test harness? Having heard the question a few times before, I thought I’d share my thoughts and, mainly, put together the common advice that drifts around the web.

The Dilemma

Nowadays, it’s hard to write more than 100 lines of code before adding an external resource. It might be a web service that manages something, some convoluted API to receive data from, or just about anything else. Writing tests that exercise such code against the real resources is usually very problematic, for numerous reasons:

  • It significantly slows the tests, because it requires network access and processing on the service’s side.

  • It might cost you money, send emails, tweet stuff and do things you’d rather not do 300 times a day as you run your tests.

  • Making your code handle error conditions with the service is hard or impossible, as you can’t control when those occur.

Basically, all of these factors usually add up to you having crappy tests that you rarely run. That sucks.

Decouple & Isolate

The best solution I’m aware of is simply isolating the thing. We usually strive to wrap whatever service we’re using with a single-point interface. The decoupling is great since I’ve yet to encounter a service with an API that matched my thinking of the domain problem. Wrapping it up allows us to keep using our own language and logic throughout the system.

A benefit of that is we now have a simple interface or facade to stub or mock out during tests. That’s usually relatively easy, and it allows us to run our tests blazingly fast and test all those hard-to-reach corner cases.
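To make the idea concrete, here is a minimal Python sketch of a facade and its stub. Everything in it is an assumption made for illustration: the names EmailGateway, StubEmailGateway and register_user are hypothetical and do not belong to any real service.

```python
class EmailGateway:
    """Single-point facade over a hypothetical email service."""

    def send_welcome(self, address: str) -> bool:
        # The production subclass would talk to the real service here.
        raise NotImplementedError


class StubEmailGateway(EmailGateway):
    """Test double: records calls and can simulate an outage on demand."""

    def __init__(self, fail: bool = False):
        self.fail = fail
        self.sent = []

    def send_welcome(self, address: str) -> bool:
        if self.fail:
            raise ConnectionError("simulated outage")
        self.sent.append(address)
        return True


def register_user(address: str, gateway: EmailGateway) -> str:
    """Domain code only knows the facade, never the service itself."""
    try:
        gateway.send_welcome(address)
        return "registered"
    except ConnectionError:
        # The hard-to-reach corner case is now trivial to trigger in a test.
        return "registered, email pending"
```

Passing StubEmailGateway(fail=True) into register_user exercises the error path without touching the network, which is exactly the kind of case that is hard to provoke against the real service.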

But what if the service changes?

That’s the finishing touch. You should still maintain a suite of tests that run against the real service. Those should be the plain tests that make sure you’re using the API right and that would break if anything you’re relying on changes. These tests won’t be part of your regular suite that gets run constantly. Instead have your CI server run them daily/weekly and let you know when something changes.
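One way to keep such tests out of the regular run, sketched in Python with the standard unittest module. The RUN_LIVE_TESTS variable name is an assumption, and the test body is a placeholder standing in for a real call through your facade.

```python
import os
import unittest


def live_tests_enabled(environ=os.environ) -> bool:
    """True only when the (hypothetical) RUN_LIVE_TESTS switch is set to 1."""
    return environ.get("RUN_LIVE_TESTS") == "1"


@unittest.skipUnless(live_tests_enabled(), "live tests run only on the scheduled CI job")
class LiveServiceContractTest(unittest.TestCase):
    def test_api_contract(self):
        # Placeholder: a real test would call the actual service through
        # the facade and assert on the fields the rest of the code relies on.
        response = {"status": "ok"}
        self.assertEqual(response["status"], "ok")
```

The daily or weekly CI job would set RUN_LIVE_TESTS=1, while the suite you run constantly leaves it unset and skips the class.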

This puts us in a win-win situation: we can run our tests quickly, yet still have the assurance that we won’t miss API changes and the like.

Happy testing!

You should subscribe to my feed and follow me on twitter!

Design is Simpler Now: Embrace the Extract

For the past 5 years or so I’ve been searching for ways to produce better designed code. I hate the fact I basically can’t put my finger on why certain designs aren’t as good as others.

That’s why I was really blown away when I first learned about the SOLID principles and started practicing TDD. At last I have found rules that gave me the capability to weigh designs, and a process that helped push me towards what feels like better code.

But even 5 rules were too much for me!

SOLID, no doubt, drives better design. My problem was incorporating it naturally into my everyday coding. Call me dumb, but I just can’t bring myself to contemplate 5 different aspects whenever I whip up a class. I still find it an excellent checklist to go through when I’m considering refactorings, but thinking about it constantly just drained a big part of my concentration.

For a few months now I’ve been getting the feeling that my OOD toolset has reduced quite a lot to the very essence. That feeling was also magnified by reading GOOS and pretty much everything written by J. B. Rainsberger here and here.

The first tool I use heavily (and I mean heavily, my mind has managed to get OCD about it) is removing duplication - or DRY. This tool alone makes any codebase an order of magnitude better. I’ve written plenty about DRY before.

But, just yesterday I realized that other than that, I mainly concentrate on one thing, as I contemplated on twitter:

I think I can sum up all my OOD skills with “wait, shouldn’t this be in a different class/method?” Wondering if that’s a good thing…

Yup, that’s the trick. I was quickly assured by two amazing guys that have been doing this longer than I’ve been breathing, agile manifesto authors:

Ron Jeffries: Yes it is a good thing. I would suspect you also note duplication?

James Grenning: Think of the alternative.. you are asking the right question

You see that? Noticing duplication and moving stuff somewhere else. That’s all there is to it. This simple question points you toward the Single Responsibility Principle and, along with DRY, covers most of the bases needed to adhere to the elements of simple design.

The main question I ask myself now every time I think of a problem, start changing a function, write a test, and at just about any time I’m coding is “is this the right place for this?” And quite often the answer is “no.” Push this forward and beautiful designs show up, designs of short, cohesive classes. So, to sum it up: Embrace the Extract.
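A tiny Python illustration of asking that question. The example is made up for this purpose, and the names report, Tax, total_with_tax and format_total are hypothetical.

```python
# Before: report() both knows the tax rule and formats the output.
def report(prices):
    total = sum(p * 1.10 for p in prices)  # tax logic buried in reporting
    return f"Total: {total:.2f}"


# After asking "is this the right place for this?" the tax rule,
# the summing and the formatting each get their own home.
class Tax:
    def __init__(self, rate):
        self.rate = rate

    def apply(self, price):
        return price * (1 + self.rate)


def total_with_tax(prices, tax):
    return sum(tax.apply(p) for p in prices)


def format_total(total):
    return f"Total: {total:.2f}"
```

Both versions produce the same string for the same input, but a change to the tax rule now touches only Tax.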

Crafting Up - Community is Key

It’s been almost a year now since the founding of our local Software Craftsmanship group. This, for me, is a huge dream-come-true.

For years I’ve been looking for a good community around here to join; I went to several meetups and looked around to no avail. My frustration grew about a year ago when I noticed how the Chicago community is buzzing with activity: people there have a meetup almost every day. That’s why when Uri started organizing the first meeting I jumped in whole-heartedly.

In just a few months the meeting has influenced me quite a lot. First of all I got to meet a lot of new, smart and interesting people I never would have otherwise. It’s not easy to find people that are as passionate about our profession as I am, yet our group didn’t disappoint me.

The meetings also supply my need to pair with new people. Pair programming is a magical way of working and sharing knowledge, and I’ve yet to have a session with a new pair without picking up something new. I love the first minutes where we have to find a common language to get things started, and even more the high fives of getting a green bar.

Also, a good community is the best way to get feedback. I can say I’m trying to leech this to the max. I’ve already given talks/sessions at 2 meetings, and I bug people frequently on twitter and the mailing list. A varied community of like-minded people allows you to get different outlooks and insights into things you’ve been neck-deep in for a while.

And last but not least, a good community might make magic stuff happen. I don’t know how, but I’m sure our group had something to do with the fact some of us got to have dinner with Uncle Bob and Brett Schuchert, two awesome coders and Clean Code authors, on their last visit here.

Bottom line, be part of a community, and if there isn’t one around you, help start it! It’s a great source of kindred spirits, an invaluable and rare resource!

Making Embedded GitHub Gists Show Up on RSS Readers

Just a quick let-you-know: I found out that the gists I use to embed code in my posts don’t show up on RSS readers (e.g. Google Reader).

I know how annoying it is not to be able to read a blog fully from my reader, so I found a nice WordPress plugin called Embed GitHub Gist that handles embedding gists elegantly and also automatically makes sure the code will be displayed even on readers.

I’ve even updated my latest post (about Chef and EC2) to work with it, and new posts from now on will look good too :)

Using Chef to Automatically Configure New EC2 Instances

This is a follow-up to my post about using Puppet to get the same result. In the comments to that post I was told by a few people that chef can make my life easier, and I decided to give it a try. Here’s what I came up with.

In this post, as in the previous one, our goal is to be able to start a new EC2 instance with one command, which will in turn be created and started with Apache running.

First of all, instead of having to set up our own server to tell the newly created instances what to do, we are going to use a hosted chef server on Opscode’s server. The hosting is free for 5 nodes, and so you can try this out without having to pay them. Go to Opscode’s site and register a new user, then also add a new organization.

On our system, we need to start by installing chef. You will also want to install the dependencies needed to make chef talk with EC2 (these are not installed automatically when installing the gem because they’re optional):

Now, we need to set up a chef repository. This repository will contain our cookbooks (libraries that contain recipes, which are scripts for doing stuff, like installing apache) and roles (which map recipes to nodes), among other stuff. To get it run:

In the repository create a .chef directory. Now back on Opscode’s site, you need to download 3 files: your organization’s validator key, your user’s key and a generated knife.rb. Once downloaded, copy them all to the .chef directory:

These will be used by the new instances to connect to Opscode and identify themselves as truly being created by you (this saves us from having to hack an awkward solution for this to work on Puppet). Add to your knife.rb file your AWS credentials:

We will now fetch the apache2 cookbook, which will allow us to install apache on our instances by adding a single configuration line. To download an existing cookbook, do the following:

You can see what other cookbooks are made available by looking around here. Now, we’ll create a role for our instances. Create the file roles/appserver.rb with this data:

And to update our Opscode server with the new cookbook and role:

We’re getting really close now! You should have a security group defined in AWS that has port 22 (SSH) open, for knife to be able to connect to the instance and configure it, and port 80 (HTTP) open, for our Apache to be available. I called mine “chef”. You will also need to decide which AMI (image) to use; you can find a list of AMIs supplied by Opscode here. And now, to create an instance with one command line, as promised:

This will take a while, as knife will create the instance, connect to it, install ruby, chef itself, apache etc. Once it says it has finished, simply copy the public DNS of the newly created instance (it should be printed once knife finishes) and open it in your browser. My, what a sense of accomplishment one gets from seeing the string “It works!”

I find this a lot easier, cleaner, more streamlined and more fun. I’m still learning the ropes with chef, but it has already surprised me by being easy to change, being completely git-integrated and by Opscode’s fast support (even for non-paying customers). You can dig further in these links.

Fake It Till You Make It - Team Edition

Fake it till you make it is a known pattern in Test Driven Development implementation, which means one writes code that acts like it knows what it’s doing in order to know what it’s doing. This is a powerful technique and I’ve already written how using the same trick on the individual scale can help you make your team better.

I just recently realized that I had already seen this principle applied to a whole team which then caused a whole department to follow suit.

Back in 2005, I had the luck to join a particularly interesting team. Hanging around the section the team was part of clearly showed that all the other teams regarded that specific team (let’s call it A Team) as highly skilled. People said they were the XP (Extreme Programming) team, and they were generally looked at as an example of how a good team should work.

After joining the team I got a look from the inside at what was really going on. All the developers were highly talented, but being “The XP Team”? Hah! Two guys had read Kent Beck’s (amazingly awesome) Extreme Programming Explained and simply started pairing and writing automated unit tests before the code.

Simply starting with those 2 small parts of the XP way of doing things got them improved results, which then got the rest of the section interested. By simply saying they were going to try that XP thing and saying it made their lives better, the A Team got the ball rolling for the whole section without ever even trying to start an Agile Transition.

And this wasn’t a one-trick pony! About 2 years later, the same thing happened with Scrum. One teammate read a good intro to it (back when it was still a free PDF) and told the rest of the team, which then decided to give it a try. After a few sprints of seeing how organized standup meetings and the like actually helped our process, we decided to keep it.

We didn’t try to “get everyone to realize this is the best way”. Some people happened to come inside our room during standups, or see the scrum board. Those alone got people interested and from then on again, A Team got the section to advance nicely.

This is a marvelous story, and I only now realize how rare it is. Simply because the team looked to the rest of the section like they knew what they were doing, it got all of them to try agile without having to break down walls or bust open doors. Sometimes just doing what feels right is enough.

Fake it till you make it is just another way of saying “If you build it they will come”!

You Owe it to Yourself to be Old-School

I love watching House. My favorite episodes are those where he manages to debug an illness not by knowing an obscure disease, but by having the holistic knowledge of how the body works and thus being able to deduce the real problem.

I find this correlates very much to a set of tools and knowledge a lot of coders are missing that has tremendous value. Joel Spolsky wrote years ago that developers should learn C in order to have a thorough understanding of their environment. I actually think this should be taken a few notches further.

Learn C and some systems programming and you have the ability to grasp basics of most tools you use. How can you spot and truly understand memory leaks without having to manage memory allocation by yourself?

What would you do if some code you wrote or an application you use suddenly blurts out that it has a connection error? Or the Apache server you’re installing is acting up on you? My #1 power tool for these situations is simply opening Wireshark and looking at what goes through my wire. Learn the basics of TCP/IP and you’ll be able to debug most network problems swiftly.

And don’t get me started on using the shell. No matter what you think, having shell-fu pays off daily. Any text manipulation you’re thinking of, most simple processing tasks - you can whip up a one-liner to do it in less time than most IDEs take to start up.

And the reasons just go on and on. Reading important functions from the Linux kernel will help you understand why Java suddenly won’t fork child processes. Knowing how known security issues work (injections, buffer overflows, etc.) is the only way for you to catch security mistakes at the drawing-on-the-board stage and not at the shit-the-DB-is-stolen stage.

I don’t care if you’re doing Rails and never need to see the outside of a pointer. There’s nothing like having the holistic grasp of things to help you solve problems quickly, a-la Dirk Gently. All these points I’ve made in this post? All real problems solved in the last couple of months with some old-school chops.

Do yourself good - read K&R for some C understanding. Read the first chapters of TCP/IP Illustrated. Read Linux Kernel Development (3rd Edition) for a nice walk-through of the interesting parts. This knowledge won’t get obsolete anytime soon. Can you say that about your favorite framework?

Stop Wasting My Code

During my service in the army I had the opportunity to move around some electronic equipment from place to place. A lot of it was pretty old (and by that I mean it predates me), but worked perfectly where it was. We had systems running for decades without a problem, but once we unplugged them and moved them to a different room they went dead.

Over time we’ve identified this phenomenon and simply noted that things that aren’t in use stop functioning. It used to puzzle me, but eventually I came to accept this. What still is hard for me to accept though, is the fact that this is exactly the same with software as it is with hardware, if not worse.

I thought I learned this lesson a few years ago, after reading the Pragmatic Programmer and having it hammer YAGNI and KISS into my head, but I keep getting surprised every time I find out that I’ve just done it again.

Actually, learning Git has made this problem rear its ugly head again. Git makes it easy to write up some code and then keep it somewhere. I’d either stash some changes or keep a side branch with some work I started. The really bad part is adding this code to production code, simply because it’s there. The problem is that code gets stale if it’s not really used, and fast.

I can’t think of a single case where we added code before it was actually needed and got something good out of it. Fact is, every line of code you write before there’s a real use case or actual need for it is just you guessing. And we’re mostly guessing anyway about stuff we actually need to get done, so why add more ambiguity in there?

As I read in Growing Object-Oriented Software, code isn’t sacred simply because it’s there, and it won’t take as long to write it again if you need to. Don’t be afraid to delete code that isn’t actually needed just because you put two hours into it. The time you’ll spend maintaining it will be much greater.

This is exactly the Lean definition of Waste - everything not adding value to customers, and adding code just for you to feel better isn’t helping your customers. I now consider waste as one of my sworn enemies. At my work I’ve decided to take on myself the role of do-we-really-need-that dude. It means being a PITA sometimes, but it pays off tenfold.

Next time you feel tempted to commit that code you’re not sure you’ll need anymore, keep in mind the best code is no code.

Book Review: Growing Object-Oriented Software

Starting with a test means that we have to describe what we want to achieve before we consider how.

2010, for me, was a year with quite a good reading list. It was when I first got to read some really good books such as Clean Code, Agile Software Development, TDD by Example and Apprenticeship Patterns. These are all stellar books I highly recommend.

Yes, indeed it was an awesome year and yet I can tell you that the best book I read this year is Growing Object-Oriented Software, Guided by Tests (GOOS, for short).

I actually never heard of the authors before 2010. As opposed to books by authors such as Kent Beck and Robert Martin which one regularly hears about, I was quite astonished that I kept hearing about this book in different places.

I heard talks mention it, I saw lots of tweets about it and quite a few people that I highly value were praising it. This piqued my interest and boy, am I glad I decided to add it to my pile.

I’ve read a lot about better development, better testing and better everything. And yet, I’ve never come across a book as thorough and as comprehensive as GOOS. If you read my other reviews you will see that what usually wins me over is good code walk-throughs. Now let me tell you, you haven’t seen a good walk-through until you’ve seen GOOS.

Code isn’t sacred just because it exists, and the second time won’t take as long.

On the one hand, the book is loaded with practical tips for making your tests better, faster, more readable and maintainable. It covers the nuances of testing ORM systems, GUIs, multi-threading problems and more.

On the other hand, every page turn is greeted with more nuggets of OOP lore. Actually, seeing all this wisdom clustered so tightly by people that have been struggling with these problems for over a decade now seems illegal to me. Are we really allowed to learn so many secrets of the profession this fast? Surely some sort of blood sacrifice has to be made?

Once we start a major rework we can’t stop until finished. There’s a reason surgeons prefer keyhole surgery to opening up a patient.

I’ve read GOOS over the course of a few months, consuming chapters little by little and letting the knowledge sink in. I was amazed at how much this affected my way of thinking about OOP and TDD, pretty much right off the covers. I already blogged about how my new OOP-Spidey-Sense helped us improve our architecture.

I’ll finish with saying this book is a game-changer for me, even though I’ve been doing TDD for a few years now. To the authors, Nat and Steve, I take my hat off. They have earned a place of honor in my Deserve-A-Beer list.

And to sum up all these great quotes from GOOS, here’s another gem:

The last thing we should have to do is crack open the debugger and step through the tested code to find the point of disagreement.

Using Puppet to Automatically Configure New EC2 Instances

Note: I posted an update about doing the same with chef here.

This is a quickie techie post that summarizes a few hours of learning that I wish someone else had put up on the web before me. I assume some knowledge of Puppet; I recommend the Pro Puppet book and have heard good things about Puppet 2.7 Cookbook.

So, I wanted to be able to describe via Puppet how our new instances should be configured, and then be able to easily spawn new instances that will get configured by said puppet. The first part is installing puppetmaster. I decided to manually set up an EC2 instance that will act as the puppet master:

Under /etc/puppet/manifests/site.pp we place the “main” entry point for the configuration. This is the file that is responsible for including the rest of the files. I copied a structure I saw elsewhere, in which the actual classes are put under /etc/puppet/manifests/classes and imported in site.pp. Do note that currently this setup only supports a single type of node, but supporting more should be doable using external nodes to classify the node types.

Auto-signing new instances

A common problem with puppet setups is that whenever a new puppet connects to the puppet master it hands it a certificate, which you then have to manually sign before the puppetmaster will agree to configure it. This is problematic in setups like mine, where I want to be able to spawn new instances with a script and not hassle with jumping between the machines right after the certificate was sent to approve it. I found two ways to circumvent this:

1. Simply auto-signing everything and relying on firewalls

If you can afford to firewall the puppetmaster port (tcp/8140) so that it is only accessible to trusted instances, you do not actually need to sign the certificates: you can tell puppet to trust whatever it gets and leave the security in the hands of your trusty firewall. With EC2 this is extremely easy:

  • Set up a security group, I’ll call mine “puppets”
  • Add a security exception to the puppetmaster that allows access to all instances in the “puppets” group
  • Create all puppet instances in the “puppets” security group
  • Configure puppet to automatically sign all requests: echo "*" > /etc/puppet/autosign.conf

I decided to go with this solution since it’s simpler and less likely to get broken. I didn’t see it documented anywhere else. The downside is that you’ve got to have your puppetmaster on EC2 too.

2. Automatically identifying new instances and adding them

This is a solution I saw mentioned a few times online. Using the EC2 API tools, write a script that gets the DNS names of all the trusted instances you’ve got and writes them to /etc/puppet/autosign.conf. Once you have this, getting it to run with a cron job every minute will do the trick. This can be done with sophisticated scripts, but for my (very initial) testing, this seemed to work:
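As a rough Python sketch of the same idea (not the script I actually used, which relied on the EC2 API tools): the function below takes an already-fetched list of DNS names and rewrites autosign.conf with one name per line. How the names are fetched is left out on purpose, and write_autosign is a hypothetical name.

```python
import os
import tempfile
from typing import Iterable


def write_autosign(hostnames: Iterable[str],
                   path: str = "/etc/puppet/autosign.conf") -> None:
    """Replace autosign.conf with one trusted hostname per line."""
    # De-duplicate, drop blanks and sort so reruns give a stable file.
    names = sorted({h.strip() for h in hostnames if h.strip()})
    directory = os.path.dirname(path) or "."
    # Write to a temp file in the same directory, then swap it in, so a
    # reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write("".join(name + "\n" for name in names))
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)
```

A cron job would call write_autosign with whatever list your EC2 query returns.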

Getting new instances to connect to the master

The last piece of the puzzle. Since we use Ubuntu, we could simply use the Canonical-supplied AMIs. These support user-data scripts that are executed as root once the system boots. Below is a simple script that does this:

  1. Update the instance
  2. Add the “puppet” entry to DNS - puppet expects the master to be accessible via “puppet” DNS resolution. This little snippet gets the current IP of the master via our DNS name and writes it to /etc/hosts
  3. Install & enable puppet and voila!
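Step 2 is the only non-obvious one. Here is a hedged Python sketch of it, under the assumption that you would rather compute the new /etc/hosts content than append blindly; it is an illustration of the step, not the user-data script itself, and the function names are hypothetical.

```python
import socket


def resolve_master(dns_name: str) -> str:
    """Look up the current IP of the puppet master by its DNS name."""
    return socket.gethostbyname(dns_name)


def with_puppet_alias(hosts_text: str, master_ip: str) -> str:
    """Return hosts_text with exactly one 'puppet' entry for master_ip."""
    kept = [
        line for line in hosts_text.splitlines()
        # Drop any older line that already maps a name called "puppet";
        # comments (after '#') are ignored when checking.
        if "puppet" not in line.split("#")[0].split()[1:]
    ]
    kept.append(f"{master_ip} puppet")
    return "\n".join(kept) + "\n"
```

On the instance, the result of with_puppet_alias(current_hosts, resolve_master(your_master_dns_name)) would be written back to /etc/hosts as root before puppet starts.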

Once all of this is up and running, creating a new instance is as easy as:

ec2-run-instances -g puppets --user-data-file start_puppet.sh -t m1.small -k key-pair ami-a403f7cd

Happy puppeting!
