Posts Tagged ‘testing’

Extend Your Toolbox: Custom Matchers

Posted in Programming, testing on February 4th, 2012 by Aviv Ben-Yosef – Be the first to comment

I’d like to point out a really nice testing practice that I’ve been loving more and more lately.

Just about every mature testing framework out there supports custom matchers, which let us define our very own assertions and use them seamlessly in our tests. Even though this ability is quite old, we don’t see it used too often, and I think that’s a shame. I’ve seen this practice used heavily in the mind-expanding GOOS book and am only now starting to realize its awesomeness.

Note: examples in this post are shown in Ruby using RSpec’s matchers, but the concept is pretty much identical in other frameworks (see, for example, Java’s Hamcrest matchers).

Matchers 101

Creating your own matcher usually means creating a Matcher class that performs the assertion, supplies human-readable error messages and offers a nice constructor.

Here’s an example from the RSpec documentation:

RSpec::Matchers.define :be_a_multiple_of do |expected|
  match do |actual|
    actual % expected == 0
  end
end

Matchers increase readability and intent

As you should know, one of the most important rules of design is Reveals Intent. Take a quick look here: which way do you think reveals more intent?

# This
response['X-Runtime'].should =~ /[\d\.]+/

# .. or this?
response['X-Runtime'].should be_a_number

Also, which error message do you prefer? “expected false to be true” or something along the lines of “expected comment to be anonymous”?

Matchers create robust tests

The most important advantage of all is how using matchers easily allows you to steer away from fragile tests which are the bane of a lot of testing efforts.
The mark of good tests is that a change in your code doesn’t require you to perform changes in multiple tests that don’t really care for the change.
Take this code for example:

expected_comment = Comment.new(anonymous: true, user: "the dude", reply_to: nil)
commentor.should_receive(:add).with(expected_comment)

This might seem like a standard test, but that’s not really the case. A test should assert a single piece of knowledge, and this test actually checks several. If the purpose of this test is to check the behavior of anonymous comments, why should it change if we no longer allow replies? Or if we no longer require users for posting comments?

The magic of matchers is exactly here. You create a new matcher to check specifically the aspect your test cares about and *boom*, you’re decoupled!

commentor.should_receive(:add).with(anonymous_comment)

This simple change makes your tests DRY and cool.
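
For readers outside Ruby, here is a minimal sketch of the same idea in Python with `unittest.mock`. The `Comment` class, the `AnonymousComment` matcher and `post_anonymously` are hypothetical names made up for illustration; the only assumption about the library is that mock compares call arguments with `==`, which is also how `mock.ANY` works.

```python
from dataclasses import dataclass
from typing import Optional
from unittest.mock import Mock


@dataclass
class Comment:
    # Hypothetical domain object, mirroring the Ruby example.
    anonymous: bool
    user: Optional[str] = None
    reply_to: Optional[int] = None


class AnonymousComment:
    """Argument matcher: equal to any Comment whose `anonymous` flag is set."""

    def __eq__(self, other):
        return isinstance(other, Comment) and other.anonymous

    def __repr__(self):
        # Shown in the assertion error, so failures read naturally.
        return "<an anonymous comment>"


def post_anonymously(commentor, author):
    # Hypothetical code under test.
    commentor.add(Comment(anonymous=True, user=author, reply_to=None))


commentor = Mock()
post_anonymously(commentor, "the dude")

# Only anonymity is asserted; user and reply_to are free to change.
commentor.add.assert_called_once_with(AnonymousComment())
```

If `Comment` later loses `reply_to` or stops requiring a user, this assertion does not need to change.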

Happy testing!

You should subscribe to my feed or follow me on twitter!

Looking Back on 18 months of Testing and TDD at a Startup

Posted in Programming, testing on January 6th, 2012 by Aviv Ben-Yosef – 4 Comments

As we’re approaching a year and a half here at BillGuard, I’ve started thinking back a bit about our testing habits and how well they’ve turned out.

I’ve seen a lot of posts about testing in startups. Some say startups shouldn’t bother to test because they’ll have to change the whole damn thing 5 minutes after they’re done; others claim testing is the only reason they were able to keep working. Here are some of my thoughts looking back.

Our Background

When we started, only two of the five technical guys had a test-infected background, me being big on TDD. Two other developers had never written tests before. We agreed that tests were important, but that’s about it. I set up a continuous integration server and with that we were off. With time, the habit of writing tests spread among the team. Some are passionate about TDD, some write tests after the fact, but we generally all believe that tests should be written extensively.

Not everything is worth testing

We’ve seen several quite rapid changes to our UI, so having fewer tests in this area makes sense. We rely on QA for making sure all buttons are displayed etc. To make this clear: we have no selenium-like tests for UI components, but we do have tests for most of the logic done by the UI. I think this is generally a good practice, since maintaining selenium tests would be hard when you throw things around a lot and change flows. Some basic automated sanity tests pretty much do it.

Everyone learned to love tests

I love seeing other guys in the team delete a line of code to see which test breaks and understand why it’s there. Even more I love the frowning face when no tests break. This addiction to tests shows how much value the team’s getting out of having solid tests, hands down. No need to stress this further I believe.

Tests save our asses repeatedly

Having an extensive suite of tests allows us to make rapid changes to our code base, as is needed in most startups, and rely on the solid tests to tell us whether we’ve screwed something up. All the code that has anything whatsoever to do with sensitive and important information is heavily tested which is a huge bonus and a necessity in our line of business (personal finance protection).

TDD is just magical with complex algorithms

We have quite a few complex algorithms that require multiple entities and ideas to perform. I find that the parts we’re most satisfied with maintainability-wise are the heavily TDD-ed algorithms we’ve got. Being written with rigorous TDD gives us so many advantages:

  • This critical code usually has far fewer defects.
  • The code is a lot more readable and well decomposed, and it is easy to change once we find a need to tweak the algorithms.
  • Working in TDD magically forces us to form our problem domain better, making us have a language of our own in talking about the problem. This happens less naturally in other forms of working on algorithms.

Summing our testing experiences

All in all, I think the whole team would agree that dedicating time to writing thorough tests is proving itself valuable and because of that people are writing more and more tests without any of us ever stopping and saying “we should write tests” (well, I swear I didn’t do it too much). It happens naturally when people get the value out of it. It’s fun seeing how today BillGuard has become a company that organically values testing so much I don’t even feel a great need to stress it to new people because they’ll quickly see there’s no real other way. We’re far from being the poster children of Clean Code, but I’ve got my fingers crossed.

If you’re interested in accomplishing the same at your work, you might find this recent post of mine of some help.

Sometimes Tests Have to Fail

Posted in Programming, testing on April 3rd, 2011 by Aviv Ben-Yosef – Be the first to comment

A friend asked me about a common problem that pops up in real-world projects and testing: What do you do when you test code with random properties?

A simple example might be handing out jobs to a few workers. If your algorithm for doing that is random, you can usually assert that none of the 3 workers gets all 10 jobs, for example. But, being random, that assert will eventually fail. We’ll assume that with the frequency the team runs the tests, a failure is expected every few days.

Surely no one wants to see the tests fail a couple times a week (especially if you’re keeping score for who broke the build). On the other hand, you’d like to keep the tests. What is a pragmatic coder to do?

If you’re not that meticulous about your suite rarely failing, you might just leave it as it is, which, I think, sucks.

The mega-tester’s approach, which I’ve tried in the past, is usually to stub out the random number generator with values that make sure the failures won’t happen. This is usually cost-effective only for the simplest of cases, and the more complex ones result in brittle tests that are coupled to the implementation and that might need to be changed frequently.

What I’d rather do is postpone the problem! Say we change our test’s parameters to 10 workers and 3000 jobs. The chances of one worker getting all jobs become quite minor. This tweak of parameters in the test is usually simple to do and provides quite a safety net.
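
To put rough numbers on that, here is a small sketch. It assumes each job is handed to a worker independently and uniformly at random, which is my assumption and may not match your actual algorithm; the function name is made up for illustration.

```python
from fractions import Fraction


def chance_one_worker_gets_all(workers: int, jobs: int) -> Fraction:
    """Probability that a single worker ends up with every job.

    Assumes independent, uniformly random assignment of each job.
    """
    # A given worker gets all jobs with probability (1/workers)**jobs.
    # For at least 2 workers and 1 job these events cannot overlap, so they add up.
    return workers * Fraction(1, workers) ** jobs


small = chance_one_worker_gets_all(3, 10)
large = chance_one_worker_gets_all(10, 3000)

print(small)                             # 1/19683
print(large == Fraction(1, 10 ** 2999))  # True
```

Under that assumption, the original parameters fail about once in twenty thousand runs, while the larger ones fail with a probability of 10 to the power of -2999.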

And still, sometimes bad stuff happens. 64-bit hash collisions are somewhere, out there in the world. If you’re one of those guys that are bugged by that chance, I give you a simple JUnit rule that will retry a specific test in case it fails, roughly squaring its chance of failing (assuming the two runs are independent). Those 64-bit collisions are now more like 128-bit! Woohoo!

The rule allows you to simply annotate a test to make it retry in case it fails:

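import static org.junit.Assert.assertEquals;

import org.junit.Rule;
import org.junit.Test;

public class RetrierTest {
  private static int count = 0;

  @Rule public RetryRule rule = new RetryRule();

  // Fails on the first attempt (count == 1) and passes on the retry (count == 2).
  @Test
  @Retry
  public void failsFirst() throws Exception {
    count++;
    assertEquals(2, count);
  }
}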

And the implementation is as simple as:

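// Retry.java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Retained at runtime so the rule can find it via reflection.
@Retention(RetentionPolicy.RUNTIME)
public @interface Retry {}

// RetryRule.java
import org.junit.rules.MethodRule;
import org.junit.runners.model.FrameworkMethod;
import org.junit.runners.model.Statement;

public class RetryRule implements MethodRule {
  @Override public Statement apply(final Statement base, final FrameworkMethod method, Object target) {
    return new Statement() {
      @Override public void evaluate() throws Throwable {
        try {
          base.evaluate();
        } catch (Throwable t) {
          // Only tests marked with @Retry get a second attempt.
          Retry retry = method.getAnnotation(Retry.class);
          if (retry != null) {
            base.evaluate();
          } else {
            throw t;
          }
        }
      }
    };
  }
}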

With the tests so unlikely to fail, I’d start a lottery at work for whoever breaks them.

Happy testing!

Testing Techniques: Managing External Resources

Posted in Programming, testing on April 1st, 2011 by Aviv Ben-Yosef – 1 Comment

A friend approached me with one of the known problems in the testing world – How do you keep external resources under a test harness? Having heard the question a few times before, I thought I’d share my thoughts, and mainly put together the common advice that drifts around the web.

The Dilemma

Nowadays, it’s hard to get more than 100 lines of code in before adding an external resource. It might be a web service that manages something, some convoluted API to receive data from, or just about anything else. Usually, writing tests for code that talks directly with these resources, using the resources themselves, is very problematic for numerous reasons:

  • It significantly slows the tests, because it requires network access and processing on the service’s side.
  • It might cost you money, send emails, tweet stuff and do things you’d rather not do 300 times a day as you run your tests.
  • Making your code handle error conditions with the service is hard or impossible, as you can’t control when those occur.

Basically, all of these factors usually add up to you having crappy tests that you rarely run. That sucks.

Decouple & Isolate

The best solution I’m aware of is simply isolating the thing. We usually strive to wrap whatever service we’re using with a single-point interface. The decoupling is great since I’ve yet to encounter a service with an API that matched my thinking of the domain problem. Wrapping it up allows us to keep using our own language and logic throughout the system.

A benefit of that is we now have a simple interface or facade we need to stub/mock out during tests. That’s usually relatively easy, and allows us to run our tests blazingly fast and test all those hard-to-reach corner cases.
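
As a minimal sketch of that shape (all names here, such as `Mailer`, `FakeMailer` and `register`, are hypothetical and not taken from any real service), the wrapper is a small interface in our own language, and tests swap in an in-memory fake:

```python
from typing import Protocol


class MailerDown(Exception):
    """Raised by a mailer when the underlying service is unavailable."""


class Mailer(Protocol):
    """Single-point interface in our own domain language."""

    def send_welcome(self, address: str) -> None: ...


class FakeMailer:
    """In-memory stand-in used by tests; no network involved."""

    def __init__(self, fail: bool = False):
        self.sent = []
        self.fail = fail

    def send_welcome(self, address: str) -> None:
        if self.fail:
            raise MailerDown("simulated outage")
        self.sent.append(address)


def register(address: str, mailer: Mailer) -> bool:
    """Code under test: returns False when the welcome mail could not be sent."""
    try:
        mailer.send_welcome(address)
    except MailerDown:
        return False
    return True


# The happy path, plus a corner case that is hard to trigger against a real service.
ok_mailer = FakeMailer()
assert register("dude@example.com", ok_mailer) is True
assert ok_mailer.sent == ["dude@example.com"]
assert register("dude@example.com", FakeMailer(fail=True)) is False
```

The production implementation of `Mailer` would be the only place that knows the real service’s API.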

But what if the service changes?

That’s the finishing touch. You should still maintain a suite of tests that run against the real service. Those should be the plain tests that make sure you’re using the API right and that would break if anything you’re relying on changes. These tests won’t be part of your regular suite that gets run constantly. Instead have your CI server run them daily/weekly and let you know when something changes.

This basically puts us in a win-win situation: we’re able to run our tests quickly and yet have the assurance that we won’t miss API changes and the like.

Happy testing!

Adding GOOS Sauce to GWT MVP

Posted in Programming on December 18th, 2010 by Aviv Ben-Yosef – Be the first to comment

For a few months now I’ve been using Google Web Toolkit. One thing that was bothering me was that even when following the praised MVP (Model-View-Presenter) pattern as per the documentation, you pretty quickly get into messy land.

Here’s a snippet from the official GWT MVP tutorial:

In this example, you see that our Presenter, when bound, registers a click handler for a button, in order to perform some action when it is called. This might seem nice and all, but there’s a smell. This is a violation of the Law of Demeter (the missing SOLID rule, one might say). This simply makes it harder to test, since we now have to add another layer of indirection between the SUT and its collaborators. Instead of making the view a tiny bit smarter, we use it as a dumb collection of widgets the presenter manages. This is clearly not in “Tell, don’t ask” form.

The thing that really bothers me is how coupled the presenter gets with its view. Take the above example, and say that you decided that it would be better to have two “save” buttons on the UI. Does the presenter really care? Should it even change? And what if you actually want the save button to change to a remove button when the user picked something? Should the presenter now deal with getSaveOrRemoveButton() ? Of course not.

GOOS it up

After beating around this bush for quite some time, I decided to try and find a better way. I’m currently reading the brilliant Growing Object Oriented Software book, and decided to try its approach to push a better implementation. After a bit of refactoring I got this:

This might seem like a tiny change. And it is. But it makes all the difference in the world in how much more responsive your design gets, especially in our world where the view is most likely to change a dozen times before settling on something. Once there are enough of these, I push the presenter as a dependency into the view, and let it call the presenter directly. The funny thing is this style is actually implicitly mentioned in the second part of the GWT MVP tutorial. Just some GOOSing helped us get to a better, more malleable design!
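
As a rough, language-neutral illustration of the “Tell, don’t ask” shape, here is a sketch in Python rather than GWT Java. This is not the refactored code from the post; `Presenter`, `TwoButtonView` and `on_save` are hypothetical names. The point is that the presenter registers interest in “save” and never asks the view for its widgets:

```python
class Presenter:
    def __init__(self, view):
        self.saved = 0
        # Tell the view what to do on "save"; never ask it for a button.
        view.on_save(self.save)

    def save(self):
        self.saved += 1


class TwoButtonView:
    """Hypothetical view with two save buttons; the presenter cannot tell."""

    def __init__(self):
        self._save_handlers = []

    def on_save(self, handler):
        self._save_handlers.append(handler)

    def click_top_save(self):
        self._fire_save()

    def click_bottom_save(self):
        self._fire_save()

    def _fire_save(self):
        for handler in self._save_handlers:
            handler()


view = TwoButtonView()
presenter = Presenter(view)
view.click_top_save()
view.click_bottom_save()
assert presenter.saved == 2
```

Adding or removing a button now changes only the view, and a test for the presenter needs nothing more than a stub with an `on_save` method.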

Don’t be afraid to do something differently than the documentation, especially if you gave it a fair shot and it didn’t work out.

Say No to Null Checks

Posted in Programming, testing on November 14th, 2010 by Aviv Ben-Yosef – 4 Comments

Hey, do you check your methods’ arguments to make sure they’re not null?

Today, I got into a little discussion with a teammate about testing contracts of methods: should we check for null in every public method?

I was against it, and he was for it.

The simple reasons to do it are, first, that it makes your code more defensive. You fail explicitly instead of failing implicitly when the code tries to dereference the null object. Another argument was that given 20 callers of an interface, it’s easier to test in the interface for the precondition than to test each and every one of the callers. And, of course, that it is better API implementation, and that even if a class isn’t part of one’s public API now it might very well become part of it in the future, so why not add the tests now?

I’ll tackle these all. First, I have to agree that some null checks are required, at the boundaries of your system. I believe a system should have a paranoid barrier, before it everything is as suspect as someone going on a bus with a heavy coat on a hot day – that’s just waiting to blow. Once you’ve passed the barrier you know things are secure and no longer need to be paranoid.

So yes, some null checks are of course required.

But, because we want our API to be user-friendly and error-proof does that mean we need to make every public method in our code paranoid just in case it will become part of the public API at some point? 5 letters: YAGNI! :)

The interesting part is the testing of the callers. I agree, if we have to write the test 20 times, once for each caller, it will get tedious. But we don’t write the same thing twice, do we? As good old J.B. Rainsberger teaches, what we actually need are collaboration tests. Each of the callers collaborates with the interface. And so, we create a collaboration test that makes sure the user is using the interface according to the contract. These are usually abstract tests that require us to create derivatives that implement a factory method for creating the calling class. This way we write the tests only once and make the interface and contract explicit, even in a dynamic language.
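
Here is a minimal sketch of such an abstract test using Python’s `unittest`. The contract (“`send()` is never handed `None`”), the `Greeter` caller and all other names are hypothetical; each real caller would get its own small subclass that implements the factory method.

```python
import unittest
from unittest.mock import Mock


class NotifierCallerContract:
    """Abstract collaboration test for anything that calls notifier.send().

    Hypothetical contract: send() is never handed None.
    Subclasses supply the factory method and a way to exercise the caller.
    """

    def make_caller(self, notifier):
        raise NotImplementedError

    def exercise(self, caller):
        raise NotImplementedError

    def test_never_sends_none(self):
        notifier = Mock()
        self.exercise(self.make_caller(notifier))
        self.assertTrue(notifier.send.called)
        for call in notifier.send.call_args_list:
            args, _kwargs = call
            self.assertIsNotNone(args[0])


class Greeter:
    """One hypothetical caller of the interface."""

    def __init__(self, notifier):
        self.notifier = notifier

    def greet(self, name):
        self.notifier.send(f"Hello, {name}!")


class GreeterHonoursContract(NotifierCallerContract, unittest.TestCase):
    def make_caller(self, notifier):
        return Greeter(notifier)

    def exercise(self, caller):
        caller.greet("dude")
```

The mixin is not a `TestCase` itself, so only the concrete subclasses are collected when the module is run with `python -m unittest`.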

In general, this is a powerful solution that solves a basic problem with defensive programming. Say we do test for nullity wherever possible, what do we do then? Our system is likely to crash or throw an exception anyway, since what is the interface to do? Obviously something is wrong if we were called in a way that doesn’t match the contract, so is the hassle worth it? I think testing for nullity everywhere is a thing of the past, especially once you adopt dynamic languages and get used to the fact that most of the time you can’t even be sure the object you’ve got will answer the methods you’re about to use, so what difference does a null check make?

So let’s write some awesome collaboration tests and get cracking!

Case Study: Refactoring Interfaces with TDDed Tests

Posted in Programming, testing on June 2nd, 2010 by Aviv Ben-Yosef – 1 Comment

I’ve been practicing TDD for a couple of years now, and keep learning all the time.

In the past year I’ve been mainly working on a single project, the longest I’ve worked on a project with TDD. Putting aside how fun it is (TDD has saved me enough times for me to be sure it’s worthwhile), working on a project for so long I finally got to see some of the main problems people have with TDD.

With the hundreds of tests you have, refactoring on the class-interface level (that is, the interfaces of classes, and not inside classes) can be problematic, with you having to update all the tests.

I’m still learning how to handle this efficiently, and would like to share an experience I had today. This is an example of a problem regarding 2 collaborators and an interface change. Such refactorings in a TDD environment weren’t mentioned in the excellent “TDD by Example” book and similar works, so I’m pretty much guessing here.

The example:

The change we’re interested in is making “eject” simply open the lid, without rewinding, and making the rewind operation public. This is in order to allow LazyPerson to take the tape out, without having to wait. Gary Bernhardt wrote a bit about these kinds of changes. I agree, the fact I need to make such a change is against the OCP. What can I say, I’m not perfect and made a design mistake. Saying “that’s not OCP” doesn’t help me – I’ve got this code and tests, and I need to change them.

I used to succumb to the temptation and make all the changes in one sweep. That means changing all the tests and the classes, then running the tests and hoping they still pass. This, of course, is a crappy way of doing this. Had I been able to actually perform such tasks, I’d write fewer tests. The secret is baby steps. The emphasis Kent Beck puts on baby steps and gradually working towards change made me consider this and force myself to find a safe way of doing this.

I decided to start with the VCR and its test (the AutomaticPresenter doesn’t use the VCR itself but an interface, and the tests use test doubles. This means changing one part won’t break the other’s unit tests). The path to enlightenment lies in finding the baby step that allows starting the refactoring without breaking the rest of the tests. I decided to add a test for the should-now-be-public “rewind” operation, while not breaking the existing tests.

The solution is adding a default value for telling the “eject” function whether it should rewind or not. This means existing users (be they tests or not) will still get the previous behavior, and new tests can start working with the new interface (in Java I’d probably do this with method overloading):
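
A minimal sketch of that intermediate step, assuming a hypothetical `VCR` class with made-up attributes (this is an illustration of the default-argument trick, not the original code from the project):

```python
class VCR:
    """Hypothetical stand-in for the VCR class discussed in the post."""

    def __init__(self):
        self.lid_open = False
        self.tape_position = 42  # pretend a tape is partway through

    def rewind(self):
        # Newly public operation.
        self.tape_position = 0

    def eject(self, should_rewind=True):
        # The default keeps the old behavior for existing callers and tests.
        if should_rewind:
            self.rewind()
        self.lid_open = True


# Existing test style: eject still rewinds.
old = VCR()
old.eject()
assert old.lid_open and old.tape_position == 0

# New test style: eject only opens the lid; rewind is called explicitly.
new = VCR()
new.eject(should_rewind=False)
assert new.lid_open and new.tape_position == 42
new.rewind()
assert new.tape_position == 0
```

Once every caller passes `should_rewind=False` or calls `rewind()` itself, the parameter and the branch can be deleted in one last small step.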

This got me to green pretty fast. Now I can slowly remove every rewind-related assertion from the old tests and also add the “should_rewind=False” flag to them, all with quick-green cycles. And we’re done with the first half.

The next move is to change the AutomaticPresenter to call “rewind” before “eject”, which is now really easy to do in the tests. Once we hit green, we remove the “should_rewind” flag and we’re done with the refactoring. Baby steps save the day:

Being able to get the refactoring working so easily makes me happy, but I’m still not sure this is the smartest way, and there are harder refactorings to master ahead. Yet, I hope this will help TDD adopters see that it’s possible to handle refactorings even with many tests, because once the right baby-step is found, each test can take practically seconds to convert.

I’d really love getting feedback on this cycle.

Python (nose) Test Coverage on Buildbot

Posted in Programming, testing on May 9th, 2010 by Aviv Ben-Yosef – 2 Comments

Once you’ve got your builds happily running on Buildbot, there’s really no reason not to add coverage since it’s so easy (especially if you get bragging rights over your non-TDDer teammates).

All you have to do is this (code is based on this blog post, with adaptations to work on slaves that don’t share directories with the master, since the createSummary method runs on the master):