Sunday, October 10, 2010

Test Principles - Part II

A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.

Antoine de Saint-Exupéry

Part I of my blog series Test Principles suggested using Design by Contract to reduce and detect bugs introduced by test code.

In addition to detecting bugs, tests serve as documentation of how to use an API and what a user can expect in a specific scenario. But this documentation is almost useless if understanding the test code takes as much effort as reading the code that implements the API. If it is difficult to understand how a test works and what it is intended to do, then a logical bug in it is less likely to be found.

Tests should be focused and easy to understand

To improve the readability of tests, I make sure that they have the following properties:

  • all statements of my tests are defined at the same level of abstraction
  • all code that is only technically required to set up a test scenario is hidden behind test helper methods or test builders to keep the test focused
  • all tests are self-contained: they define all test values inside the test, not in fields or constants
  • all test values are obtained from test methods named in a way that emphasizes the important properties of the returned values

These are the hard and fast rules, but of course test design, like all software design, has to make trade-offs, for example when following a rule would introduce too much redundancy.

Mixing statements of different levels of abstraction inside one block of code is bad practice, even in test code. But developers sometimes consider test code less valuable than production code and tend not to care much about its quality. I value test code as much as production code because:

  • continuous improvement of production code relies heavily on a huge test base. If the tests are not written properly and kept free of duplication, maintaining them becomes a nightmare.
  • tests are the foundation of safe refactorings and economic changes of business contracts. If a contract gets violated or changed, the tests detect it first and provide the most valuable feedback.
  • developers who need to look up the contract or usage of an API under development can do that easily by using the tests as an index. If the tests are not structured properly or do not document every important scenario, reading the implementation is the only remaining source of exact documentation.

This list isn't even close to being complete, but it should be motivation enough to care about the quality of tests. To make things clear, I will start with an example. Let's consider the class Mail from Part I once again.
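As a reference point for the rest of this post, here is a minimal sketch of what such a Mail class might look like. It is a hypothetical stand-in, not the listing from Part I: the constructor signature and the method getLabels() are assumptions, chosen to match the values (sender, recipient, subject, message, labels) and the method Mail.tag(String label) mentioned in the text.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the Mail class of Part I; all names except tag(String) are assumptions.
class Mail {
    private final String sender;
    private final String recipient;
    private final String subject;
    private final String message;
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.sender = sender;
        this.recipient = recipient;
        this.subject = subject;
        this.message = message;
        // Defensive copy, so later changes to the caller's set do not affect the mail.
        this.labels = new HashSet<>(labels);
    }

    // Tags the mail with the given label; a label the mail already has is not added twice.
    void tag(String label) {
        labels.add(label);
    }

    // Read-only view of the labels the mail is currently tagged with.
    Set<String> getLabels() {
        return Collections.unmodifiableSet(labels);
    }
}
```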

This time I write the test for tagging a mail with a label it is not yet tagged with in a straightforward way. This kind of implementation is very common and can be improved in many ways:
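A minimal sketch of such a straightforward test could look like the following. To keep it runnable on its own it uses plain Java instead of a test framework such as JUnit, and it relies on a hypothetical Mail stand-in; the label values and the Mail constructor are assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class StraightforwardMailTest {
    // Assumed values; only the constant names appear in the text.
    static final String LABEL_1 = "label1";
    static final String LABEL_2 = "label2";
    static final String LABEL_3 = "label3";

    // With JUnit this method would carry @Test and use an assertEquals call.
    static void tagAddsLabelTheMailIsNotYetTaggedWith() {
        // low-level set-up of the initial labels
        Set<String> labels = new HashSet<>();
        labels.add(LABEL_1);
        labels.add(LABEL_2);
        Mail mail = new Mail("sender@example.org", "recipient@example.org", "subject", "message", labels);

        mail.tag(LABEL_3);

        // low-level set-up of the expected labels
        Set<String> expected = new HashSet<>();
        expected.add(LABEL_1);
        expected.add(LABEL_2);
        expected.add(LABEL_3);
        if (!expected.equals(mail.getLabels())) {
            throw new AssertionError("expected " + expected + " but was " + mail.getLabels());
        }
    }

    public static void main(String[] args) {
        tagAddsLabelTheMailIsNotYetTaggedWith();
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels); // the other values are not needed for this sketch
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```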

The way I set up the set of labels is very low-level compared to tagging the mail with a label. But tagging the mail is the main reason we write the test, which makes its level of abstraction the level at which the whole test has to be formulated. Let me improve the test:
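One way to raise the level of abstraction is sketched below, under the same assumptions as before (plain Java instead of a test framework, a hypothetical Mail stand-in, assumed label values). The helper name setOf and its assert-based contract are illustrative choices, not necessarily the ones used in Part I.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class HelperMethodMailTest {
    static final String LABEL_1 = "label1";
    static final String LABEL_2 = "label2";
    static final String LABEL_3 = "label3";

    static void tagAddsLabelTheMailIsNotYetTaggedWith() {
        Mail mail = new Mail("sender@example.org", "recipient@example.org", "subject", "message",
                setOf(LABEL_1, LABEL_2));

        mail.tag(LABEL_3);

        assertEquals(setOf(LABEL_1, LABEL_2, LABEL_3), mail.getLabels());
    }

    // Hypothetical test helper; the asserts state its contract and are only checked when run with -ea.
    static Set<String> setOf(String... labels) {
        assert labels != null : "precondition: labels must not be null";
        Set<String> result = new HashSet<>(Arrays.asList(labels));
        assert result.size() == labels.length : "precondition: labels must not contain duplicates";
        return result;
    }

    // Minimal replacement for a test framework's assertEquals.
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        tagAddsLabelTheMailIsNotYetTaggedWith();
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels);
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```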

I moved the code creating the set of labels to a test helper method and gave it a name reflecting what I expect.

In case you wonder why the test helper method is populated with all the Java assert statements: this is my approach to using Design by Contract for test helper methods, as explained in Part I of my Test Principles series.

All the statements are now at a similar level of abstraction. The test code is still not focused on what to test and does not express which properties of the test values are significant and which are not. Tests written this way are difficult both to maintain and to understand.

A focused test contains only:

  • non-technical statements
  • test values significant to define the scenario
  • statements using the API under test
  • statements verifying the expectations

When I create the test value mail, I do not care about the values for the sender, recipient, subject and message. To avoid burdening the reader with information they do not need to care about, I move the creation of the mail out of the test.
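A sketch of this step, assuming the mail becomes a field of the test instance that a set-up method initializes (with JUnit 4 that method would typically be annotated with @Before). As before, the Mail stand-in, the helper setOf and the label values are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class FieldBasedMailTest {
    static final String LABEL_1 = "label1";
    static final String LABEL_2 = "label2";
    static final String LABEL_3 = "label3";

    private Mail mail;

    // Creates the test value outside of the test method.
    void setUp() {
        mail = new Mail("sender@example.org", "recipient@example.org", "subject", "message",
                setOf(LABEL_1, LABEL_2));
    }

    // Focused on tagging, but the initial labels of the mail are not visible here.
    void tagAddsLabelTheMailIsNotYetTaggedWith() {
        mail.tag(LABEL_3);

        assertEquals(setOf(LABEL_1, LABEL_2, LABEL_3), mail.getLabels());
    }

    static Set<String> setOf(String... labels) {
        return new HashSet<>(Arrays.asList(labels));
    }

    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        FieldBasedMailTest test = new FieldBasedMailTest();
        test.setUp();
        test.tagAddsLabelTheMailIsNotYetTaggedWith();
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels);
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```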

Now the test itself is focused on the aspect of tagging the mail with a label, but it is no longer self-contained. The test alone reveals nothing about the test value mail, which, if already tagged with LABEL_3, would make the test pass even for an empty implementation of the method Mail.tag(String label).

This approach to organizing tests is very common, but I would not recommend it. It is, however, a good starting point for discussing further improvements.

When I inspect a test, I prefer to have all the information relevant for understanding it introduced by the test itself, instead of looking up test values in fields of the test instance.

Let's inspect the last refactored test code again. To verify the test does something useful, we have to look up the initialization of the test value mail. Jumping from one code block to another, collecting all the relevant information required to get a complete picture of the test, increases the complexity and reduces the readability of the test. To increase the readability, I add the statement constructing the test value mail back to the test method, narrowed to the input parameters significant to make the test plausible.
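Sketched under the same assumptions as before, the test method now constructs the mail itself through a helper that takes only the significant parameter, the set of labels. The method name createMail(Set<String>) comes from the text; its body and the values it fills in are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SelfContainedMailTest {
    static final String LABEL_1 = "label1";
    static final String LABEL_2 = "label2";
    static final String LABEL_3 = "label3";

    static void tagAddsLabelTheMailIsNotYetTaggedWith() {
        Mail mail = createMail(setOf(LABEL_1, LABEL_2));

        mail.tag(LABEL_3);

        assertEquals(setOf(LABEL_1, LABEL_2, LABEL_3), mail.getLabels());
    }

    // Hides the values that are irrelevant to the test; only the labels are significant.
    static Mail createMail(Set<String> labels) {
        assert labels != null : "precondition: labels must not be null";
        return new Mail("sender@example.org", "recipient@example.org", "subject", "message", labels);
    }

    static Set<String> setOf(String... labels) {
        return new HashSet<>(Arrays.asList(labels));
    }

    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        tagAddsLabelTheMailIsNotYetTaggedWith();
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels);
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```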

Now everything that is important for understanding how I test the aspect of mail tagging is defined by the test method itself. Browsing through the code is no longer necessary, which in my opinion increases the readability. But there is still one major improvement left: the references to the three label constants.

Defining literal values in a test method or class has one major drawback: the criterion by which the values were chosen is not explicit, but has to be guessed by inspecting the values.

Let's deal with the three label constants left in the test. LABEL_1, LABEL_2 and LABEL_3 are defined outside the test method. This forces the reader to browse away from the test which reduces the readability. Further, when inspecting their values the reader has to guess why they were chosen. They have:

  • different values
  • the same prefix label
  • a numeric suffix
  • numbers that are positive and greater than 0
  • numbers that are natural
  • ...

The question arising is: which of those properties are of importance for the test and which are not? It would be easier to understand the test if the criterion by which the test values were chosen were explicit. Here is another improvement:
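A sketch of the test with explicit selection criteria follows. The factory method names anyLabels(), anyLabelDifferentFrom(Set<String>) and anyMailTaggedWith(Set<String>) are taken from the text; their bodies here are deliberately simple placeholders, and the helper labelsPlus as well as the Mail stand-in are assumptions made to keep the example runnable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class ExplicitValuesMailTest {

    static void tagAddsLabelTheMailIsNotYetTaggedWith() {
        Set<String> labels = anyLabels();
        String newLabel = anyLabelDifferentFrom(labels);
        Mail mail = anyMailTaggedWith(labels);

        mail.tag(newLabel);

        assertEquals(labelsPlus(labels, newLabel), mail.getLabels());
    }

    // Placeholder: some non-empty set of labels.
    static Set<String> anyLabels() {
        return new HashSet<>(Arrays.asList("inbox", "work"));
    }

    // Placeholder: some label that is not contained in the given set.
    static String anyLabelDifferentFrom(Set<String> labels) {
        String candidate = "urgent";
        while (labels.contains(candidate)) {
            candidate += "_";
        }
        return candidate;
    }

    // Placeholder: some mail whose only significant property is its labels.
    static Mail anyMailTaggedWith(Set<String> labels) {
        return new Mail("sender@example.org", "recipient@example.org", "subject", "message", labels);
    }

    // Hypothetical helper building the expected result without modifying the input set.
    static Set<String> labelsPlus(Set<String> labels, String label) {
        Set<String> result = new HashSet<>(labels);
        result.add(label);
        return result;
    }

    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        tagAddsLabelTheMailIsNotYetTaggedWith();
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels);
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```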

I replaced the method createMail(Set<String>) with anyMailTaggedWith(Set<String>) emphasizing which property of the test value is of importance: an arbitrary mail tagged with a specific set of labels.

The specific set of labels used is the result of the factory method anyLabels(), which creates an arbitrary non-empty set of labels. Why is a non-empty set important to me, you may ask? I prefer to test boundary values explicitly in a separate test instead of relying on getting one by chance. The last method introduced, anyLabelDifferentFrom(Set<String>), has to return any label not contained in the set of labels passed in. Now the test makes its choice of proper test values explicit.

For those of you interested in how to implement the test value factory methods used in the test, I added some example implementations below (which do not return randomly generated values).
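The following is one possible shape for such factory methods, written as a hedged sketch rather than a definitive implementation. It returns fixed values, states each method's contract with Java assert statements in the spirit of Part I (checked only when the JVM runs with -ea), and relies on a hypothetical Mail stand-in. The class name TestValues and the concrete label values are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class TestValues {

    // Returns an arbitrary, non-empty set of labels (fixed values in this sketch).
    static Set<String> anyLabels() {
        Set<String> labels = new HashSet<>(Arrays.asList("label1", "label2"));
        assert !labels.isEmpty() : "postcondition: the set must not be empty";
        return labels;
    }

    // Returns a label that is not contained in the given set.
    static String anyLabelDifferentFrom(Set<String> labels) {
        assert labels != null : "precondition: labels must not be null";
        int suffix = 1;
        String label = "label" + suffix;
        // Count upwards until a label is found that the set does not contain.
        while (labels.contains(label)) {
            suffix++;
            label = "label" + suffix;
        }
        assert !labels.contains(label) : "postcondition: the label must not be in the set";
        return label;
    }

    // Returns an arbitrary mail tagged with exactly the given labels.
    static Mail anyMailTaggedWith(Set<String> labels) {
        assert labels != null : "precondition: labels must not be null";
        Mail mail = new Mail("sender@example.org", "recipient@example.org", "subject", "message", labels);
        assert mail.getLabels().equals(labels) : "postcondition: the mail carries exactly the given labels";
        return mail;
    }
}

// Hypothetical stand-in for the Mail class of Part I (names are assumptions).
class Mail {
    private final Set<String> labels;

    Mail(String sender, String recipient, String subject, String message, Set<String> labels) {
        this.labels = new HashSet<>(labels);
    }

    void tag(String label) { labels.add(label); }

    Set<String> getLabels() { return Collections.unmodifiableSet(labels); }
}
```

Because the contracts are expressed as assert statements, a bug in a factory method shows up at the factory itself when assertions are enabled, instead of surfacing as a confusing failure in the test that uses it.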

Tests written the way I have explained let you jump right into any test. Because each test method is self-contained, you do not have to remember test class constants or test instance fields to understand the test method you are about to inspect. Furthermore, the test is focused on the aspect it is intended to test, leaving technical details about setting up the test scenario to test helper methods. Those are named to emphasize the properties that matter to the test, which helps to keep all the statements defining the test at the same level of abstraction and improves the readability of the test.

In an upcoming blog post I will explain the rules I follow for naming test value factory methods (prefixing them with any to improve readability) and why I prefer randomly generated values to literal values.

TO BE CONTINUED ...